Blog

Costs and benefits of erasure coding

Durability is what separates a data storage system from simply using an external hard drive, a USB stick or a smartphone. When one of those simpler storage solutions fails, it’s safe to say that the data is toast. When we don’t want the data to be toast when things drop out of our pockets, we back it up to a more secure place like “the cloud”. We do that because we cannot reliably stop things from falling out of our pockets no matter what we do.

... more

Configuring gRPC retries

I never really understood the difference between service-oriented architecture and microservices. Yeah, SOAP is archaic now and JSON saved us all, but in fact they all look like applications making requests and expecting responses over the network. And the similarities don’t end there. Take the question of what to do when the application a) doesn’t receive a response, or b) receives a response it doesn’t expect.

... more

Large scale hardware testing

We all have a few servers in our offices, at a colocation provider or in part of a data center. And adding a few more every year is not much of a problem. Unpack, deploy, plug, install, reboot and that’s it. Still a few more steps than what you’d need in a cloud environment, but nothing an experienced technician can’t handle. Things change when the numbers reach the hundreds, though.

... more

Distributed ceph monitoring

As with any distributed system, the reliability of ceph operations largely depends on the available monitoring. Over the years we have seen and deployed many solutions for this purpose, ranging from zabbix to opentsdb and others. Nowadays our default choice is Prometheus, and that goes for ceph monitoring as well.

... more

A login page with vuex and vuetify

In the early stages of one of our newer projects, we implemented the login screen and the required isAuthenticated control by using local storage directly. Since then we adopted vuetify and had to redo the screen, so why not take this opportunity to learn something new in the process? Enter vuex.

... more

Keeping IPs alive without keepalived

We all have a service that we run with multiple instances of the same application, to keep it available even when one of them goes down. When we do that, we usually deploy a reverse proxy (or a load balancer) to direct the users of this service to the instances.

... more

A journey through the write caches

As is customary for any hardware, we had to test the new drives in one of our latest deployments, especially how they behave with respect to write caching. This specific deployment had RAID controllers set in JBOD mode, exposing all drives directly to the operating system, which we use as bluestore ceph osds.

... more

Heaviness of large ceph clusters

Data capacity is the first thing that comes to mind when talking about large ceph clusters, or data storage systems in general. The number of drives is another measure to think about. And sometimes maximum iops is something to look out for, especially when considering a full-flash / nvme cluster. But heaviness? What does that even mean?

... more

Simple networking for bare-metal kubernetes

One of the most important and also least understood parts of kubernetes is cluster networking. I guess you have already read through that and now you’re a bit confused about how to implement the required model, considering you were presented with about 30 different options. A bit more googling would reduce that number to a few of the more popular choices. Nevertheless, container networking is an involved subject and it’s easy to get lost in the details of any specific solution.

... more