Prometheus

“Prometheus is a systems monitoring and alerting toolkit”.

Everyone starts monitoring by SSHing into nodes and running basic tools, and everyone knows it gets tiring real fast. Any system with more than a few components will require a monitoring system. Prometheus and the tools in its ecosystem provide an easy-to-run, easy-to-use solution.

While more established tools like Zabbix and Cacti are pretty solid, and older tools like Nagios and Icinga are still useful, Prometheus, designed around ideas from running data centers at Google scale, has a lot to offer. We started using it well before the first stable release for the sheer power of its aggregation capabilities. Then we took it to cloud environments, where it helped cut the Nagios alerting mess back to sane levels. And of course to Kubernetes, where it really shines: run a single container, let it discover resources, and start monitoring. No external databases to maintain, no extra collectors to deploy.

And who doesn’t love visualized metrics, especially when they’re summarized in coherent dashboards? Everyone’s favorite, the fabulous Grafana, has included built-in support for Prometheus almost since its inception. We’ve actually been using Grafana for longer, building dashboards for OpenTSDB, Elasticsearch and even our custom data sources, but the experience with Prometheus, especially in the latest releases, is unmatched.