An Introduction to Prometheus and Grafana

Prometheus is an open-source, metrics-based monitoring system.

It collects data from services and hosts by sending HTTP requests to their metrics endpoints.

It then stores the results in a time-series database and makes them available for analysis and alerting.

Why monitor?

Monitoring usually relates to events.

Monitoring a system could include profiling, logging, tracing, metrics, alerting & visualization.

This can be quite limiting.

Whitebox monitoring means monitoring the internals of a service.

It exposes data about the state and performance of the service's internal components.

It is important to track not just successful requests but also failed ones.

#2.Traffic

A measure of how much demand is being placed on your system.

For a web service, this is usually HTTP requests per second.

#3.Errors

The rate of requests that fail.
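As a rough illustration of the traffic and errors signals, the sketch below computes requests per second and the share of failed requests from a list of observed HTTP status codes. The sample data, the 60-second window, and the choice to count only 5xx responses as failures are assumptions made for this example.

```python
def traffic_and_error_rate(status_codes, window_seconds):
    """Return (requests per second, fraction of requests that failed)."""
    total = len(status_codes)
    # Assumption: only 5xx responses count as failures.
    failed = sum(1 for code in status_codes if code >= 500)
    requests_per_second = total / window_seconds
    error_ratio = failed / total if total else 0.0
    return requests_per_second, error_ratio


# Hypothetical sample: 120 requests in one minute, 6 of them failed.
codes = [200] * 114 + [500] * 4 + [503] * 2
rps, errors = traffic_and_error_rate(codes, window_seconds=60)
print(f"traffic: {rps:.1f} req/s, errors: {errors:.1%}")
```

In a real setup these numbers would typically be derived from metrics collected by Prometheus rather than from a raw list of status codes.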

#4.Saturation

How full your service is.

Latency increase is often an important indicator of saturation.

Many systems degrade in performance well before they reach 100% utilization.

A counter's value can never decrease, but it can be reset to zero, for example when the process restarts.

So, if a scrape fails, it only means a missed datapoint.

The cumulative increase is still available at the next scrape.
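To make the reset behavior concrete, here is a simplified sketch of how a total increase can be computed from successive readings of a cumulative value. It is not Prometheus's actual implementation; the function name `counter_increase` and the sample readings are hypothetical.

```python
def counter_increase(samples):
    """Total increase across successive counter readings, tolerating resets."""
    increase = 0.0
    for previous, current in zip(samples, samples[1:]):
        if current >= previous:
            increase += current - previous
        else:
            # The value dropped, so the counter was reset to zero;
            # everything counted since the reset is new.
            increase += current
    return increase


# Hypothetical readings: one scrape between 10 and 25 was missed,
# and the process restarted after the reading of 25.
readings = [0, 10, 25, 3, 8]
print(counter_increase(readings))  # 10 + 15 + 3 + 5 = 33.0
```

The missed scrape loses no information: the jump from 10 to 25 still carries the full increase for that period.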

Examples:

#2.Gauge

A gauge is a snapshot at any given point in time.

It can both increase and decrease.

#3.Histogram

A histogram samples observations and counts them in configurable buckets.

They are used for things like request duration or response sizes.

For example, you could measure request duration for a specific HTTP request.

The histogram will have a set of buckets, say 1 ms, 10 ms, and 25 ms.
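The bucket idea can be sketched in a few lines. The example below uses the 1 ms, 10 ms, and 25 ms bounds from above and counts cumulatively, so each bucket holds every observation at or below its upper bound, plus a final catch-all bucket. The function name and the sample durations are made up for illustration.

```python
import math


def observe_all(durations_ms, bounds_ms=(1, 10, 25)):
    """Count observations into cumulative buckets keyed by upper bound."""
    upper_bounds = list(bounds_ms) + [math.inf]  # catch-all bucket
    buckets = {bound: 0 for bound in upper_bounds}
    for duration in durations_ms:
        for bound in upper_bounds:
            # Cumulative: an observation lands in every bucket
            # whose upper bound it does not exceed.
            if duration <= bound:
                buckets[bound] += 1
    return buckets


# Hypothetical request durations in milliseconds.
durations = [0.4, 3, 7, 12, 40]
for bound, count in observe_all(durations).items():
    print(f"le={bound}: {count}")
```

With these inputs the buckets hold 1, 3, 4, and 5 observations respectively; the 40 ms request only appears in the catch-all bucket.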

#4.Summary

Similar to a histogram, a summary samples observations, typically request durations or response sizes.

For example, in one minute, you had three requests that took 2, 3, and 4 seconds.

The sum would be 9, and the count would be 3.

The average latency would be 9 / 3 = 3 seconds.
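The arithmetic above can be sketched as a tiny class that keeps only a running sum and count. `SimpleSummary` is a hypothetical name for this illustration and not part of any Prometheus library; real summaries can additionally report quantiles, which this sketch leaves out.

```python
class SimpleSummary:
    """Tracks only the running sum and count of observations."""

    def __init__(self):
        self.sum = 0.0
        self.count = 0

    def observe(self, value):
        self.sum += value
        self.count += 1

    def average(self):
        # Average is sum / count; guard against division by zero.
        return self.sum / self.count if self.count else 0.0


latency = SimpleSummary()
for seconds in (2, 3, 4):
    latency.observe(seconds)

print(latency.sum, latency.count, latency.average())  # 9.0 3 3.0
```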

Scraping

Prometheus is a pull-based system.

To fetch metrics, Prometheus sends an HTTP request called a scrape.

It sends scrapes to targets based on its configuration.
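To give a feel for what a scrape involves, the sketch below fetches a metrics page over HTTP and parses the text response into name/value pairs. It is a deliberately simplified reader, not Prometheus's own parser: it assumes one sample per line with no trailing timestamp, and the metric names in `example_body` are illustrative.

```python
import urllib.request


def parse_samples(text):
    """Parse text-format metrics into {series: value}, skipping comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and HELP/TYPE comments
        # Assumption: the value is the last whitespace-separated token.
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples


def scrape(url, timeout=5.0):
    """Send one HTTP GET to a metrics endpoint and parse the response."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_samples(response.read().decode("utf-8"))


# Illustrative response body, as a target might return it.
example_body = """\
# HELP http_requests_total Total HTTP requests.
# TYPE http_requests_total counter
http_requests_total{code="200"} 1027
http_requests_total{code="500"} 3
"""
print(parse_samples(example_body))
```

Against a running target, `scrape("http://localhost:8000/metrics")` would do the same over the network; that URL is a placeholder.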

There are more time-series databases for monitoring solutions you may want to explore.

Client libraries

To monitor a service, you need to add instrumentation to your code.

There are client libraries available for all popular languages and runtimes.

Using these libraries, once you add a few lines of code, your code can start emitting metrics.

This is called direct instrumentation.

These libraries enable you to define internal metrics and also expose them via an HTTP endpoint.

When Prometheus scrapes the metrics HTTP endpoint, the client library sends the metrics to the server.
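The following sketch shows the shape of direct instrumentation without depending on any library. `MiniCounter` is a hand-rolled stand-in written for this example, not the official Python client API, and the metric name and `process_order` function are hypothetical. An official client library would provide the metric types and the HTTP endpoint for you.

```python
class MiniCounter:
    """A hand-rolled stand-in for a client-library counter."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount

    def render(self):
        """Render the metric as text that a scrape could return."""
        return (
            f"# HELP {self.name} {self.help_text}\n"
            f"# TYPE {self.name} counter\n"
            f"{self.name} {self.value}\n"
        )


# Define the metric once, next to the code it measures.
orders_total = MiniCounter("orders_processed_total", "Orders processed.")


def process_order(order):
    # ... business logic would go here ...
    orders_total.inc()  # the one line of instrumentation


for order in ("a", "b", "c"):
    process_order(order)

print(orders_total.render(), end="")
```

The instrumentation itself is a single `inc()` call; everything else is the plumbing a client library normally hides.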

Official client libraries are offered by Prometheus for Go, Java, Python, and Ruby.

Prometheus has an open ecosystem.

Exporters

Many applications expose metrics in a non-Prometheus format.

Examples include MySQL, Kafka, JMX, HAProxy, and NGINX.

In these scenarios, you make use of exporters.

An exporter is a tool you deploy along with the program you want metrics from.

An exporter acts like a proxy between the app and Prometheus.
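The translation step an exporter performs might look roughly like the sketch below. It assumes a hypothetical application that reports its status as a flat JSON object; the `myapp` prefix and the field names are invented for the example, and real exporters handle far more detail (labels, metric types, error handling).

```python
import json


def translate(status_json, prefix="myapp"):
    """Turn a flat JSON status document into text-format metric lines."""
    stats = json.loads(status_json)
    lines = []
    for key in sorted(stats):
        value = stats[key]
        # Only numeric fields can become metrics; skip strings and booleans.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        lines.append(f"{prefix}_{key} {float(value)}")
    return "".join(line + "\n" for line in lines)


# Hypothetical status output from the application being monitored.
app_status = '{"connections": 12, "uptime_seconds": 3600, "version": "1.4"}'
print(translate(app_status), end="")
```

An exporter would run this kind of translation each time Prometheus scrapes it, fetching fresh data from the application on every request.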

This can be done using static configuration.

In dynamic environments, where targets come and go, a static list is impractical, so service discovery is used instead.

Alerting

Alerting with Prometheus consists of two parts.

Alerting rules send alerts to the Alertmanager.

The Alertmanager then manages those alerts.

It sends out notifications using many out-of-the-box integrations such as email, Slack, HipChat, and PagerDuty.

The Alertmanager can also perform silencing or aggregation to reduce the number of notifications.
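The effect of silencing and aggregation can be sketched as follows. This is a simplified model, not the Alertmanager's implementation: it silences by alert name only and groups by a fixed set of labels, and all alert names, labels, and instances are hypothetical.

```python
from collections import defaultdict


def notifications(alerts, group_by, silenced_names=()):
    """Drop silenced alerts, then group the rest into one notification per key."""
    groups = defaultdict(list)
    for alert in alerts:
        if alert["alertname"] in silenced_names:
            continue  # silenced: no notification at all
        key = tuple(alert.get(label, "") for label in group_by)
        groups[key].append(alert)
    return dict(groups)


# Hypothetical firing alerts.
alerts = [
    {"alertname": "HighLatency", "service": "api", "instance": "a:9100"},
    {"alertname": "HighLatency", "service": "api", "instance": "b:9100"},
    {"alertname": "DiskFull", "service": "db", "instance": "c:9100"},
    {"alertname": "Maintenance", "service": "db", "instance": "c:9100"},
]

grouped = notifications(
    alerts,
    group_by=("alertname", "service"),
    silenced_names={"Maintenance"},
)
print(len(alerts), "alerts ->", len(grouped), "notifications")
```

Four firing alerts collapse into two notifications: the two `HighLatency` alerts are merged, and the silenced one is dropped.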

Here is the guide to monitoring a Linux server using Prometheus and Dashboard.

Grafana fully integrates with Prometheus and can produce a wide variety of dashboards.

You will need to configure Prometheus as the data source for Grafana.

Summary

Prometheus has very few requirements.

It can be quite simple to run as it is a single binary with a configuration file.

It can handle thousands of targets and ingest millions of samples per second.

Prometheus is designed to track the overall health and behavior of a system.

Grafana is a widely used tool for visualizing metrics and integrates seamlessly with Prometheus.
