Metrics Setup in Rust

Metrics provide insight into a system's overall performance and into the behavior of specific features, and they help you monitor its health over time.

Effective system monitoring and optimization require detailed metrics. This article will teach you how to use metrics in your Rust application to enhance observability, identify and address performance bottlenecks and security issues, and optimize overall efficiency.

Metric Types

Counter
- A cumulative metric that represents a monotonically increasing value. It is usually combined with a rate function to yield a value per unit of time.
- Counters help measure the total number of occurrences of an event, such as the number of requests processed.
Gauge
- A metric representing a single numerical value that can go up or down.
- Gauges are suitable for measuring fluctuating values, such as current memory usage or the number of requests in flight.
Histogram
- A metric that samples observations and counts them in configurable buckets.
- Histograms help you understand the distribution of values, like response times, allowing you to analyze performance across different percentiles.
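
To make the three types concrete, here is a minimal sketch of their semantics using only the standard library. The names Counter, Gauge, Histogram, and rate_per_sec are illustrative and are not the API of any metrics crate. The histogram keeps plain per-bucket counts for simplicity, whereas Prometheus exposes cumulative ones.

```rust
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};

/// Counter: a value that only ever goes up.
#[derive(Default)]
pub struct Counter(AtomicU64);

impl Counter {
    pub fn increment(&self, by: u64) {
        self.0.fetch_add(by, Ordering::Relaxed);
    }
    pub fn get(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Gauge: a single value that can go up or down.
#[derive(Default)]
pub struct Gauge(AtomicI64);

impl Gauge {
    pub fn increment(&self, by: i64) {
        self.0.fetch_add(by, Ordering::Relaxed);
    }
    pub fn decrement(&self, by: i64) {
        self.0.fetch_sub(by, Ordering::Relaxed);
    }
    pub fn get(&self) -> i64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Histogram: counts observations per bucket.
/// `bounds` are inclusive upper bounds in ascending order; the last
/// slot of `counts` collects everything above the highest bound.
pub struct Histogram {
    bounds: Vec<f64>,
    counts: Vec<u64>,
}

impl Histogram {
    pub fn new(bounds: Vec<f64>) -> Self {
        let counts = vec![0; bounds.len() + 1];
        Histogram { bounds, counts }
    }
    pub fn record(&mut self, value: f64) {
        let idx = self
            .bounds
            .iter()
            .position(|bound| value <= *bound)
            .unwrap_or(self.bounds.len());
        self.counts[idx] += 1;
    }
    pub fn counts(&self) -> &[u64] {
        &self.counts
    }
}

/// Per-second rate derived from two samples of the same counter.
/// `saturating_sub` avoids underflow if the counter was reset in between.
pub fn rate_per_sec(earlier: u64, later: u64, elapsed_secs: f64) -> f64 {
    later.saturating_sub(earlier) as f64 / elapsed_secs
}
```

A counter sampled at 100 and then at 160 thirty seconds later gives a rate of two events per second, which is the kind of "value per unit of time" a dashboard would typically plot.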

Tooling for Metric Visualization: Grafana and Prometheus

While collecting metrics is crucial, visualizing and analyzing them is equally important. Two popular tools, Prometheus and Grafana, stand out for this purpose, and both work well with Rust applications:

Prometheus: A leading open-source monitoring solution, Prometheus excels at collecting, storing, and querying metric data. With its powerful query language, PromQL, and scalable architecture, Prometheus is well-suited for monitoring modern, dynamic environments.
Grafana: Grafana complements Prometheus by providing rich visualization and dashboarding capabilities. Developers can create customizable dashboards to visualize metric data in real time, enabling deep insights into application performance and behavior.

By exporting metrics from metrics-rs to Prometheus and visualizing the collected data with Grafana, Rust developers can establish a comprehensive monitoring solution tailored to their specific requirements.
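
Prometheus collects data by periodically scraping an HTTP endpoint that returns metrics as plain text. The sketch below renders a single counter in that text format to show roughly what the scraper reads. render_counter is a hypothetical helper written for illustration; it skips label escaping, and in practice the exporter produces this output for you.

```rust
/// Renders one counter sample in the Prometheus text exposition format.
/// Hypothetical helper for illustration only: label values are not escaped.
pub fn render_counter(
    name: &str,
    help: &str,
    labels: &[(&str, &str)],
    value: u64,
) -> String {
    let mut out = String::new();
    // Metadata lines describe the metric and declare its type.
    out.push_str(&format!("# HELP {name} {help}\n"));
    out.push_str(&format!("# TYPE {name} counter\n"));
    if labels.is_empty() {
        out.push_str(&format!("{name} {value}\n"));
    } else {
        // Labels are rendered as key="value" pairs inside braces.
        let rendered: Vec<String> = labels
            .iter()
            .map(|(key, val)| format!("{key}=\"{val}\""))
            .collect();
        out.push_str(&format!("{name}{{{}}} {value}\n", rendered.join(",")));
    }
    out
}
```

Calling render_counter("tcp_server_loops", "Loop iterations.", &[("system", "foo")], 42) yields a HELP line, a TYPE line, and the sample line tcp_server_loops{system="foo"} 42.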

Libraries

OpenTelemetry Metrics

The opentelemetry crate is the official OpenTelemetry implementation for Rust. It is quite verbose: you use opentelemetry::metrics to instrument your app and then opentelemetry_otlp::metrics to export the data to Prometheus.

The Metrics API consists of these main components:

MeterProvider is the API entry point. It provides access to Meters.
Meter is the class responsible for creating instruments.
Instrument is responsible for reporting Measurements.
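
The relationship between these three components can be sketched with a toy model. The types below (ToyMeterProvider, ToyMeter, ToyCounter) are invented for illustration and are deliberately not the opentelemetry crate's API; they only mirror the chain of provider, meter, and instrument described above.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Shared storage for reported measurements, keyed by "scope/name".
type Store = Arc<Mutex<HashMap<String, u64>>>;

/// Toy stand-in for a MeterProvider: the entry point that hands out meters.
#[derive(Default)]
pub struct ToyMeterProvider {
    store: Store,
}

impl ToyMeterProvider {
    pub fn meter(&self, scope: &str) -> ToyMeter {
        ToyMeter {
            scope: scope.to_owned(),
            store: Arc::clone(&self.store),
        }
    }

    /// Reads back the accumulated value for a key (0 if never reported).
    pub fn value(&self, key: &str) -> u64 {
        self.store.lock().unwrap().get(key).copied().unwrap_or(0)
    }
}

/// Toy stand-in for a Meter: it creates instruments.
pub struct ToyMeter {
    scope: String,
    store: Store,
}

impl ToyMeter {
    pub fn counter(&self, name: &str) -> ToyCounter {
        ToyCounter {
            key: format!("{}/{}", self.scope, name),
            store: Arc::clone(&self.store),
        }
    }
}

/// Toy stand-in for an Instrument: it reports measurements.
pub struct ToyCounter {
    key: String,
    store: Store,
}

impl ToyCounter {
    pub fn add(&self, value: u64) {
        *self
            .store
            .lock()
            .unwrap()
            .entry(self.key.clone())
            .or_insert(0) += value;
    }
}
```

The three-step chain (get a meter from the provider, create an instrument from the meter, report through the instrument) is where much of the verbosity comes from.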

metrics-rs

In the Rust ecosystem, metrics-rs emerges as a powerful solution for instrumenting and collecting metrics within applications. Developed with simplicity, performance, and flexibility in mind, metrics-rs provides developers with a comprehensive toolkit for integrating metrics into their Rust projects.

It has macros that make it very easy to use, and the documentation is simple and straightforward.

It supports all the metric types we need and has a built-in exporter for Prometheus. Because it is widely used in the Rust community, there are plenty of examples and implementations from which to draw inspiration.

Getting Started with `metrics-rs`

First, we need to add the metrics crates to our project in Cargo.toml. The quanta and rand crates are only needed for the demo.

[dependencies]
metrics = "0.22.3"
metrics-exporter-prometheus = "0.14.0"
metrics-util = "0.16.3"
quanta = "0.12.3"
rand = "0.8.5"

We will then create a new module called our_metrics that will contain all our setup and configuration. Keeping it in a single place makes your code cleaner, and you can see all of your application's metrics at a glance.

mod our_metrics;

fn main() {
}

We will create our Metric struct in this module with name and description as properties. Both fields are &'static str rather than String so that metrics can be declared as constants, and both are public so that main.rs can read them.

pub struct Metric {
    pub name: &'static str,
    pub description: &'static str,
}

Using the previously created struct, we declare some dummy metrics we will use later in the demo. The descriptions are left empty here.

pub const TCP_SERVER_LOOP_DELTA_SECS: Metric = Metric {
    name: "tcp_server_loop_delta_secs",
    description: "",
};

pub const IDLE: Metric = Metric {
    name: "idle",
    description: "",
};

pub const LUCKY_ITERATIONS: Metric = Metric {
    name: "lucky_iterations",
    description: "",
};

pub const TCP_SERVER_LOOPS: Metric = Metric {
    name: "tcp_server_loops",
    description: "",
};

Then we will have three constants, one for each metric type: COUNTERS, GAUGES, and HISTOGRAMS. Each is an array of metrics. Think of these as buckets that group the metrics by type.

pub const COUNTERS: [Metric; 2] = [TCP_SERVER_LOOPS, IDLE];
pub const GAUGES: [Metric; 1] = [LUCKY_ITERATIONS];
pub const HISTOGRAMS: [Metric; 1] = [TCP_SERVER_LOOP_DELTA_SECS];
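
One benefit of keeping every metric in these arrays is that they are easy to validate. As a sketch, the hypothetical helper below (first_duplicate_name is not part of any crate) checks that no name is used twice across the groups, since the same name registered under two metric types would be ambiguous. It assumes a Metric whose fields are &'static str.

```rust
use std::collections::HashSet;

/// Same shape as the article's struct, assuming `&'static str` fields.
pub struct Metric {
    pub name: &'static str,
    pub description: &'static str,
}

/// Returns the first metric name that appears more than once across
/// all groups, or `None` if every name is unique.
/// Hypothetical helper for illustration.
pub fn first_duplicate_name(groups: &[&[Metric]]) -> Option<&'static str> {
    let mut seen = HashSet::new();
    for group in groups {
        for metric in group.iter() {
            // `insert` returns false when the name was already present.
            if !seen.insert(metric.name) {
                return Some(metric.name);
            }
        }
    }
    None
}
```

A check like first_duplicate_name(&[&COUNTERS, &GAUGES, &HISTOGRAMS]) could run in a unit test so that a copy-paste mistake is caught before deployment.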

At the end of the file, I like adding small utilities that make registering the metrics easier.

/// Registers a counter with the given name and description.
fn register_counter(metric: Metric) {
    metrics::describe_counter!(metric.name, metric.description);
    let _counter = metrics::counter!(metric.name);
}

/// Registers a gauge with the given name and description.
fn register_gauge(metric: Metric) {
    metrics::describe_gauge!(metric.name, metric.description);
    let _gauge = metrics::gauge!(metric.name);
}

/// Registers a histogram with the given name and description.
fn register_histogram(metric: Metric) {
    metrics::describe_histogram!(metric.name, metric.description);
    let _histogram = metrics::histogram!(metric.name);
}

We will then have a function called init_metrics. It should be called as early as possible in your program, because metrics emitted before the recorder is installed are not captured. Its job is to initialize the metrics that we want to track. The function does the following:

It initializes the Prometheus builder and configures options such as the HTTP listener and the idle timeout.
It loops through each of the previously created arrays and registers those metrics.

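// These imports belong at the top of our_metrics.rs.
use std::net::{IpAddr, Ipv4Addr, SocketAddr};
use std::time::Duration;

use metrics_exporter_prometheus::PrometheusBuilder;
use metrics_util::MetricKindMask;

pub fn init_metrics(port: &u16) {
    println!("initializing metrics exporter");

    // Install the recorder first: metrics registered before this point
    // would go to the default no-op recorder.
    PrometheusBuilder::new()
        // Drop counters and histograms that have not been updated for 10 seconds.
        .idle_timeout(
            MetricKindMask::COUNTER | MetricKindMask::HISTOGRAM,
            Some(Duration::from_secs(10)),
        )
        // Serve the scrape endpoint on all interfaces at the given port.
        .with_http_listener(SocketAddr::new(
            IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)),
            *port,
        ))
        .install()
        .expect("failed to install Prometheus recorder");

    for metric in COUNTERS {
        register_counter(metric);
    }

    for metric in GAUGES {
        register_gauge(metric);
    }

    for metric in HISTOGRAMS {
        register_histogram(metric);
    }
}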
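
The idle timeout deserves a closer look: as I understand the exporter's behavior, a metric of a masked kind that has not been updated within the timeout is dropped from the scrape output until it is emitted again. The toy model below sketches that eviction rule with the standard library only. IdleTracker is a made-up type for illustration and not how the exporter is implemented.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Toy registry that remembers when each metric was last updated.
/// Illustrative only; not the exporter's implementation.
pub struct IdleTracker {
    timeout: Duration,
    last_update: HashMap<String, Instant>,
}

impl IdleTracker {
    pub fn new(timeout: Duration) -> Self {
        IdleTracker {
            timeout,
            last_update: HashMap::new(),
        }
    }

    /// Records that `name` was updated at time `now`.
    pub fn touch(&mut self, name: &str, now: Instant) {
        self.last_update.insert(name.to_owned(), now);
    }

    /// Evicts metrics that have been idle for at least the timeout and
    /// returns the names that are still live, sorted alphabetically.
    pub fn live_metrics(&mut self, now: Instant) -> Vec<String> {
        let timeout = self.timeout;
        self.last_update
            .retain(|_, last| now.duration_since(*last) < timeout);
        let mut names: Vec<String> = self.last_update.keys().cloned().collect();
        names.sort();
        names
    }
}
```

With a 10-second timeout, a counter that is incremented once and never touched again stops appearing after 10 seconds, while a counter that is updated on every loop iteration stays visible.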

Returning to our main.rs file, we import the init_metrics function and call it at the start of main to initialize the metrics.

To test that our setup works correctly, we will add some demo code that uses the previously created metrics and updates them.

mod our_metrics;

use std::thread;
use std::time::Duration;

use metrics::{counter, gauge, histogram};

use quanta::Clock;
use rand::{thread_rng, Rng};

use crate::our_metrics::init_metrics;

fn main() {
    init_metrics(&3000);

    let clock = Clock::new();
    let mut last = None;

    counter!(our_metrics::IDLE.name).increment(1);

    // Loop over and over, pretending to do some work.
    loop {
        counter!(our_metrics::TCP_SERVER_LOOPS.name, "system" => "foo").increment(1);

        if let Some(t) = last {
            let delta: Duration = clock.now() - t;
            histogram!(our_metrics::TCP_SERVER_LOOP_DELTA_SECS.name, "system" => "foo").record(delta);
        }

        let increment_gauge = thread_rng().gen_bool(0.75);
        let gauge = gauge!(our_metrics::LUCKY_ITERATIONS.name);
        if increment_gauge {
            gauge.increment(1.0);
        } else {
            gauge.decrement(1.0);
        }

        last = Some(clock.now());

        thread::sleep(Duration::from_millis(750));
    }
}
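
Once the demo is running, you can check that the exporter is serving data by fetching the scrape endpoint. The sketch below does this with a plain TCP connection from the standard library. The scrape function is a hypothetical helper, and the /metrics path and HTTP/1.0 request are assumptions made for simplicity; a browser or curl pointed at port 3000 works just as well.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Fetches `path` from `addr` over plain HTTP and returns the response body.
/// Hypothetical helper for a quick manual check, not a full HTTP client.
pub fn scrape(addr: &str, path: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;

    // HTTP/1.0 is used so that the server closes the connection after
    // responding, which lets us simply read until end of stream.
    let request = format!("GET {path} HTTP/1.0\r\nHost: {addr}\r\n\r\n");
    stream.write_all(request.as_bytes())?;

    let mut response = String::new();
    stream.read_to_string(&mut response)?;

    // Headers and body are separated by an empty line.
    let body = response
        .split_once("\r\n\r\n")
        .map(|(_headers, body)| body)
        .unwrap_or("");
    Ok(body.to_owned())
}
```

With the demo listening on port 3000, scrape("127.0.0.1:3000", "/metrics") should return text containing the tcp_server_loops counter, the lucky_iterations gauge, and the tcp_server_loop_delta_secs histogram.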