Prometheus

For metrics collection.

Info

Initial Setup (Docker)

Includes instructions for both the normal mode (aka server mode) and agent mode (no local storage).

  1. (Note) See (Prometheus) Installation.
  2. (Server mode) Set CLI args:
    • Set retention time: --storage.tsdb.retention.time=15d (for 15 days)
    • Alternatively, set retention size: --storage.tsdb.retention.size=100GB (for 100GB)
    • (Note) The old storage.local.* and storage.remote.* flags no longer work.
  3. (Agent mode) Set CLI args:
    • Enable: --enable-feature=agent (Prometheus 2.x; Prometheus 3.x uses --agent instead)
    • (Note) You can mount the data path, but there is little point given how short-lived the data is.
  4. Configure mounts:
    • Config: ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
    • Data (server mode): ./data/:/prometheus/:rw
  5. Configure prometheus.yml.
    • I.e. set global variables (like scrape_interval, scrape_timeout and evaluation_interval) and scrape configs.
  6. (Optional) Set up Cortex or Thanos for a global view, HA and/or long-term storage. TODO: See Grafana Mimir too.
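
The server-mode steps above could be combined into a Docker Compose file roughly like the following. This is a minimal sketch, not a definitive setup: the service name, the image tag, the published port and the restart policy are assumptions. Only the retention flag and the two mounts come from the steps above.

```yaml
# docker-compose.yml (sketch; service name, image tag and port mapping are assumptions)
services:
  prometheus:
    image: prom/prometheus:latest
    command:
      # Overriding "command" replaces the image's default arguments,
      # so the config file and storage path are restated here.
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.path=/prometheus
      - --storage.tsdb.retention.time=15d
    volumes:
      # Config (read-only) and data (server mode), as listed in step 4.
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - ./data/:/prometheus/:rw
    ports:
      - "9090:9090"
    restart: unless-stopped
```

The image runs as an unprivileged user, so ./data/ needs to be writable by that user on the host. For agent mode, the retention flag and the data mount would be dropped and the agent flag added instead.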

Configuration
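
As a starting point for prometheus.yml, a minimal sketch with the global variables mentioned in the setup steps and a single scrape job might look like this. The interval values and the self-scrape target are illustrative assumptions, not recommendations from this page.

```yaml
# prometheus.yml (sketch; all values are illustrative)
global:
  scrape_interval: 15s      # how often targets are scraped
  scrape_timeout: 10s       # must not exceed scrape_interval
  evaluation_interval: 15s  # how often rules are evaluated

scrape_configs:
  # Prometheus scraping its own metrics endpoint.
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]
```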

Notes

Cortex and Thanos

TODO: This is outdated, see Grafana Mimir instead (based on Cortex).

Prometheus Exporters

General

List of Exporters and Software

This list contains exporters and software with built-in exposed metrics I typically use. Some are described in more detail in separate subsections.

Software with exposed metrics

Exporters

Special

Prometheus Node Exporter

Can be set up using Docker (prom/node-exporter), using the package manager (prometheus-node-exporter on Debian), or by building it from source. The Docker method provides a small level of protection, as the container is given only read-only system access. The package version is almost always out of date and is typically not the best choice. If Docker isn’t available and you want the latest version, build it from source.
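
For the Docker method, a Compose service along these lines is one way to give the exporter read-only access to the host. It is a sketch modelled on the pattern in the upstream node exporter README: the service name and image tag are assumptions, and the /host mount point is just a convention.

```yaml
# docker-compose.yml (sketch; service name and image tag are assumptions)
services:
  node-exporter:
    image: prom/node-exporter:latest
    command:
      # Tell the exporter that the host root filesystem is mounted at /host.
      - --path.rootfs=/host
    # Host network and PID namespaces so that network and process
    # metrics describe the host rather than the container.
    network_mode: host
    pid: host
    volumes:
      # Read-only view of the host filesystem.
      - /:/host:ro,rslave
    restart: unless-stopped
```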

Setup (Downloaded Binary)

See Building and running.

Details:

Instructions:

  1. Install requirements: apt install moreutils
  2. Find the link to the latest tarball from the download page.
  3. Download and extract it: wget <url> and tar xvf <file>
  4. Move the binary to the system: cp node_exporter*/node_exporter /usr/bin/prometheus-node-exporter
  5. Make sure it’s runnable: prometheus-node-exporter -h
  6. Add the user: useradd -r prometheus
    • If you have hidepid set up to hide system process details from normal users, remember to add the user to a group with access to that information. This is only required for some metrics; most of them work fine without this extra access.
  7. Create the required files and directories:
    • touch /etc/default/prometheus-node-exporter
    • mkdir -p /var/lib/prometheus/node-exporter/
  8. Create the systemd service /etc/systemd/system/prometheus-node-exporter.service, see prometheus-node-exporter.service.
  9. (Optional) Configure it:
    • The defaults work fine.
    • File: /etc/default/prometheus-node-exporter
    • Example: ARGS="--collector.processes --collector.interrupts --collector.systemd" (enables the process, interrupt and systemd collectors, which are disabled by default)
  10. Enable and start the service: systemctl enable --now prometheus-node-exporter
  11. (Optional) Set up textfile collectors.

Textfile Collector

Setup and Usage

  1. Set the collector script output directory using the CLI argument --collector.textfile.directory=<dir>.
    • Example dir: /var/lib/prometheus/node-exporter/
    • If the node exporter was installed as a package, it can be set in the ARGS variable in /etc/default/prometheus-node-exporter.
    • If using Docker, the CLI argument is specified as part of the command.
  2. Download the collector scripts and make them executable.
    • Example dir: /opt/prometheus/node-exporter/textfile-collectors/
  3. Add cron jobs for the scripts, using sponge to write to the output dir.
    • Make sure sponge is installed. For Debian, it’s found in the moreutils package.
    • Example cron file: /etc/cron.d/prometheus-node-exporter-textfile-collectors
    • Example cron entry: 0 * * * * root /opt/prometheus/node-exporter/textfile-collectors/apt.sh | sponge /var/lib/prometheus/node-exporter/apt.prom
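
When the node exporter runs in Docker, the textfile directory has to be both mounted into the container and passed as a CLI argument. The fragment below sketches only those two parts. The service name, the image tag and the in-container path /textfile are assumptions, and the host mounts needed for regular metrics are left out for brevity.

```yaml
# docker-compose.yml fragment (sketch; only the textfile-related settings are shown)
services:
  node-exporter:
    image: prom/node-exporter:latest
    command:
      # Path as seen from inside the container.
      - --collector.textfile.directory=/textfile
    volumes:
      # Host directory the cron jobs write their .prom files to.
      - /var/lib/prometheus/node-exporter/:/textfile:ro
```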

Collector Scripts

Some I typically use.

Prometheus Blackbox Exporter

Monitor Service Availability

Add an HTTP probe job for the services and query for probe success over time.

Example query: avg_over_time(probe_success{job="node"}[1d]) * 100
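
An HTTP probe job can be sketched as below, following the relabeling pattern from the blackbox exporter README. The target URL, the exporter address 127.0.0.1:9115 and the job name are assumptions, and http_2xx is the module name used in the exporter's example configuration. The job label in the query has to match whatever job_name is chosen here.

```yaml
# prometheus.yml fragment (sketch; target, exporter address and job name are assumptions)
scrape_configs:
  - job_name: blackbox
    metrics_path: /probe
    params:
      module: [http_2xx]  # module defined in the blackbox exporter config
    static_configs:
      - targets:
          - https://example.com
    relabel_configs:
      # Pass the listed target to the exporter as the "target" URL parameter.
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed URL as the instance label.
      - source_labels: [__param_target]
        target_label: instance
      # Scrape the blackbox exporter itself instead of the target.
      - target_label: __address__
        replacement: 127.0.0.1:9115
```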

Monitor for Expiring Certificates

Add an HTTP probe job for the services and query for probe_ssl_earliest_cert_expiry - time().

Example alert rule: probe_ssl_earliest_cert_expiry{job="blackbox"} - time() < 86400 * 30 (30 days)
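
Wrapped in a rule file, the expression above might look like the following sketch. The group name, alert name, "for" duration, severity label and summary text are all assumptions. The file also has to be listed under rule_files in prometheus.yml for it to be loaded.

```yaml
# Rule file (sketch; names, duration and labels are assumptions)
groups:
  - name: blackbox
    rules:
      - alert: CertificateExpiringSoon
        # Fires when the earliest certificate in the chain expires within 30 days.
        expr: probe_ssl_earliest_cert_expiry{job="blackbox"} - time() < 86400 * 30
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Certificate for {{ $labels.instance }} expires in less than 30 days"
```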

