Validator Monitoring

DENTNet nodes can expose metrics for Prometheus, which you can use to monitor the performance of your validator nodes.

You can follow these steps to set up monitoring, including Grafana, or use your own setup.

This document is an example of how to set up monitoring. Please perform your setup based on your network structure and security setup.

Set up a dedicated monitoring machine

Monitoring should run on a separate machine from your node. The monitoring machine accesses the node on port 9615 to fetch the current status.

Make sure your network and firewall setup allows this communication.

We strongly recommend opening up port 9615 only between your monitoring machine and your node.

Enable Prometheus port on the DENTNet node(s)

Connect to each DENTNet node you want to monitor, find its docker-compose.yml, and edit it:

Add a new mapping in the ports: section

- "9615:9615"

Add the lines to the command: section

- "--prometheus-external"
- "--prometheus-port=9615"

Save the file and restart the container for the new settings to take effect:

docker-compose up -d
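Once the container is back up, it can help to confirm that the node actually serves metrics before you configure Prometheus. The following is a minimal sketch, assuming curl is available on the monitoring machine; the helper name count_metric_samples is hypothetical and <node-ip> is a placeholder for your node's address. It counts sample lines in the Prometheus text format, where lines starting with # are comments.

```shell
# Hypothetical helper: count metric samples in Prometheus text format read from stdin.
# Comment lines (starting with '#') and blank lines are skipped.
count_metric_samples() {
  awk '!/^#/ && NF { n++ } END { print n + 0 }'
}

# Usage from the monitoring machine (replace <node-ip> with your node's address):
#   curl -s "http://<node-ip>:9615/metrics" | count_metric_samples
```

A result greater than 0 suggests the endpoint is reachable and serving metrics; 0 usually points to a firewall, port mapping, or flag problem.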

Download Prometheus and Alertmanager

First, connect to your monitoring machine.

Download Prometheus at https://prometheus.io/download/ and untar using

tar xvfz prometheus-*.tar.gz

Download Alertmanager from https://prometheus.io/download/ and untar using

tar xvfz alertmanager-*.tar.gz

Edit Prometheus Configuration

Create the Prometheus config directory

mkdir /etc/prometheus

Create a file at /etc/prometheus/rules.yml to set up how you will be alerted. Add this sample content or modify it as needed:

groups:
  - name: alert_rules
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance [{{ $labels.instance }}] down"
          description: "[{{ $labels.instance }}] of job [{{ $labels.job }}] has been down for more than 5 minutes."
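Before starting Prometheus, you may want to check the rules file for syntax errors. The Prometheus download includes promtool, which provides a "check rules" command. The sketch below assumes you run it from the extracted Prometheus directory; the PROMTOOL variable and the function name check_rules_file are hypothetical.

```shell
# Assumption: promtool sits in the extracted Prometheus directory; override PROMTOOL if not.
PROMTOOL="${PROMTOOL:-./promtool}"

# Hypothetical wrapper: validate a rules file with promtool and report the result.
check_rules_file() {
  rules_file="${1:-/etc/prometheus/rules.yml}"
  if "$PROMTOOL" check rules "$rules_file"; then
    echo "rules OK: $rules_file"
  else
    echo "rules INVALID: $rules_file" >&2
    return 1
  fi
}

# Usage, from the extracted Prometheus directory:
#   check_rules_file /etc/prometheus/rules.yml
```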

Next, create a file at /etc/prometheus/prometheus.yml.

Ensure you enter your nodes' correct IPs in the last line of the file ("targets").

prometheus.yml
global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 15s
alerting:
  alertmanagers:
  - static_configs:
    - targets: [127.0.0.1:9093]
    scheme: http
    timeout: 10s
    api_version: v2
rule_files:
  - '/etc/prometheus/rules.yml'
scrape_configs:
- job_name: prometheus
  honor_timestamps: true
  scrape_interval: 15s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  static_configs:
  - targets:
    - localhost:9090
- job_name: "substrate_node"
  scrape_interval: 5s
  static_configs:
  - targets: ["<node-1-ip>:9615", "<node-2-ip>:9615"]
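To illustrate what the finished targets line should look like, here is a small sketch that renders it from a list of node IPs. The function name render_targets is hypothetical, and the addresses in the usage example come from the 192.0.2.0/24 documentation range, not from a real setup.

```shell
# Hypothetical helper: print a static_configs "targets" line for the given node IPs,
# using the Prometheus port 9615 enabled on the nodes.
render_targets() {
  line=""
  for ip in "$@"; do
    # Add a comma separator before every entry except the first.
    line="${line:+$line, }\"${ip}:9615\""
  done
  printf '  - targets: [%s]\n' "$line"
}

# Usage with example addresses:
#   render_targets 192.0.2.10 192.0.2.11
# prints:
#   - targets: ["192.0.2.10:9615", "192.0.2.11:9615"]
```

Each entry must be reachable from the monitoring machine on port 9615.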

Edit Alertmanager Configuration

Create the Alertmanager config directory

mkdir /etc/alertmanager

Create a configuration file named alertmanager.yml under /etc/alertmanager. Edit alertmanager.yml to fit your needs.

An example file with SMTP email alerts

global:
 resolve_timeout: 1m

route:
 receiver: 'smtp-notifications'

receivers:
- name: 'smtp-notifications'
  email_configs:
  - to: you@example.com  # placeholder: address that receives the alerts
    from: alertmanager@example.com  # placeholder: sender address
    smarthost: smtp.example.com:587
    auth_username: my-user
    auth_password: my-password
    send_resolved: true
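Because the example file contains sample values, it is easy to start Alertmanager with one of them still in place. The sketch below lists any lines that still contain them; the function name find_sample_values is hypothetical, and the patterns cover only the sample values shown in this example.

```shell
# Hypothetical helper: list lines in a config file that still contain the sample
# values from the example (example.com, my-user, my-password).
# Exits with status 0 if any are found, 1 if none are found.
find_sample_values() {
  grep -n -E 'example\.com|my-user|my-password' "$1"
}

# Usage:
#   find_sample_values /etc/alertmanager/alertmanager.yml \
#     && echo "sample values still present, edit them before starting Alertmanager"
```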

Start Prometheus and Alertmanager

Start Alertmanager and Prometheus, e.g. by entering

# change to alertmanager directory
./alertmanager --config.file=/etc/alertmanager/alertmanager.yml &
# change to prometheus directory
./prometheus --config.file=/etc/prometheus/prometheus.yml &

You can check the output for error messages to make sure your setup is working.
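Besides reading the output, you can query the health endpoints that Prometheus and Alertmanager expose under /-/healthy. This is a minimal sketch, assuming curl is installed and both services use the default ports from this guide; the function name is_healthy is hypothetical.

```shell
# Hypothetical helper: succeed if the URL answers within 5 seconds without a
# connection error or an HTTP error status (curl -f fails on 4xx/5xx responses).
is_healthy() {
  curl -fsS --max-time 5 -o /dev/null "$1"
}

# Usage on the monitoring machine:
#   is_healthy http://localhost:9090/-/healthy && echo "Prometheus is up"
#   is_healthy http://localhost:9093/-/healthy && echo "Alertmanager is up"
```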

Add Grafana

Follow the instructions to download and install Grafana at https://grafana.com/docs/grafana/latest/setup-grafana/installation/

In your Grafana installation add the data source for your Prometheus instance, e.g. by adding a datasource.yml file:

vi /etc/grafana/provisioning/datasources/datasource.yml

Insert the needed settings:

apiVersion: 1

datasources:
- name: Prometheus
  type: prometheus
  url: http://localhost:9090
  isDefault: true
  access: proxy
  editable: true

Finally, restart Grafana on your monitoring machine.
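If you prefer to script this step instead of editing the file in vi, the sketch below writes the same data source settings non-interactively. The function name write_datasource is hypothetical; the restart command in the usage comment assumes a systemd-based package installation of Grafana, so adjust it for other installation methods.

```shell
# Hypothetical helper: write the Prometheus data source file into the given
# provisioning directory (default: the Grafana path used in this guide).
write_datasource() {
  dir="${1:-/etc/grafana/provisioning/datasources}"
  mkdir -p "$dir" || return 1
  cat > "$dir/datasource.yml" <<'EOF'
apiVersion: 1

datasources:
- name: Prometheus
  type: prometheus
  url: http://localhost:9090
  isDefault: true
  access: proxy
  editable: true
EOF
}

# Usage (as root, since /etc/grafana is usually not writable by normal users):
#   write_datasource
#   systemctl restart grafana-server   # assumption: systemd package install
```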

Open Grafana in Browser

Open the Grafana web interface using your monitoring machine's address and the Grafana port, such as http://localhost:3000. The default username and password are admin/admin; Grafana prompts you to change the password on first login.

Add a Dashboard

Create a new dashboard using the Import function: enter the URL https://grafana.com/grafana/dashboards/13759-substrate-node-template-metrics/ and select your Prometheus data source from the dropdown.

You can now see the stats of your validator(s) on Grafana and you will be alerted via email if an instance is down.
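To confirm that email delivery works without waiting for a real outage, you can post a test alert to Alertmanager's v2 API. The sketch below only builds the JSON payload; the function name build_test_alert and the label values are illustrative assumptions, and the usage comment assumes Alertmanager listens on its default port 9093.

```shell
# Hypothetical helper: build a JSON payload with one test alert for the
# Alertmanager v2 API. The label and annotation values are illustrative.
build_test_alert() {
  name="${1:-TestAlert}"
  printf '[{"labels":{"alertname":"%s","severity":"critical"},"annotations":{"summary":"Manual test alert"}}]\n' "$name"
}

# Usage on the monitoring machine:
#   build_test_alert | curl -fsS -X POST -H 'Content-Type: application/json' \
#     --data @- http://localhost:9093/api/v2/alerts
```

If the SMTP settings are correct, an email should arrive at the configured address shortly afterwards.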
