Prometheus + Grafana: a monitoring stack for VPS and servers
Published: April 10, 2026 · Category: VPS / Monitoring
Prometheus and Grafana are today the de facto standard for monitoring in cloud-native and on-premise environments. Prometheus collects metrics with a pull model: it scrapes the exporters itself (every 15 seconds in this setup) and stores the data in a local time series database. Grafana visualizes that data on dashboards with charts and alerts. This article walks you through a full installation: Prometheus + node_exporter (system metrics), Grafana (dashboards), and Alertmanager (Slack/email notifications). The whole setup takes about an hour.
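To make the pull model more concrete: on every scrape Prometheus issues an HTTP GET against the exporter's /metrics endpoint and reads back plain-text sample lines. The sketch below parses such lines with the Python standard library only. It is a simplified illustration, not Prometheus's own parser; the helper name parse_sample and the sample body SCRAPE are made up for this example.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Parse one exposition-format line into (name, labels, value), or None."""
    line = line.strip()
    if not line or line.startswith("#"):  # blank line or HELP/TYPE comment
        return None
    match = SAMPLE_RE.match(line)
    if match is None:
        return None
    name, raw_labels, raw_value = match.groups()
    labels = dict(LABEL_RE.findall(raw_labels or ""))
    return name, labels, float(raw_value)


# Hypothetical scrape body, shaped like node_exporter output.
SCRAPE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
"""

samples = [s for s in map(parse_sample, SCRAPE.splitlines()) if s is not None]
for name, labels, value in samples:
    print(name, labels, value)
```

Each parsed tuple corresponds to one time series sample; Prometheus adds the scrape timestamp and the job and instance labels on its side.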
Architecture of the Prometheus stack
| Component | Role | Port |
|---|---|---|
| Prometheus | Metric scraper + TSDB + PromQL + alerting rules | 9090 |
| node_exporter | Linux system metrics: CPU, RAM, disk, network | 9100 |
| Grafana | Dashboards, charts, visual alerts | 3000 |
| Alertmanager | Alert routing: Slack, email, PagerDuty, webhook | 9093 |
| Blackbox Exporter | HTTP/HTTPS/TCP/ICMP probes: URL availability monitoring | 9115 |
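Once the components are running, the ports from the table can be probed with a few lines of Python. This is a minimal sketch that assumes every service is reachable on 127.0.0.1 and uses endpoints each component normally serves (/-/healthy for Prometheus and Alertmanager, /api/health for Grafana, /metrics for the exporters). The names ENDPOINTS, endpoint_url and is_up are illustrative.

```python
import urllib.request

# Ports come from the table above; paths are each component's usual HTTP endpoint.
ENDPOINTS = {
    "prometheus": (9090, "/-/healthy"),
    "node_exporter": (9100, "/metrics"),
    "grafana": (3000, "/api/health"),
    "alertmanager": (9093, "/-/healthy"),
    "blackbox_exporter": (9115, "/metrics"),
}


def endpoint_url(component, host="127.0.0.1"):
    """Build the URL to probe for a given component."""
    port, path = ENDPOINTS[component]
    return f"http://{host}:{port}{path}"


def is_up(component, host="127.0.0.1", timeout=3.0):
    """Return True if the component answers with HTTP 200."""
    try:
        with urllib.request.urlopen(endpoint_url(component, host), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # connection refused, timeout, or an HTTP error status
        return False
```

Calling is_up(name) for every key in ENDPOINTS gives a quick overview of which parts of the stack respond.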
Installation with Docker Compose
The most convenient way to bring up the whole stack is Docker Compose. Create the directory /opt/monitoring/ and, inside it, a docker-compose.yml file:
version: "3.8"
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./rules/:/etc/prometheus/rules/
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=15d'
      - '--web.enable-lifecycle'
    ports:
      - "127.0.0.1:9090:9090"
    restart: unless-stopped
  grafana:
    image: grafana/grafana:latest
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=change-this-password
      - GF_USERS_ALLOW_SIGN_UP=false
    ports:
      - "127.0.0.1:3000:3000"
    restart: unless-stopped
  alertmanager:
    image: prom/alertmanager:latest
    volumes:
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml
    ports:
      - "127.0.0.1:9093:9093"
    restart: unless-stopped
  node_exporter:
    image: prom/node-exporter:latest
    network_mode: host
    pid: host
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      # read filesystem stats from the mounted host root, not the container's
      - '--path.rootfs=/rootfs'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    restart: unless-stopped
  # The blackbox job in prometheus.yml scrapes blackbox:9115, so the service must exist.
  blackbox:
    image: prom/blackbox-exporter:latest
    restart: unless-stopped
volumes:
  prometheus_data:
  grafana_data:
Configuring prometheus.yml
# /opt/monitoring/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    env: 'production'
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']
rule_files:
  - 'rules/*.yml'
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node'
    static_configs:
      - targets:
          # monitoring server; when Prometheus runs in a container, localhost is
          # the container itself, so use an address reachable from inside it
          - 'localhost:9100'
          - 'web01:9100'      # web server 1
          - 'web02:9100'      # web server 2
          - 'db01:9100'       # database server
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https://example.com
          - https://api.example.com/health
    relabel_configs:
      # pass the probed URL as ?target=..., keep it as the instance label,
      # and send the scrape itself to the blackbox exporter
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox:9115
Installing node_exporter on servers without Docker
# Download node_exporter (check the current version at github.com/prometheus/node_exporter)
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz
tar xvf node_exporter-1.8.2.linux-amd64.tar.gz
sudo mv node_exporter-1.8.2.linux-amd64/node_exporter /usr/local/bin/
sudo useradd --no-create-home --shell /bin/false node_exporter
# systemd service
sudo tee /etc/systemd/system/node_exporter.service <<'UNIT'
[Unit]
Description=Node Exporter
After=network.target
[Service]
User=node_exporter
ExecStart=/usr/local/bin/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
UNIT
sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter
# Check that it works
curl http://localhost:9100/metrics | grep node_cpu
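The curl check confirms only that CPU metrics are present. A slightly broader check, sketched below, verifies that the exporter exposes the metric names the queries in this article rely on. The REQUIRED list and the helper names are assumptions made for this example; adjust the list to the metrics you actually use.

```python
import urllib.request

# Metric names used by the PromQL queries in this article.
REQUIRED = [
    "node_cpu_seconds_total",
    "node_memory_MemAvailable_bytes",
    "node_memory_MemTotal_bytes",
    "node_filesystem_avail_bytes",
    "node_boot_time_seconds",
]


def metric_names(body):
    """Collect the metric names found in an exposition-format body."""
    names = set()
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The name ends at the first "{" or at the first whitespace.
        head = line.split("{", 1)[0].split()
        if head:
            names.add(head[0])
    return names


def missing_metrics(body, required=REQUIRED):
    """Return the required metric names that the body does not contain."""
    present = metric_names(body)
    return [name for name in required if name not in present]


def fetch_metrics(url="http://localhost:9100/metrics", timeout=5.0):
    """Download the /metrics page of a node_exporter instance."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Running missing_metrics(fetch_metrics()) on a freshly installed host should return an empty list; any name it does return points to a disabled collector or an older exporter version.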
PromQL: basic queries
PromQL is a powerful query language for analyzing metrics. Below are the most important queries for server monitoring; you can paste them directly into Grafana when creating panels:
# CPU usage in percent
100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
# Used RAM (%)
100 * (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes))
# Disk usage (%)
100 * (1 - (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}))
# Network traffic (MiB/s), transmitted
rate(node_network_transmit_bytes_total{device!="lo"}[5m]) / 1024 / 1024
# 1-minute load average
node_load1
# Number of processes in the running (runnable) state
node_procs_running
# Server uptime (days)
(time() - node_boot_time_seconds) / 86400
# HTTP requests per second (if you run nginx-prometheus-exporter)
rate(nginx_http_requests_total[5m])
# P95 latency (with histogram_quantile)
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
Alerting rules
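The HighCpuUsage rule applies a threshold to the same CPU expression that appears in the query list. As a rough illustration of what that expression computes, the sketch below reproduces the arithmetic on two samples of the idle counter per core. It is deliberately simplified: the real rate() function also handles counter resets and extrapolates to the window edges. The sample values are invented.

```python
def rate(first, last):
    """Per-second increase between two (timestamp, counter_value) samples."""
    (t0, v0), (t1, v1) = first, last
    return (v1 - v0) / (t1 - t0)


def cpu_usage_percent(idle_samples):
    """idle_samples maps a CPU id to the (first, last) samples of its idle counter."""
    idle_rates = [rate(first, last) for first, last in idle_samples.values()]
    avg_idle = sum(idle_rates) / len(idle_rates)  # avg by(instance)
    return 100 - avg_idle * 100


# Hypothetical 5-minute window on a 2-core host.
window = {
    "0": ((0, 1000.0), (300, 1240.0)),  # idle for 240 s out of 300 s
    "1": ((0, 2000.0), (300, 2270.0)),  # idle for 270 s out of 300 s
}
usage = cpu_usage_percent(window)
print(f"CPU usage: {usage:.1f}%  alert condition met: {usage > 85}")
```

The idle counter grows by at most one second per second on each core, so its rate is the idle fraction, and subtracting the average from 100 gives the busy percentage that the rule compares against 85.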
# /opt/monitoring/rules/node_alerts.yml
groups:
  - name: node_alerts
    rules:
      - alert: HighCpuUsage
        expr: 100 - (avg by(instance)(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU: {{ $value | printf \"%.1f\" }}% over the last 5 minutes"
      - alert: HighMemoryUsage
        expr: 100 * (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) > 90
        for: 3m
        labels:
          severity: critical
        annotations:
          summary: "Critical RAM usage on {{ $labels.instance }}"
      - alert: DiskSpaceLow
        expr: 100 * (1 - (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"})) > 85
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Low disk space: {{ $labels.instance }}"
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Server unreachable: {{ $labels.instance }}"
Alertmanager: routing to Slack and email
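Prometheus hands an alert to Alertmanager only once its expression has stayed true for the rule's whole for: duration; until then the alert is merely pending. The sketch below models that behaviour on a list of evaluation results. It is an approximation of the semantics for illustration, with an invented function name, not code taken from Prometheus.

```python
def firing_time(evaluations, hold_seconds):
    """Return the first evaluation timestamp at which the alert fires, or None.

    evaluations: list of (timestamp_seconds, condition_is_true), oldest first.
    hold_seconds: the rule's for: duration in seconds.
    """
    active_since = None
    for ts, is_true in evaluations:
        if not is_true:
            active_since = None  # condition cleared: the pending state resets
            continue
        if active_since is None:
            active_since = ts  # the alert becomes pending
        if ts - active_since >= hold_seconds:
            return ts  # the alert becomes firing
    return None


# InstanceDown uses for: 1m; rules are evaluated every 15 s in this setup.
steady = [(t, True) for t in range(0, 121, 15)]
flapping = [(0, True), (15, True), (30, False), (45, True),
            (60, True), (75, True), (90, True), (105, True)]

print(firing_time(steady, 60))
print(firing_time(flapping, 60))
```

A single evaluation where the condition is false restarts the clock, which is why short spikes do not page anyone while a sustained outage does.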
# /opt/monitoring/alertmanager.yml
global:
  smtp_from: 'alertmanager@example.com'           # placeholder address, use your own
  smtp_smarthost: 'smtp.example.com:587'
  smtp_auth_username: 'alertmanager@example.com'  # placeholder address, use your own
  smtp_auth_password: 'smtp-password'
route:
  receiver: 'slack-ops'
  group_by: ['alertname', 'instance']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    - match:
        severity: critical
      receiver: 'pagerduty-critical'
    - match:
        severity: warning
      receiver: 'slack-ops'
receivers:
  - name: 'slack-ops'
    slack_configs:
      # replace with your own Slack webhook URL
      - api_url: 'https://hooks.slack.com/services/TWOJ-WEBHOOK'
        channel: '#ops-alerts'
        title: '{{ .GroupLabels.alertname }}'
        text: '{{ range .Alerts }}{{ .Annotations.summary }} {{ end }}'
  - name: 'pagerduty-critical'
    pagerduty_configs:
      - service_key: 'your-pagerduty-key'
# When a host is down, suppress its other alerts
inhibit_rules:
  - source_match:
      alertname: 'InstanceDown'
    target_match_re:
      alertname: '.*'
    equal: ['instance']
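The routing tree and the inhibit rule in this file can be traced by hand. The sketch below mirrors them in plain Python, assuming Alertmanager's default behaviour in which the first matching child route wins and the top-level receiver is the fallback. It models only this particular configuration; the function names are illustrative.

```python
DEFAULT_RECEIVER = "slack-ops"
# Child routes in the order they appear in alertmanager.yml.
ROUTES = [
    ({"severity": "critical"}, "pagerduty-critical"),
    ({"severity": "warning"}, "slack-ops"),
]


def pick_receiver(labels):
    """Return the receiver for an alert: first matching child route, else the default."""
    for match, receiver in ROUTES:
        if all(labels.get(key) == value for key, value in match.items()):
            return receiver
    return DEFAULT_RECEIVER


def is_inhibited(alert, firing_alerts):
    """Apply the inhibit rule: InstanceDown mutes other alerts on the same instance."""
    if alert.get("alertname") == "InstanceDown":
        return False  # the source alert does not mute itself
    return any(
        other.get("alertname") == "InstanceDown"
        and other.get("instance") == alert.get("instance")
        for other in firing_alerts
    )


down = {"alertname": "InstanceDown", "severity": "critical", "instance": "web01:9100"}
cpu = {"alertname": "HighCpuUsage", "severity": "warning", "instance": "web01:9100"}
print(pick_receiver(down), pick_receiver(cpu), is_inhibited(cpu, [down]))
```

With this configuration a crashed web01 produces one PagerDuty notification for InstanceDown, while the CPU, memory, and disk alerts for the same instance stay silent.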
Grafana: importing ready-made dashboards
Grafana hosts a repository of ready-made dashboards at grafana.com/grafana/dashboards.
The most useful IDs to import (Dashboards → Import → paste the ID):
- ID 1860: Node Exporter Full, a complete Linux system dashboard (CPU, RAM, disk, network, temperature, processes)
- ID 13978: Node Exporter Quickstart, a simplified, readable overview
- ID 9614: NGINX, metrics from nginx-prometheus-exporter (RPS, HTTP status codes, connections)
- ID 9628: PostgreSQL Database, metrics from postgres_exporter
- ID 14191: Blackbox Exporter, HTTP probe status and SSL expiry countdown
After the import, set the data source to Prometheus and pick the right instance from the dropdown. The ready-made dashboards cover roughly 80% of monitoring needs without writing your own PromQL queries.
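For the cases the imported dashboards do not cover, the same PromQL can be run outside Grafana through the Prometheus HTTP API (GET /api/v1/query). The sketch below builds the request URL and flattens an instant-query response into a dictionary keyed by instance. It assumes Prometheus listens on 127.0.0.1:9090 as in the Compose file; the helper names are made up for this example.

```python
import json
import urllib.parse
import urllib.request


def query_url(expr, base="http://127.0.0.1:9090"):
    """Build the instant-query URL for a PromQL expression."""
    return f"{base}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def parse_vector(payload):
    """Turn an instant-query JSON response into {instance: value}."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error')}")
    return {
        item["metric"].get("instance", ""): float(item["value"][1])
        for item in payload["data"]["result"]
    }


def run_query(expr, base="http://127.0.0.1:9090", timeout=5.0):
    """Send the query to Prometheus and return the parsed result."""
    with urllib.request.urlopen(query_url(expr, base), timeout=timeout) as resp:
        return parse_vector(json.load(resp))
```

For example, run_query("node_load1") would return the current 1-minute load for every scraped host, which is handy for scripts and quick checks before building a panel.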
Nginx reverse proxy in front of Grafana
server {
    listen 443 ssl http2;
    server_name monitoring.example.com;
    ssl_certificate /etc/letsencrypt/live/monitoring.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/monitoring.example.com/privkey.pem;
    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}