Monitoring
Prometheus is accessible at prom.k-space.ee and the corresponding AlertManager is accessible at am.k-space.ee. Both are deployed by ArgoCD from this Git repo directory using the Prometheus operator.
Alerts are sent to the #kube-prod Slack channel.
Sample queries:
- SSD/HDD temperatures
- HDD power-on hours (one year is 8760 hours)
- CPU/NB temperatures
- Disk space left
- Minio s3 egress, internode egress, storage used
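Queries like the ones listed above can also be run from a shell through the Prometheus HTTP API. The sketch below is only an illustration: it assumes prom.k-space.ee is reachable from your machine without extra authentication, and the `prom_query` helper name is hypothetical and not part of this repo.

```shell
# Hypothetical helper: run an instant PromQL query via the Prometheus HTTP API.
# Assumes https://prom.k-space.ee is reachable without additional authentication.
prom_query() {
  # -G sends the url-encoded data as a query string on a GET request.
  curl -sS -G "https://prom.k-space.ee/api/v1/query" \
    --data-urlencode "query=$1"
}
```

For example, assuming node-exporter exposes its standard filesystem metrics, `prom_query 'node_filesystem_avail_bytes'` should return a JSON response with the free bytes per filesystem.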
To reconfigure SNMP targets etc., recreate the snmp-exporter ConfigMap from snmp-configs.yaml:
kubectl delete -n monitoring configmap snmp-exporter
kubectl create -n monitoring configmap snmp-exporter --from-file=snmp.yml=snmp-configs.yaml
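As an alternative to delete followed by create, the ConfigMap can be rendered locally and applied in one step, so it is never missing from the cluster in between. This is a sketch rather than the procedure this repo prescribes; the `reload_snmp_config` function name is hypothetical.

```shell
# Hypothetical helper: update the snmp-exporter ConfigMap in place.
# --dry-run=client only renders the manifest locally, without touching the cluster;
# kubectl apply then creates the ConfigMap or updates the existing one.
reload_snmp_config() {
  kubectl create -n monitoring configmap snmp-exporter \
    --from-file=snmp.yml=snmp-configs.yaml \
    --dry-run=client -o yaml | kubectl apply -f -
}
```

Run `reload_snmp_config` from this directory so that snmp-configs.yaml resolves. The first apply over a ConfigMap that was made with `kubectl create` may print a warning about a missing last-applied-configuration annotation.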
To set Slack secrets:
kubectl create -n monitoring secret generic slack-secrets \
--from-literal=webhook-url=https://hooks.slack.com/services/...
To set Mikrotik secrets:
kubectl create -n monitoring secret generic mikrotik-exporter \
--from-literal=MIKROTIK_PASSWORD='f7W!H*Pu' \
--from-literal=PROMETHEUS_BEARER_TOKEN="$(base64 < /dev/urandom | head -c 30)"
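If you want to check the bearer token before it goes into the secret, it can be generated into a variable first. This is a minimal sketch; the variable name `token` is only illustrative.

```shell
# Read 64 random bytes, base64-encode them, strip line wrapping
# and keep the first 30 characters.
token="$(head -c 64 /dev/urandom | base64 | tr -d '\n' | head -c 30)"
# Sanity check: print the token length, which should be 30.
echo "${#token}"
```

The value can then be passed as `--from-literal=PROMETHEUS_BEARER_TOKEN="$token"` in the command above.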