
Enterprise Monitoring Stack: Prometheus + Grafana + Auto-Recovery for Pi Node Docker#25

Open
Clawue884 wants to merge 17 commits into PiCoreTeam:master from Clawue884:Clawue884-patch-9

Conversation

@Clawue884

This PR introduces a full enterprise-grade monitoring, alerting, and recovery stack for Pi Node Docker, building on the foundation established in PR #20.

🚀 Key Features

  • Production-ready Prometheus + Grafana monitoring stack
  • Enterprise dashboard provisioning (grafana-dashboard.json)
  • Automated healthcheck + auto-recovery orchestration
  • Hardened production compose (docker-compose.production.yml)
  • One-command installer (setup-monitoring.sh)
  • Structured documentation (MONITORING.md)

🏗️ Architecture

  • Node Exporter → System metrics
  • Prometheus → Metrics scraping & storage
  • Grafana → Dashboards & visibility
  • Healthcheck → Failure detection
  • Alert Manager → Recovery & escalation logic
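
The hand-off from failure detection to recovery and escalation can be sketched as a small state machine. This is only an illustration in Python, not the PR's implementation: the `RecoveryPolicy` class, the `Action` names, and both thresholds are hypothetical and do not come from the PR's scripts.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"          # keep observing
    RESTART = "restart"    # try an automated restart
    ESCALATE = "escalate"  # stop restarting and notify an operator


@dataclass
class RecoveryPolicy:
    # Both thresholds are assumptions chosen for illustration.
    restart_after: int = 3  # consecutive failed checks before a restart
    max_restarts: int = 2   # restarts allowed before escalating
    failures: int = 0
    restarts: int = 0

    def observe(self, healthy: bool) -> Action:
        """Record one healthcheck result and return the action to take."""
        if healthy:
            # A healthy check clears all accumulated state.
            self.failures = 0
            self.restarts = 0
            return Action.NONE
        self.failures += 1
        if self.failures < self.restart_after:
            return Action.NONE
        # Threshold reached: start a new failure window.
        self.failures = 0
        if self.restarts >= self.max_restarts:
            return Action.ESCALATE
        self.restarts += 1
        return Action.RESTART
```

Counting consecutive failures before acting avoids restarting a service on a single slow response, and capping the number of restarts keeps a persistently broken node from looping forever without an operator being told.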

🎯 Benefits

  • High availability & fast failure detection
  • Automated service recovery
  • Operator-grade observability
  • Mainnet / withdrawal-ready infrastructure posture

📎 Related


Maintainer: @Clawue884
Enterprise Monitoring Initiative for Pi Node Docker

Implement an auto-recovery script that checks node health and restarts services if unhealthy.
Add a script to check the status of the Horizon and stellar-core services.
This script checks the health of the Horizon and stellar-core services by making HTTP requests and reporting their status.
Add health check script and configure Docker healthcheck.
This script continuously checks the health of services and restarts them if they are not healthy.
Implement a health check script to monitor services and disk space.
This script sets up the Enterprise Monitoring Stack for a Pi Node, including Prometheus configuration and starting the Docker services.
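
The health checks described in the commits above (HTTP probes for Horizon and stellar-core, plus a disk-space check) could look roughly like the following. This is a minimal Python sketch, not the PR's scripts, which appear to be shell. The endpoint URLs, the 10% free-space threshold, and the helper names `http_ok`, `disk_ok`, and `run_checks` are all assumptions made for illustration.

```python
import http.client
import shutil
import urllib.error
import urllib.request
from typing import Callable, Dict, Mapping

# Assumed endpoints; adjust to the ports the node actually exposes.
SERVICES: Dict[str, str] = {
    "horizon": "http://localhost:8000/",
    "stellar-core": "http://localhost:11626/info",
}


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


def disk_ok(path: str = "/", min_free_fraction: float = 0.10) -> bool:
    """Return True if at least min_free_fraction of the filesystem is free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return False
    return usage.free / usage.total >= min_free_fraction


def run_checks(
    probe: Callable[[str], bool] = http_ok,
    services: Mapping[str, str] = SERVICES,
) -> Dict[str, bool]:
    """Probe every service and the disk; map each check name to its result."""
    report = {name: probe(url) for name, url in services.items()}
    report["disk"] = disk_ok()
    return report
```

Calling `run_checks()` returns a dictionary such as `{"horizon": ..., "stellar-core": ..., "disk": ...}`. A recovery loop could call it on an interval and restart the container behind any entry that reports `False`, for example with `docker restart`. The `probe` parameter is injectable so the logic can be exercised without a running node.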
