
Enterprise Health Monitoring & Auto-Recovery for Pi Node Docker #23

Open
Clawue884 wants to merge 17 commits into PiCoreTeam:master from Clawue884:Clawue884-patch-2


@Clawue884

This PR introduces an enterprise-grade health monitoring and auto-recovery system for Pi Node Docker.

Key improvements:

  • Add healthcheck.sh for continuous service and disk monitoring
  • Add alert_manager.sh for alerting & recovery orchestration
  • Integrate auto-restart / recovery logic via supervisord
  • Provide docker-compose.monitoring.yml for production monitoring setup
  • Enable extensible monitoring for future Grafana / Prometheus integration

This improves node reliability, uptime, and operational resilience in production environments.

Implement an auto-recovery script that checks node health and restarts services if unhealthy.
Add a script to check the status of the Horizon and stellar-core services.
This script checks the health of the Horizon and stellar-core services by making HTTP requests and reporting their status.
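
As a rough illustration of the HTTP probing described above, the check could look something like the sketch below. It is not the script from this PR. The endpoint URLs are assumptions (the usual Horizon port 8000 and the stellar-core `/info` endpoint on port 11626), and the function names `classify_http_status` and `check_service` are hypothetical.

```shell
# Assumed endpoints; adjust to the ports your container actually exposes.
HORIZON_URL="${HORIZON_URL:-http://localhost:8000/}"
CORE_URL="${CORE_URL:-http://localhost:11626/info}"

# Map an HTTP status code to a verdict. curl reports 000 when no
# response was received (connection refused, timeout, DNS failure).
classify_http_status() {
    case "$1" in
        2[0-9][0-9]) echo "healthy" ;;
        *)           echo "unhealthy" ;;
    esac
}

# Probe one service and print "<name>: <verdict> (HTTP <code>)".
# Returns 0 when the service is healthy, 1 otherwise.
check_service() {
    name="$1"
    url="$2"
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url" 2>/dev/null) || code="000"
    verdict=$(classify_http_status "$code")
    echo "$name: $verdict (HTTP $code)"
    [ "$verdict" = "healthy" ]
}

# Example usage (performs real HTTP requests):
#   check_service horizon "$HORIZON_URL"
#   check_service stellar-core "$CORE_URL"
```

Keeping the status classification separate from the `curl` call makes the verdict logic easy to exercise without a running node.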
Add health check script and configure Docker healthcheck.
This script continuously checks the health of services and restarts them if they are not healthy.
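
A minimal sketch of the check-and-restart loop described above, assuming the services run under supervisord and can be restarted with `supervisorctl restart <program>`. The program names, the `MAX_FAILURES` threshold, and the helpers `watch_once` and `watch_forever` are illustrative assumptions, not taken from the PR's scripts.

```shell
# Consecutive failures tolerated before a restart (assumed default).
MAX_FAILURES="${MAX_FAILURES:-3}"
# Restart command; overridable so the logic can be dry-run.
RESTART_CMD="${RESTART_CMD:-supervisorctl restart}"
failures=0

# Run one probe and restart the program after MAX_FAILURES consecutive
# failures. $1 = supervisord program name, remaining args = probe command.
# The single counter means one watcher process tracks one service.
watch_once() {
    program="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        failures=0
        return 0
    fi
    failures=$((failures + 1))
    if [ "$failures" -ge "$MAX_FAILURES" ]; then
        echo "restarting $program after $failures failed checks"
        $RESTART_CMD "$program"
        failures=0
    fi
    return 1
}

# Poll forever. $1 = interval in seconds, remaining args as for watch_once.
watch_forever() {
    interval="$1"
    shift
    while true; do
        watch_once "$@" || true
        sleep "$interval"
    done
}

# Example usage (hypothetical program name and probe):
#   watch_forever 30 horizon curl -sf --max-time 5 http://localhost:8000/
```

Requiring several consecutive failures before restarting avoids bouncing a service on a single slow response, at the cost of a slightly longer detection time.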
Implement a health check script to monitor services and disk space.
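
For the disk-space part, one possible shape is sketched below. It assumes usage is read from the Capacity column of `df -P`. The 90% default threshold and the function names are hypothetical, not values from the PR.

```shell
# Usage percentage above which the disk is reported unhealthy (assumed).
DISK_THRESHOLD="${DISK_THRESHOLD:-90}"

# Print the used percentage (digits only) of the filesystem holding $1.
# df -P guarantees one line per filesystem; field 5 is the Capacity column.
disk_usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when usage ($1) is at or above the threshold ($2).
disk_over_threshold() {
    [ "$1" -ge "$2" ]
}

# Report disk health for a path. Returns 1 when over the threshold.
check_disk() {
    path="$1"
    used=$(disk_usage_percent "$path")
    if disk_over_threshold "$used" "$DISK_THRESHOLD"; then
        echo "disk: unhealthy ($path at ${used}%, threshold ${DISK_THRESHOLD}%)"
        return 1
    fi
    echo "disk: healthy ($path at ${used}%)"
}

# Example usage:
#   check_disk /
```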
Add a metrics server script to serve metrics over HTTP.
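
The sketch below covers only the payload and HTTP framing such a metrics endpoint might return, not the listener itself. It assumes the Prometheus text exposition format, in line with the Prometheus integration mentioned in the PR description. The metric names (`pi_node_service_up`, `pi_node_disk_used_percent`) and all function names are hypothetical.

```shell
# Emit one sample in Prometheus text exposition format.
# $1 = metric name, $2 = label set (may be empty), $3 = value
emit_gauge() {
    if [ -n "$2" ]; then
        printf '%s{%s} %s\n' "$1" "$2" "$3"
    else
        printf '%s %s\n' "$1" "$3"
    fi
}

# Render all metrics. $1 = Horizon up (1/0), $2 = stellar-core up (1/0),
# $3 = disk usage in percent.
render_metrics() {
    echo '# TYPE pi_node_service_up gauge'
    emit_gauge pi_node_service_up 'service="horizon"' "$1"
    emit_gauge pi_node_service_up 'service="stellar-core"' "$2"
    echo '# TYPE pi_node_disk_used_percent gauge'
    emit_gauge pi_node_disk_used_percent '' "$3"
}

# Wrap a body in a minimal HTTP/1.1 response. The body is ASCII, so its
# character count equals its byte count; +1 covers the trailing newline.
http_response() {
    body="$1"
    printf 'HTTP/1.1 200 OK\r\nContent-Type: text/plain; version=0.0.4\r\nContent-Length: %s\r\nConnection: close\r\n\r\n%s\n' \
        "$(( ${#body} + 1 ))" "$body"
}

# Example usage:
#   http_response "$(render_metrics 1 1 42)"
```

The response text can then be handed to whatever listener the container provides (for example `socat` or `nc`); that wiring is left out here because the available tools differ between images.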