This repository implements a Distributed PostgreSQL Database architecture designed for High Availability (HA) and resilience, fully orchestrated within Docker.
The core objective of this project is to validate how a distributed system handles critical failures. Instead of relying on theoretical setups, I focused on Resilience Testing through:
- Manual Failure Simulation: Forcibly stopping Docker containers (Master/Slave nodes) to trigger real-time Failover and Failback events via repmgr.
- Traffic Stability Analysis: Running concurrent stress tests during these outages to verify that Pgpool-II successfully reroutes queries and maintains service availability with minimal downtime.
- Observability: Utilizing a full monitoring stack (Prometheus, Grafana, Postgres Exporter) to visually track replication lag and cluster health during the chaos.
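
The failure simulation and traffic checks listed above could be scripted roughly as follows. This is a minimal sketch, not the test harness used in this repository: it assumes `psql` is installed on the host, that Pgpool-II is published on `localhost:5432`, and that credentials are supplied through the environment (for example `PGPASSWORD`). The user, database, port, and the container name `pg-primary` are all hypothetical; replace them with the values from `docker-compose.yml`.

```shell
# Hypothetical probe: one trivial query sent through Pgpool-II.
# Host, port, user, and database are assumptions, not values from this repo.
PROBE_CMD="${PROBE_CMD:-psql -h localhost -p 5432 -U postgres -d postgres -tAc 'SELECT 1'}"

# Run the probe N times, pausing INTERVAL seconds between attempts.
# Prints one line per attempt: "<epoch-seconds> ok" or "<epoch-seconds> fail".
probe_loop() {
    attempts="$1"
    interval="${2:-1}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if sh -c "$PROBE_CMD" >/dev/null 2>&1; then
            echo "$(date +%s) ok"
        else
            echo "$(date +%s) fail"
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}

# Read probe_loop output on stdin and print the longest run of
# consecutive failed attempts (0 if every attempt succeeded).
longest_outage() {
    awk '$2 == "fail" { run++; if (run > max) max = run }
         $2 == "ok"   { run = 0 }
         END          { print max + 0 }'
}

# Example session (container name is hypothetical):
#   probe_loop 60 > probe.log &
#   docker stop pg-primary
#   wait
#   longest_outage < probe.log
```

With a one-second interval, the number printed by `longest_outage` approximates the downtime in seconds as seen by a client connected through Pgpool-II.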

To run the stack and set up the Grafana dashboard:
- Run the containers: `docker compose up -d`
- Open Grafana at `http://localhost:3000` and log in with the password set in `docker-compose.yml`.
- From the dashboard, open the menu -> Connections -> Data Sources.
- Set Connection to `http://prometheus:9000` and click Save & Test.
- Back on the dashboard, click the New button -> Import.
- In the "Grafana dashboard URL or ID" field, enter `9628` (PostgreSQL).
- Click Load, and the dashboard is ready to use.
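
The data source step above can also be done without the UI, through Grafana's HTTP API (`POST /api/datasources`). The sketch below is an optional alternative, not part of this repository's setup. The function names, the `GRAFANA_URL`, `GRAFANA_USER`, and `GRAFANA_PASSWORD` variables, and the `admin` default user are assumptions; use the credentials defined in `docker-compose.yml`.

```shell
# Print the JSON body for Grafana's "create data source" endpoint.
# $1 = data source name, $2 = Prometheus URL as seen from the Grafana container.
datasource_payload() {
    printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy"}\n' "$1" "$2"
}

# POST the payload to Grafana using basic auth.
# GRAFANA_PASSWORD must be set; the other variables fall back to assumed defaults.
add_datasource() {
    datasource_payload "$1" "$2" |
        curl -sS -u "${GRAFANA_USER:-admin}:${GRAFANA_PASSWORD:?set GRAFANA_PASSWORD}" \
            -H 'Content-Type: application/json' \
            -X POST --data @- \
            "${GRAFANA_URL:-http://localhost:3000}/api/datasources"
}

# Example:
#   GRAFANA_PASSWORD='...' add_datasource Prometheus http://prometheus:9000
```

Importing dashboard 9628 is still done through the Import screen as described in the steps above.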