One stack to watch them all.
Metrics, logs, dashboards, and Slack alerts—on your own servers.
A complete on-premises monitoring and observability stack. Collect metrics from your servers, visualize them in dashboards, store and search logs, and get alerts in Slack when something goes wrong—no cloud lock-in.
| If you want to… | Jump to… |
|---|---|
| Get the stack running in a few minutes | 🚀 Quick Start |
| See what’s inside the stack | 📦 What’s Included |
| Understand how it all connects | 🏗️ Architecture |
| Add more servers to monitor | 🖥️ Remote Server Monitoring |
| Send app logs to Grafana | 📋 Log Aggregation |
- 📦 What's Included
- 🏗️ Architecture Overview
- ✅ Prerequisites
- 🚀 Quick Start
- ⚙️ Configuration
- 🖥️ Remote Server Monitoring (Node Exporter)
- 📋 Log Aggregation (Fluent Bit → Loki)
- 🔗 Accessing the Stack
- 📊 Dashboards & Alerts
- 🧹 Maintenance & Cleanup
- 📋 Summary Checklist
## 📦 What's Included
| Component | What it does | Think of it as… |
|---|---|---|
| Prometheus | Collects and stores metrics from your servers | The “metrics database” |
| Grafana | Beautiful dashboards and log search | The “control panel” you open in a browser |
| Loki | Stores and indexes logs from your apps | “Prometheus, but for logs” |
| Alertmanager | Sends alerts to Slack (or other channels) | The “notification router” |
| Node Exporter | Exposes CPU, memory, disk per host | The “server health sensor” |
| cAdvisor | Exposes container (Docker) metrics | The “container health sensor” |
| MongoDB Exporter | Exposes MongoDB metrics | Optional “MongoDB health sensor” |
| Nginx | Proxies Grafana, Prometheus, Loki, Alertmanager | The “front door” with friendly URLs |
| Fluent Bit (on remote hosts) | Sends app logs to Loki | The “log shipper” on each app server |
## 🏗️ Architecture Overview

In one sentence: Remote servers send metrics (via Node Exporter) and logs (via Fluent Bit) to your monitoring server; Prometheus and Loki store them; Grafana shows dashboards; Alertmanager sends alerts to Slack.
```mermaid
flowchart LR
    subgraph Remote["🖥️ Remote servers"]
        NE[Node Exporter]
        FB[Fluent Bit]
    end
    subgraph Monitor["🖥️ Your monitoring server"]
        P[Prometheus]
        L[Loki]
        A[Alertmanager]
        G[Grafana]
        N[Nginx]
    end
    SLACK[Slack 💬]
    NE -->|metrics :9100| P
    FB -->|logs :3100| L
    P -->|alerts| A
    A --> SLACK
    P --> G
    L --> G
    N --> G
    N --> P
```
## ✅ Prerequisites

Before you start, make sure you have:
| Need | What to have |
|---|---|
| 🐳 Docker | Docker and Docker Compose on the machine that will run the stack |
| 📢 Slack alerts (optional) | Slack Incoming Webhooks for the channels you want alerts in |
| 🔑 SSH | Access to any remote servers where you’ll install Node Exporter |
| 🍃 MongoDB (optional) | A MongoDB connection URI if you use the MongoDB Exporter |
| 🌐 Domains (optional) | DNS for your Grafana/Prometheus URLs (e.g. grafana.yourdomain.com) pointing to this server |
## 🚀 Quick Start

Goal: Run the full stack with one command, then open Grafana.
```bash
git clone <this-repo-url> devops-monitoring
cd devops-monitoring
```

(Or download and extract the repo as a folder.)
- Open `prometheus/prometheus.yml` and replace every `<on-prem-n8n-ip>`, `<on-prem-api-ip>`, etc., with the real IP or hostname of that server (or use `localhost` for a single-machine test).
- (Optional) For Slack: create files `alertmanager/slack_url_project0`, `alertmanager/slack_url_project1`, … with your webhook URLs (see Alertmanager – Slack).
- (Optional) For MongoDB metrics: set `MONGODB_URI` in `docker-compose.yml` for the `mongo-exporter` service.
```bash
docker compose up -d
```

Then open Grafana:

- URL: http://localhost:3000 (or your server IP and port 3000).
- Login: use the credentials in `grafana/config.monitoring` (keep this file secret in production).
You should see pre-loaded dashboards (e.g. Node Exporter). Prometheus is already configured as a data source.
💡 Tip: If you only have this one server, the built-in node-exporter container already exposes metrics; you’ll see data in the Node Exporter dashboard even before adding remote hosts.
## ⚙️ Configuration

### Prometheus scrape targets

File: `prometheus/prometheus.yml`
Replace each placeholder with the IP or hostname of the server that runs Node Exporter on port 9100:
| Placeholder | Server |
|---|---|
| `<on-prem-n8n-ip>` | N8N server |
| `<on-prem-api-ip>` | On-prem API |
| `<on-prem-db-ip>` | On-prem DB |
| `<on-prem-admin-ip>` | On-prem Admin |
| `<remote-api-ip>` | Remote API |
| `<remote-db-ip>` | Remote DB |
| `<remote-admin-ip>` | Remote Admin |
⚠️ Each of these hosts must have Node Exporter running and port 9100 open (see Remote Server Monitoring).
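As a sketch, a filled-in job in `prometheus/prometheus.yml` might look like the following (the IPs, job name, and `project` label value here are illustrative — keep the job names and labels already present in the repo's file):

```yaml
scrape_configs:
  - job_name: 'node-exporter'
    static_configs:
      # Replace with the real IPs/hostnames of your servers
      - targets:
          - '10.0.0.11:9100'   # e.g. N8N server
          - '10.0.0.12:9100'   # e.g. on-prem API
        labels:
          project: project0    # should match the labels Alertmanager routes on
```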
### MongoDB Exporter

File: `docker-compose.yml` → `mongo-exporter` service
If you use the MongoDB Exporter, set your connection string:
```yaml
mongo-exporter:
  environment:
    - MONGODB_URI=mongodb://user:password@host:27017
```

If the exporter runs on another host/port, update the target in `prometheus/prometheus.yml` under `job_name: 'mongodb-exporter'`.
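For reference, that Prometheus job might look like this sketch (the target here assumes the exporter runs as the `mongo-exporter` compose service on its common default port 9216 — adjust both to your setup):

```yaml
scrape_configs:
  - job_name: 'mongodb-exporter'
    static_configs:
      - targets: ['mongo-exporter:9216']
```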
### Alertmanager – Slack

Alerts are routed by project (e.g. `project0`, `project1`). Each project uses a Slack webhook stored in a file.
- Create a Slack Incoming Webhook for each channel.
- On the monitoring server, create one file per project (only the URL inside, no extra text):
  - `alertmanager/slack_url_project0`
  - `alertmanager/slack_url_project1`
  - … up to `slack_url_project4`
- Put the webhook URL in the file, e.g.:

  ```
  https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX
  ```
Route labels in `alertmanager/config.yml` (e.g. `project0`, `project1`) must match the labels you set in Prometheus.
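As a rough sketch of how such routing can be wired up (the repo's actual `config.yml` may differ; this assumes the webhook files are mounted at `/etc/alertmanager/` inside the container, and an Alertmanager version recent enough to support `api_url_file`):

```yaml
route:
  receiver: default
  routes:
    - match:
        project: project0        # label set on the alert in Prometheus
      receiver: slack-project0

receivers:
  - name: default
  - name: slack-project0
    slack_configs:
      # Reads the webhook URL from the per-project file
      - api_url_file: /etc/alertmanager/slack_url_project0
        channel: '#alerts-project0'
        send_resolved: true
```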
### Nginx

File: `nginx/nginx.conf`
To use your own domains (e.g. grafana.yourdomain.com):
- Replace `grafana.ujwal5ghare.xyz`, `prom.ujwal5ghare.xyz`, etc., with your domains.
- Point DNS for those domains to this server's IP.
- For HTTPS, add SSL config and certificates (not included in the sample).
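For orientation, a server block for Grafana typically looks something like this sketch (the repo's actual config may differ; it assumes Nginx runs in the same Docker Compose network, where the hostname `grafana` resolves to the Grafana container):

```nginx
server {
    listen 80;
    server_name grafana.yourdomain.com;

    location / {
        proxy_pass http://grafana:3000;
        proxy_set_header Host $host;
        # Grafana Live uses WebSockets; forward the upgrade headers
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```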
## 🖥️ Remote Server Monitoring (Node Exporter)

To monitor other servers (API, DB, admin, N8N, etc.), install Node Exporter on each host so Prometheus can scrape metrics on port 9100.
- Copy to the host:
  - `ssh-file/node_exp_install.sh`
  - `ssh-file/node_exporter.service` (if the script expects it)
- SSH in and run (this downloads Node Exporter, installs it, and sets up a systemd service):

  ```bash
  chmod +x node_exp_install.sh
  ./node_exp_install.sh
  ```
- Open port 9100 on that host (firewall / security group).
- Add that host's IP to `prometheus/prometheus.yml` in the right job (see Prometheus scrape targets).
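For reference, a minimal systemd unit of the kind the install script sets up might look like this sketch (the script's actual unit may differ; it assumes a dedicated `node_exporter` user and the binary at `/usr/local/bin/node_exporter`):

```ini
[Unit]
Description=Prometheus Node Exporter
After=network.target

[Service]
User=node_exporter
ExecStart=/usr/local/bin/node_exporter
Restart=on-failure

[Install]
WantedBy=multi-user.target
```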
### Uninstall

On the remote host:

```bash
sudo bash node_exp_uninstall.sh
```

## 📋 Log Aggregation (Fluent Bit → Loki)

Send application logs (e.g. from PM2) to Loki so you can search them in Grafana.
- Monitoring server: Loki must be running and reachable (port 3100 or via Nginx).
- App server: install Fluent Bit and use `ssh-file/fluentbit_setup.sh` (with `ssh-file/fluent-bit.conf` as a reference).
When you run `fluentbit_setup.sh`, it asks for:
| Input | Example | Purpose |
|---|---|---|
| Log filename | `index`, `remote-admin` | PM2 log: `~/.pm2/logs/<name>-out.log` and `-error.log` |
| Environment | `staging`, `prod` | Loki label `env` |
| Service name | `dilicut-admin`, `remote` | Loki label `service` |
Before running:
- Set the Loki host and port in the script/config to your monitoring server (e.g. IP and `3100`).
- Ensure the app server can reach that host/port (firewall/security group).
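As a rough sketch of the kind of config the setup script generates (paths, tag, and label values here are illustrative — see `ssh-file/fluent-bit.conf` for the real template), using Fluent Bit's built-in `tail` input and `loki` output:

```ini
[INPUT]
    Name    tail
    Path    /home/ubuntu/.pm2/logs/index-out.log,/home/ubuntu/.pm2/logs/index-error.log
    Tag     pm2.*

[OUTPUT]
    Name    loki
    Match   *
    Host    <monitoring-server-ip>
    Port    3100
    # These become the Loki labels you filter on in Grafana
    Labels  env=staging, service=dilicut-admin
```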
After setup, query logs in Grafana via Explore and the Loki data source.
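In Explore, logs shipped this way can then be filtered by the labels set above, e.g. (label values are the ones you entered during setup):

```logql
{service="dilicut-admin", env="staging"} |= "error"
```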
## 🔗 Accessing the Stack

| Service | With Nginx (custom domains) | Direct (no Nginx) |
|---|---|---|
| Grafana | http://grafana.yourdomain.com | `http://<server-ip>:3000` |
| Prometheus | http://prom.yourdomain.com | `http://<server-ip>:9090` |
| Loki | http://loki.yourdomain.com | `http://<server-ip>:3100` |
| Alertmanager | http://alerts.yourdomain.com | `http://<server-ip>:9093` |
## 📊 Dashboards & Alerts

Pre-provisioned dashboards (from `grafana/provisioning/dashboards/`):
| Dashboard | What it shows |
|---|---|
| Node Exporter Full | CPU, memory, disk, network per host |
| K8s-* (Pods, Nodes, Cluster, API Server, Global) | Kubernetes-oriented views |
| MongoDB Profiler | MongoDB metrics (when MongoDB Exporter is used) |
| Athena | For Athena data source (if configured) |
The Prometheus data source is provisioned via `grafana/provisioning/datasources/datasource.yml`.
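For reference, a Grafana datasource provisioning file of that shape might look like this sketch (the repo's actual file may differ; the `url` values assume the containers share a Compose network):

```yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
```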
### Alerts

Rules live in `prometheus/alertmanager.yml`. Examples:
| Alert | When it fires |
|---|---|
| InstanceDown | Target unreachable for 5 minutes |
| HighCPUUsage | CPU > 90% for 5 minutes |
| HighMemoryUsage | Memory > 90% for 5 minutes |
| HighSwapUsage | Swap > 95% for 5 minutes |
| HighDiskIOUsage | Disk I/O > 85% for 5 minutes |
| DiskSpaceRunningOut | Root filesystem > 95% for 5 minutes |
Alertmanager routes them to Slack by project (see Alertmanager – Slack).
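As a sketch, an `InstanceDown` rule in a Prometheus rules file typically looks like this (the group name and `project` label value are illustrative — the `project` label is what Alertmanager routes on):

```yaml
groups:
  - name: node-alerts
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
          project: project0
        annotations:
          summary: "{{ $labels.instance }} has been unreachable for 5 minutes"
```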
## 🧹 Maintenance & Cleanup

| Action | Command |
|---|---|
| Restart stack | `docker compose restart` |
| Stop stack | `docker compose down` (add `-v` only if you want to delete volume data) |
| View logs | `docker compose logs -f` or `docker compose logs -f prometheus` |
| Uninstall Node Exporter (on remote host) | `sudo bash ssh-file/node_exp_uninstall.sh` |
| Uninstall Prometheus (on a host where you ran it manually) | Use `ssh-file/prometheus_uninstall.sh` if available |
## 📋 Summary Checklist

Use this before you say "done":
- Docker and Docker Compose installed
- All `<...>` placeholders in `prometheus/prometheus.yml` replaced with real IPs/hostnames
- Node Exporter installed on every host you want to monitor (port 9100 open)
- Slack webhook files under `alertmanager/` if you use Slack alerts
- `MONGODB_URI` set in `docker-compose.yml` if using MongoDB Exporter
- Nginx server names updated in `nginx/nginx.conf` if using custom domains
- `docker compose up -d` run and all containers healthy
- Grafana login works; Prometheus (and Loki) data sources available
- (Optional) Fluent Bit on app servers sending logs to Loki
- Prometheus · Grafana · Loki · Alertmanager
