A step-by-step enablement guide for adding Grafana Loki centralized log aggregation to an existing Mattermost deployment that already uses Prometheus and Grafana for performance monitoring.
- Mattermost TAMs delivering services engagements
- Mattermost Channel Partners supporting customer deployments
- IT Ops / SysAdmin teams operating Mattermost in production
- Mattermost deployed on EC2 / bare-metal Linux servers
- Mattermost JSON file logging enabled (the default on all plans)
- Prometheus and Grafana already deployed per the Mattermost performance monitoring guide
- SSH access with sudo privileges
    source/
        deploy-loki-log-aggregation.rst    # The main guide (RST, Mattermost docs style)
    config/
        loki-config.yaml                   # Production-ready Loki config (14-day retention)
        otel-collector-config.yaml         # OpenTelemetry Collector config for logs
    dashboards/
        mattermost-loki-logs.json          # Grafana dashboard (import via UI)
- Read `source/deploy-loki-log-aggregation.rst`; it walks through everything step by step.
- Copy the config files below to the appropriate servers and edit the placeholders.
- Import the Grafana dashboard JSON.
Mattermost Servers Monitoring Server
┌──────────────────┐ ┌──────────────────┐
│ App + OTel Col │──push──│ Loki (:3100) │
│ App + OTel Col │──push──│ Prometheus (:9090)│
│ (optional) │ │ Grafana (:3000) │
│ DB + OTel Col │──push──│ │
└──────────────────┘ └──────────────────┘
Log retention defaults to 14 days. See the retention_period comments in the Loki config for instructions on adjusting this (30 days, 90 days, 1 year, etc.) and estimating disk usage.
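Since the shipped config expresses `retention_period` in hours, the conversion from a retention window in days is simple arithmetic. The helper below is a minimal sketch of that conversion; the name `days_to_hours` is hypothetical and not part of the config files.

```shell
#!/bin/sh
# Hypothetical helper: convert a retention window in days to the
# hour-based string used for retention_period (e.g. 14 -> 336h).
days_to_hours() {
    echo "$(( $1 * 24 ))h"
}

days_to_hours 14     # 336h  (the default)
days_to_hours 30     # 720h
days_to_hours 90     # 2160h
days_to_hours 365    # 8760h
```

The comments in `loki-config.yaml` remain the reference for the matching disk usage estimates.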
Download each config file, edit the placeholders noted below, and copy to the target server. Full details are in the step-by-step guide.
Monitoring server: install to `/opt/loki/loki-config.yaml`
Key settings to be aware of:
- `retention_period`: Defaults to `336h` (14 days). Loki requires hours; see the comments in the file for 30d / 90d / 1yr values and disk usage estimates.
- `compactor.retention_enabled`: Must stay `true` or retention is not enforced.
- `storage.filesystem`: Stores chunks and indexes under `/opt/loki/data/`. Ensure sufficient disk space for your retention window.
Mattermost and PostgreSQL servers: install to `/etc/otelcol-contrib/config.yaml`
Placeholders to replace before starting:
| Placeholder | Replace with |
|---|---|
| `<LOKI_HOST>` | IP or hostname of the monitoring server |
| `<HOSTNAME>` | This server's hostname (e.g., `mm-app-01`) |
| `<SERVICE_NAME>` | The service type (e.g., `mattermost` or `postgres`) |
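The placeholder substitution can be scripted rather than done by hand. The following is a minimal sketch using `sed`; the function name `fill_placeholders` and the example values are hypothetical, and it assumes the replacement values contain no `|`, `&`, or backslash characters, which `sed` would treat specially.

```shell
#!/bin/sh
# Hypothetical helper: read a config template on stdin and write it to
# stdout with the three placeholders replaced.
# Usage: fill_placeholders LOKI_HOST HOSTNAME SERVICE_NAME < template > output
# Assumes the values contain no '|', '&', or backslash characters.
fill_placeholders() {
    sed -e "s|<LOKI_HOST>|$1|g" \
        -e "s|<HOSTNAME>|$2|g" \
        -e "s|<SERVICE_NAME>|$3|g"
}

# Example with made-up values; review the output before installing it:
# fill_placeholders 10.0.0.5 mm-app-01 mattermost \
#     < otel-collector-config.yaml > config.yaml
```

Writing to a separate output file keeps the downloaded template unchanged, so the same template can be reused for each server.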
Key settings to be aware of:
- `filelog` receivers: Tail `/opt/mattermost/logs/mattermost.log` and `/var/log/postgresql/*.json`.
- `otlphttp` exporter: Sends data to Loki's OTLP endpoint (Loki 3.0+).
mattermost-loki-logs.json (Grafana dashboard)
Import into Grafana: Dashboards > New > Import > Upload JSON file
| Panel | Description |
|---|---|
| Log Volume by Level | Stacked bar chart — log rate by severity (color-coded) |
| Error / Warning / Total counters | Stat panels for the selected time range |
| HTTP 4xx/5xx Responses | Count of application-level HTTP errors |
| Error Rate Over Time | Time series per Mattermost instance |
| Top Error Messages | Table ranking most frequent errors (5 min windows) |
| Log Browser | Searchable, filterable log viewer with template variables |
Template variables: service_name, service_instance_id, detected_severity, search (free-text)
- Check Log Level: Ensure your Mattermost server is actually generating logs. If the server is idle and Log Level is set to `ERROR`, the file will be empty.
  - Fix: Temporarily set File Log Level to `DEBUG` or `INFO` in the System Console to verify flow.
- Check Connectivity: Run `curl -v http://<LOKI_HOST>:3100/ready` from the app server to verify the firewall is open.
- Check Permissions: Ensure the `otelcol` user can read the log file (e.g., `sudo -u otelcol cat /opt/mattermost/logs/mattermost.log`).
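The log-level and permission checks above both come down to the state of the log file, which a short script can report in one step. This is a minimal sketch; the function name `log_file_status` is hypothetical, and the result reflects the user running it, so run it as the collector's user to test permissions.

```shell
#!/bin/sh
# Hypothetical helper: report the state of a log file as seen by the
# current user. Prints one of: missing, unreadable, empty, ok.
log_file_status() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ ! -r "$1" ]; then
        echo "unreadable"    # permissions problem for this user
    elif [ ! -s "$1" ]; then
        echo "empty"         # nothing logged yet; check the log level
    else
        echo "ok"
    fi
}

log_file_status /opt/mattermost/logs/mattermost.log
```

An `empty` result points to the log-level check, and `unreadable` points to the permissions check.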
Use this script to demonstrate the power of unified observability in Grafana (Metrics + Logs side-by-side).
- Open Explore:
  - Click the Compass icon (Explore) in the left sidebar.
  - Split the view (button in the top toolbar) so you have two panels.
- Query Metrics (Left Panel):
  - Data Source: Prometheus.
  - Scenario A (Throughput):
    - Metric: `sum(rate(mattermost_api_time_count[5m])) by (instance)`
    - Meaning: Total API requests per second per server.
  - Scenario B (Saturation):
    - Metric: `label_replace(100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100), "instance", "$1", "instance", "(.*):.*")`
    - Meaning: CPU usage % per server (port stripped for cleaner matching).
  - Note: Run only one scenario at a time. Explore uses a single Y-axis, so mixing "Requests/sec" (10s-100s) with "CPU %" (0-100) or "Latency" (0-1) will distort the graph.
  - Action: Click Run Query.
- Query Logs (Right Panel):
  - Data Source: Loki.
  - Filter: `{service_name="mattermost"} | json | detected_level="error"`
  - Action: Click Run Query.
- Correlate:
  - Click the Chain Icon (top toolbar) to sync time ranges.
  - Zoom in on a spike in the CPU or Request graph.
  - The Loki panel follows the same time range, so it shows only the logs from that high-load timeframe.
- Build Dashboard:
  - Click Add to Dashboard -> New Dashboard.
  - Lay out the graph and logs side-by-side for a permanent "Battle Station" view.
- Make it Dynamic (Bonus):
  - "Explore" queries are static. To get dropdowns (variables):
  - Step A: Go to Dashboard Settings (Gear Icon) -> Variables -> Add Variable.
  - Step B: Configure the variable:
    - Name: `instance`
    - Label: `Instance`
    - Type: `Query`
    - Data Source: `Prometheus`
    - Query Type: `Label values`
    - Label: `instance` (select from dropdown)
    - Metric: (Optional; select `node_cpu_seconds_total` if you want to filter)
    - (Note: You don't type a raw query here; just use the dropdowns.)
  - Step C: Make it multi-select:
    - Check Multi-value: `On`
    - Check Include All option: `On`
  - Step D: Update your panel queries:
    - Change: `sum(rate(mattermost_api_time_count[5m])) by (instance)`
    - To: `sum(rate(mattermost_api_time_count{instance=~"$instance"}[5m])) by (instance)`
    - (Notice the curly braces `{}` and the `=~` for regex matching.)
  - Step E: Click Apply. Now you have a professional dropdown!
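Before a demo, it can help to confirm from a shell that both data sources return data for the walkthrough queries. The sketch below rebuilds the two query strings and shows, in comments, how they might be sent to the Loki and Prometheus HTTP query APIs. The function names and the `MONITORING_HOST` variable are hypothetical; the ports are those shown in the architecture diagram.

```shell
#!/bin/sh
# Hypothetical helpers that rebuild the two walkthrough queries as strings.

# $1 = service_name label value, e.g. mattermost
logql_errors() {
    printf '{service_name="%s"} | json | detected_level="error"\n' "$1"
}

# $1 = regex for the instance label, e.g. ".*" for all servers
promql_api_rate() {
    printf 'sum(rate(mattermost_api_time_count{instance=~"%s"}[5m])) by (instance)\n' "$1"
}

logql_errors mattermost
promql_api_rate '.*'

# To send them to the HTTP APIs (MONITORING_HOST is an assumed variable):
# curl -G -s "http://$MONITORING_HOST:3100/loki/api/v1/query_range" \
#     --data-urlencode "query=$(logql_errors mattermost)" \
#     --data-urlencode 'limit=10'
# curl -G -s "http://$MONITORING_HOST:9090/api/v1/query" \
#     --data-urlencode "query=$(promql_api_rate '.*')"
```

Passing `.*` as the instance regex selects every server, the same thing the dashboard variable's All option is meant to do.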