Grafana provides monitoring dashboards and visualization for Charon infrastructure and services.
Purpose: Monitoring and visualization dashboards
Version: 8.5.0 (Helm chart)
Port: 3000 (application), 443 (nginx-tls)
Storage: Persistent volume for dashboards and datasources
Access: VPN-only via HTTPS
Authentication: LDAP via FreeIPA (optional)
- Dashboards - Custom dashboards for infrastructure monitoring (Kubernetes, Headscale, Open-WebUI, GPU)
- Data Sources - Prometheus, Loki, Tempo with traces-to-metrics/logs correlations
- Distributed Tracing - Tempo integration with OpenTelemetry traces from Open-WebUI
- Alerting - Alert rules and notifications
- LDAP Integration - User authentication via FreeIPA
- Git Sync - Automatic dashboard sync from Git repository
┌───────────────────────────────────────┐
│              Grafana Pod              │
├───────────────────────────────────────┤
│  ┌──────────┐     ┌────────────────┐  │
│  │  nginx-  │────▶│    Grafana     │  │
│  │   tls    │     │  (port 3000)   │  │
│  │  (443)   │     └────────────────┘  │
│  └──────────┘             │           │
│                           ▼           │
│                     ┌────────────┐    │
│                     │ Prometheus │    │
│                     │ Loki       │    │
│                     └────────────┘    │
│                                       │
│  ┌───────────┐                        │
│  │ Tailscale │  VPN connectivity      │
│  └───────────┘                        │
└───────────────────────────────────────┘
# terraform.tfvars
grafana_enabled = true
grafana_version = "8.5.0"
grafana_hostname = "grafana.example.com"
grafana_admin_password = "your-secure-password"
grafana_tailscale_enabled = true
# Git sync for dashboards (optional)
grafana_dashboards_git_enabled = true
grafana_dashboards_git_repo = "https://github.com/org/repo"
grafana_dashboards_git_branch = "main"
grafana_dashboards_git_token = "github-token"
grafana_dashboards_git_sync_interval = 60

# Resource requests and limits
grafana_cpu_request = "500m"
grafana_memory_request = "512Mi"
grafana_cpu_limit = "1"
grafana_memory_limit = "1Gi"

# Via VPN
open https://grafana.example.com
# Default credentials
# Username: admin
# Password: <grafana_admin_password>

- Connect to VPN
- Navigate to https://grafana.example.com
- Log in with admin credentials
- Change default password (recommended)
Grafana integrates with FreeIPA for user authentication:
# Configured automatically via terraform/grafana.tf
auth.ldap:
enabled: true
config_file: /etc/grafana/ldap.toml
ldap.toml:
host: freeipa.dev.svc.cluster.local
port: 636
use_ssl: true
bind_dn: uid=admin,cn=users,cn=accounts,dc=dev,dc=svc,dc=cluster,dc=local
search_base_dns: ["cn=users,cn=accounts,dc=dev,dc=svc,dc=cluster,dc=local"]
search_filter: (uid=%s)

FreeIPA users can log in with their LDAP credentials:
- Username: FreeIPA username
- Password: FreeIPA password
Users in the FreeIPA admins group are granted the Grafana Admin role.
Grafana dashboards are automatically provisioned from a Git repository using the git-sync sidecar pattern. This is the primary method for managing dashboards.
Required variables in terraform.tfvars:
grafana_dashboards_git_enabled = true
grafana_dashboards_git_repo = "https://github.com/your-org/grafana-dashboards"
grafana_dashboards_git_branch = "main"
grafana_dashboards_git_token = "your-github-token"
grafana_dashboards_git_sync_interval = 60 # seconds

- Git-sync sidecar container runs alongside Grafana
- Syncs the repository to /var/lib/grafana/dashboards every 60 seconds
- Grafana auto-discovers JSON dashboard files via provisioning
- Changes in Git automatically sync to Grafana
- Supports private repositories via GitHub Personal Access Token
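Because changes only arrive on the next sync cycle, a script that pushes a dashboard may need to wait before checking for it. The helper below is a minimal sketch of such a polling loop in plain shell. `wait_for` is a hypothetical name, not part of Grafana or git-sync, and the dashboard file name and 90-second timeout in the usage comment are placeholders.

```shell
# wait_for TIMEOUT INTERVAL COMMAND [ARGS...]
# Re-runs COMMAND every INTERVAL seconds until it succeeds or TIMEOUT
# seconds have elapsed. Returns 0 on success, 1 on timeout.
wait_for() {
  local timeout=$1 interval=$2
  shift 2
  local elapsed=0
  while ! "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 0
}

# Example (assumed file name): wait until a newly committed dashboard
# shows up in the synced directory, allowing slightly more than one
# 60-second sync interval.
# wait_for 90 5 kubectl exec -n monitoring grafana-0 -c grafana -- \
#   test -f /var/lib/grafana/dashboards/my-dashboard.json
```

The timeout is measured in sleep intervals rather than wall-clock time, so a slow command can make the total wait longer than TIMEOUT.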
Organize your dashboards in a Git repository:
grafana-dashboards/
├── infrastructure/
│   ├── kubernetes-cluster.json
│   ├── docker-containers.json
│   └── linux-host.json
├── applications/
│   ├── headscale-vpn.json
│   └── service-metrics.json
├── gpu/
│   └── nvidia-dcgm.json
└── README.md
Dashboards must be valid Grafana JSON format:
{
"id": null,
"title": "My Dashboard",
"tags": ["custom"],
"timezone": "browser",
"panels": [...],
"time": {"from": "now-1h", "to": "now"},
"refresh": "30s"
}

- Commit dashboard JSON to your Git repository
- Wait up to 60 seconds for git-sync to pull changes
- Refresh Grafana UI to see new dashboards
- Check git-sync logs if dashboards don't appear:
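kubectl logs -n monitoring grafana-0 -c git-sync -f

# Reset the admin password
kubectl exec -n dev grafana-0 -c grafana -- \
  grafana-cli admin reset-admin-password NewPassword123

# Install a plugin
kubectl exec -n dev grafana-0 -c grafana -- \
  grafana-cli plugins install <plugin-name>
# Restart Grafana
kubectl delete pod grafana-0 -n dev

# Export dashboards via API: list them, then fetch each one by UID
# (/api/dashboards/db only accepts POST requests that create or update a dashboard)
curl -u admin:password "https://grafana.example.com/api/search?type=dash-db"
curl -u admin:password "https://grafana.example.com/api/dashboards/uid/<dashboard-uid>"

Grafana automatically connects to the appropriate metrics backend based on your configuration.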
When Thanos is ENABLED (thanos_enabled = true):
- Data Source: Thanos Query
- URL: http://thanos-query.monitoring.svc.cluster.local:9090
- Type: Prometheus
- Access: Server (default)
- Benefits: Long-term retention (30/90/180 days), downsampling, global query view
When Thanos is DISABLED (thanos_enabled = false, current default):
- Data Source: Prometheus Server
- URL: http://prometheus-server.monitoring.svc.cluster.local
- Type: Prometheus
- Access: Server (default)
- Benefits: Simpler setup, lower resource usage, 15-day retention
IMPORTANT: Dashboard queries use PromQL syntax regardless of which backend is configured. Queries are identical whether using Prometheus or Thanos.
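The backend selection described above amounts to a two-way lookup on `thanos_enabled`. The sketch below restates it as a shell function, which can be handy in smoke-test scripts. `datasource_url` is a hypothetical helper written for illustration; the actual data source is wired up by Terraform, not by this function.

```shell
# datasource_url THANOS_ENABLED
# Prints the metrics data source URL this document lists for the given
# value of thanos_enabled ("true" or "false").
datasource_url() {
  case "$1" in
    true)  echo "http://thanos-query.monitoring.svc.cluster.local:9090" ;;
    false) echo "http://prometheus-server.monitoring.svc.cluster.local" ;;
    *)     echo "usage: datasource_url true|false" >&2; return 2 ;;
  esac
}

# Either URL accepts the same PromQL, for example:
# curl "$(datasource_url false)/api/v1/query?query=up"
```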
Configured automatically if loki_enabled = true:
- URL: http://loki.monitoring.svc.cluster.local:3100
- Type: Loki
- Access: Server (default)
- Storage: emptyDir (ephemeral, resets on pod restart)
Configured automatically if tempo_enabled = true:
- URL: http://tempo.monitoring.svc.cluster.local:3200
- Type: Tempo
- Access: Server (default)
- Features: Distributed tracing, traces-to-metrics, traces-to-logs correlations
- Integration: Receives OpenTelemetry traces from Open-WebUI (gRPC on port 4317)
# Check pod status
kubectl get pods -n dev grafana-0
# Check ingress
kubectl get ingress -n dev grafana
# Verify VPN connection
tailscale status

# Check Grafana logs
kubectl logs -n dev grafana-0 -c grafana | grep -i ldap
# Verify FreeIPA is accessible
kubectl exec -n dev grafana-0 -c grafana -- \
nc -zv freeipa.dev.svc.cluster.local 636
# Test LDAP bind
kubectl exec -n dev grafana-0 -c grafana -- \
ldapsearch -x -H ldaps://freeipa.dev.svc.cluster.local:636 \
-D "uid=admin,cn=users,cn=accounts,dc=dev,dc=svc,dc=cluster,dc=local" \
-w "ADMIN_PASSWORD" \
-b "cn=users,cn=accounts,dc=dev,dc=svc,dc=cluster,dc=local"

Check git-sync sidecar logs:
kubectl logs -n monitoring grafana-0 -c git-sync -f

Common issues:
- Invalid Git token - Verify token has read permissions
- Repository not accessible - Check repository URL and network connectivity
- Branch name incorrect - Verify branch exists in repository
- Authentication failed - Regenerate GitHub Personal Access Token
Verify git-sync environment variables:
kubectl exec -n monitoring grafana-0 -c git-sync -- env | grep GIT_SYNC

Check dashboard provisioning:
# Verify provisioning config
kubectl exec -n monitoring grafana-0 -c grafana -- \
cat /etc/grafana/provisioning/dashboards/dashboards.yaml
# Check dashboard files in volume
kubectl exec -n monitoring grafana-0 -c grafana -- \
ls -la /var/lib/grafana/dashboards
# Verify Grafana can read dashboards
kubectl logs -n monitoring grafana-0 -c grafana | grep -i dashboard

If dashboards still don't appear:
- Verify JSON files are valid Grafana dashboard format
- Check file permissions in volume
- Restart Grafana pod to reload provisioning:
kubectl delete pod grafana-0 -n monitoring

Symptoms: All dashboards show "No data" but Grafana is working
Diagnosis:
# Check data source configuration
kubectl exec -n monitoring grafana-0 -c grafana -- \
cat /etc/grafana/provisioning/datasources/datasources.yaml
# Test Prometheus/Thanos endpoint
kubectl exec -n monitoring grafana-0 -c grafana -- \
curl "http://prometheus-server.monitoring.svc.cluster.local/api/v1/query?query=up"

Common Causes:
- Data source URL mismatch:
  - Thanos enabled but Grafana pointing to Prometheus
  - Thanos disabled but Grafana pointing to Thanos Query
- Dashboard queries have wrong prefix:
  - Check if queries use thanos_ prefix when they shouldn't
  - Check if queries use
- Metrics backend not ready:
  - Wait 2-5 minutes after deployment
  - Check Prometheus/Thanos pod logs
Solution:
Verify data source matches your Terraform configuration:
# Check if Thanos is enabled
kubectl get pods -n monitoring | grep thanos
# If Thanos pods exist: data source should be thanos-query:9090
# If no Thanos pods: data source should be prometheus-server

See Monitoring Guide for detailed troubleshooting.
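The pod check above can be turned into a scripted comparison against the URL that Grafana was actually provisioned with. The following is a minimal sketch, assuming the pod listing and the data source URL are passed in as strings; `check_datasource` is a hypothetical helper, not an existing tool.

```shell
# check_datasource PODS_LISTING DATASOURCE_URL
# Infers the expected backend from a `kubectl get pods` listing (Thanos if
# any pod name contains "thanos", otherwise Prometheus) and reports whether
# DATASOURCE_URL points at it. Returns 0 on a match, 1 on a mismatch.
check_datasource() {
  local pods=$1 url=$2 expected
  if printf '%s\n' "$pods" | grep -q 'thanos'; then
    expected="thanos-query"
  else
    expected="prometheus-server"
  fi
  case "$url" in
    *"$expected"*) echo "ok: data source matches $expected" ;;
    *) echo "mismatch: expected $expected, got $url"; return 1 ;;
  esac
}

# Example, with the URL copied from the provisioned datasources.yaml:
# check_datasource "$(kubectl get pods -n monitoring)" \
#   "http://prometheus-server.monitoring.svc.cluster.local"
```

The match is a plain substring test on the service name, so it will not notice a wrong port or namespace.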
- Admin password stored as Kubernetes secret
- LDAPS encryption for LDAP authentication
- VPN-only access via Tailscale
- IP allowlisting (100.64.0.0/10)
- TLS certificates via cert-manager
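The allowlisted range 100.64.0.0/10 is the address block Tailscale assigns node addresses from. A /10 prefix fixes the first octet and the top two bits of the second, so the range runs from 100.64.0.0 to 100.127.255.255. The sketch below checks a client address against that range when debugging a rejected connection; `in_tailscale_range` is a hypothetical helper and only inspects the first two octets.

```shell
# in_tailscale_range IPV4
# Returns 0 if IPV4 lies in 100.64.0.0/10, 1 if it lies outside,
# and 2 if the argument is not a dotted-decimal address.
in_tailscale_range() {
  local ip=$1 a b rest
  case "$ip" in
    *[!0-9.]*) return 2 ;;   # characters other than digits and dots
    *.*.*.*) ;;              # at least four dot-separated fields
    *) return 2 ;;
  esac
  a=${ip%%.*}                # first octet
  rest=${ip#*.}
  b=${rest%%.*}              # second octet
  [ -n "$a" ] && [ -n "$b" ] || return 2
  # First octet must be 100; second octet must be 64..127 (binary 01xxxxxx).
  [ "$a" -eq 100 ] && [ "$b" -ge 64 ] && [ "$b" -le 127 ]
}

# Example:
# in_tailscale_range 100.101.102.103 && echo "inside allowlist"
```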
grafana_cpu_limit = "2"
grafana_memory_limit = "2Gi"

- Use dashboard time range wisely
- Limit number of panels per dashboard
- Use Prometheus query optimizations
- Enable query caching