# feat: Add Grafana dashboard for monitoring

Provide a pre-built Grafana dashboard for monitoring gatekeeperd instances, with automatic provisioning support for Kubernetes deployments.

## Motivation

Gatekeeperd exposes Prometheus metrics (requests, latency, verification failures, relay stats, etc.) but there's no pre-built dashboard to visualize them. Users currently need to build dashboards from scratch.

## Proposed Approach

Provide the dashboard in multiple ways to support different deployment scenarios. These options are **complementary, not mutually exclusive**:

| Layer | What it provides | Who uses it |
|-------|------------------|-------------|
| Dashboard JSON file | Source of truth, manual import | Everyone |
| Helm ConfigMap | Auto-provisioning via Grafana sidecar | Kubernetes + Grafana Helm chart |
| ServiceMonitor | Auto-discovery of metrics endpoint | Kubernetes + Prometheus Operator |

A typical kube-prometheus-stack user would enable both Helm options. A Docker user would just grab the JSON file.

### 1. Dashboard JSON file (all users)

Add a standalone dashboard JSON file that can be imported manually:

```
dashboards/
  grafana-gatekeeperd.json
```

This works for any Grafana deployment (Kubernetes, Docker, bare metal).

### 2. Helm chart ConfigMap with sidecar label (Kubernetes users)

The standard Kubernetes pattern for Grafana dashboard provisioning uses a ConfigMap with a specific label. The Grafana Helm chart (and kube-prometheus-stack) includes a sidecar that watches for ConfigMaps labeled `grafana_dashboard: "1"` and automatically loads them.

Add to Helm values:

```yaml
grafana:
  # Create a ConfigMap with the dashboard for Grafana sidecar auto-discovery
  dashboard:
    enabled: false
    # Label for Grafana sidecar to discover the dashboard
    # Match your Grafana sidecar configuration (default: grafana_dashboard)
    sidecarLabel: grafana_dashboard
    # Namespace where Grafana is deployed (for cross-namespace discovery)
    # Leave empty to create in the release namespace
    namespace: ""
    # Additional labels for the ConfigMap
    labels: {}
    # Additional annotations for the ConfigMap
    annotations: {}
```

Add template `charts/gatekeeperd/templates/grafana-dashboard.yaml`:

```yaml
{{- if .Values.grafana.dashboard.enabled }}
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "gatekeeperd.fullname" . }}-grafana-dashboard
  {{- if .Values.grafana.dashboard.namespace }}
  namespace: {{ .Values.grafana.dashboard.namespace }}
  {{- end }}
  labels:
    {{- include "gatekeeperd.labels" . | nindent 4 }}
    {{ .Values.grafana.dashboard.sidecarLabel }}: "1"
    {{- with .Values.grafana.dashboard.labels }}
    {{- toYaml . | nindent 4 }}
    {{- end }}
  {{- with .Values.grafana.dashboard.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
data:
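  # .Files.Get resolves paths relative to the chart root, so the dashboard JSON
  # must also be present inside the chart (e.g. copied in at package time).
  gatekeeperd.json: |-
    {{- .Files.Get "dashboards/grafana-gatekeeperd.json" | nindent 4 }}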
{{- end }}
```
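As a usage sketch, a user whose Grafana sidecar runs in another namespace might override the values above as follows. The `monitoring` namespace is a hypothetical example, not something this chart defines:

```yaml
# Hypothetical override file for `helm install -f`
grafana:
  dashboard:
    enabled: true
    # Must match the label the Grafana sidecar is configured to watch
    sidecarLabel: grafana_dashboard
    # ASSUMPTION: Grafana is deployed in a namespace called "monitoring"
    namespace: monitoring
```

Leaving `namespace` empty keeps the ConfigMap in the release namespace, which is enough when the sidecar watches all namespaces.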

### 3. ServiceMonitor for Prometheus Operator (optional, related)

While we're adding observability features, consider also adding a ServiceMonitor for Prometheus Operator users. This is separate from the dashboard but often requested together:

```yaml
serviceMonitor:
  enabled: false
  # Namespace for the ServiceMonitor (defaults to release namespace)
  namespace: ""
  # Interval for scraping metrics
  interval: 30s
  # Additional labels for ServiceMonitor (e.g., for Prometheus selection)
  labels: {}
```
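A matching template could look roughly like the sketch below. It only consumes the values shown above. The `gatekeeperd.selectorLabels` helper, the `metrics` port name, and the `/metrics` path are assumptions for illustration and would need to match the chart's actual Service definition:

```yaml
{{- if .Values.serviceMonitor.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: {{ include "gatekeeperd.fullname" . }}
  namespace: {{ .Values.serviceMonitor.namespace | default .Release.Namespace }}
  labels:
    {{- include "gatekeeperd.labels" . | nindent 4 }}
    {{- with .Values.serviceMonitor.labels }}
    {{- toYaml . | nindent 4 }}
    {{- end }}
spec:
  # Point back at the release namespace in case the ServiceMonitor lives elsewhere
  namespaceSelector:
    matchNames:
      - {{ .Release.Namespace }}
  selector:
    matchLabels:
      # ASSUMPTION: the chart defines a selectorLabels helper
      {{- include "gatekeeperd.selectorLabels" . | nindent 6 }}
  endpoints:
    - port: metrics    # ASSUMPTION: name of the Service port exposing metrics
      path: /metrics   # ASSUMPTION: metrics path
      interval: {{ .Values.serviceMonitor.interval }}
{{- end }}
```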

## Dashboard Panels

The dashboard should include panels for:

**Overview Row**
- Request rate (total requests/sec)
- Success rate (2xx/3xx percentage)
- Error rate (4xx/5xx)
- Active relay clients

**Request Metrics Row**
- Requests by hostname (stacked area)
- Requests by status code (stacked bar)
- Request latency (p50, p95, p99)
- Request latency heatmap

**Security Row**
- Verification failures by verifier and reason
- IP filter denials by allowlist
- Validation failures

**Relay Row** (if relay is used)
- Webhooks queued vs delivered
- Relay delivery latency
- Delivery errors by reason
- Pending queue depth (Redis mode)
- Connected clients per token

**System Row**
- IP ranges loaded per allowlist
- IP range fetch errors

## Variables

Dashboard should include template variables:
- `datasource` - Prometheus datasource selector
- `hostname` - Filter by webhook hostname
- `namespace` - Kubernetes namespace (for multi-tenant clusters)
- `instance` - Pod instance selector

## Alternatives Considered

**Separate Helm chart for dashboard**
- Overkill; a single ConfigMap doesn't warrant a separate chart

**Grafana API provisioning**
- Requires Grafana credentials
- Not the Kubernetes-native approach
- Less portable

**Dashboard embedded in docs only**
- Harder to keep in sync with metrics changes
- No automatic provisioning

## Acceptance Criteria

- [ ] Dashboard JSON file at `dashboards/grafana-gatekeeperd.json`
- [ ] Dashboard covers all metrics from `internal/metrics/metrics.go`
- [ ] Helm values for enabling dashboard ConfigMap
- [ ] ConfigMap template with sidecar label
- [ ] Dashboard uses variables for datasource, hostname, namespace, instance
- [ ] Documentation in README or docs/ explaining how to import the JSON file and how to enable the Helm ConfigMap
- [ ] (Optional) ServiceMonitor template for Prometheus Operator

## Notes

- The Grafana sidecar approach requires Grafana to be configured with sidecar enabled (this is the default in kube-prometheus-stack)
- For cross-namespace dashboard discovery, Grafana's sidecar needs RBAC to list ConfigMaps in other namespaces
- Dashboard JSON should be validated with Grafana's dashboard schema
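For reference, the sidecar settings mentioned in these notes might look like the following in a kube-prometheus-stack values file. This is a sketch based on the key names as I understand them from the Grafana Helm chart (`sidecar.dashboards.*`); check them against the chart version in use:

```yaml
# kube-prometheus-stack values (Grafana is configured under the `grafana` key)
grafana:
  sidecar:
    dashboards:
      enabled: true
      # Must equal gatekeeperd's grafana.dashboard.sidecarLabel
      label: grafana_dashboard
      labelValue: "1"
      # Watch ConfigMaps in every namespace; requires cluster-wide RBAC
      searchNamespace: ALL
```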

