
Hydra: Local-First Dev Observability

Problem Statement

Developers are blind when running services locally.

Production has Datadog, Grafana, fancy dashboards. Local dev has println and docker logs -f.

The observability gap between prod and local is massive:

  • No traces across local microservices
  • No correlated logs between containers
  • No metrics unless you set up Prometheus yourself
  • No way to answer "why is this slow?" without adding debug statements

Setting up local observability today means:

  • Installing Jaeger, Prometheus, Grafana, Loki
  • Writing config files for each
  • Configuring exporters in every service
  • Managing 5+ containers just to see what's happening

Most developers give up and just use print statements.


Solution

A single binary that gives you full observability for local development.

# That's it. You're done.
docker run -p 8080:8080 -p 4317:4317 hydra

Open localhost:8080. See logs, traces, and metrics from all your services. Zero config.


Core Principles

  1. Zero config by default. Works out of the box. No YAML. No setup wizards. Just run it.

  2. Single container. One image. One port for the UI. One port for OTLP. Nothing else.

  3. Ephemeral by design. Data lives in memory. Restart clears everything. This is a feature, not a bug.

  4. Auto-correlation. Logs, traces, and metrics are linked automatically. Click a trace, see its logs. Click an error, see the span.

  5. Framework auto-detection. Detects common frameworks and shows relevant views. Spring Boot? Show HTTP endpoints. Express? Show routes.

  6. Fast startup. Under 2 seconds to ready. Developers restart constantly. Every second matters.


Non-Goals

  • Production deployment
  • Long-term storage
  • Horizontal scaling
  • Multi-tenant support
  • Alerting
  • Dashboards you configure

This is a dev tool. It dies when you close your laptop. That's fine.


User Experience

Starting

# Option 1: Docker
docker run -p 8080:8080 -p 4317:4317 ghcr.io/hydra/hydra

# Option 2: Binary
hydra

# Option 3: In your docker-compose
services:
  hydra:
    image: ghcr.io/hydra/hydra
    ports:
      - "8080:8080"
      - "4317:4317"
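
Inside a compose network, your own services reach Hydra by its service name rather than localhost. A hedged sketch extending the snippet above (the `payment-svc` service, its build path, and env values are illustrative, not part of Hydra):

```yaml
services:
  hydra:
    image: ghcr.io/hydra/hydra
    ports:
      - "8080:8080"
      - "4317:4317"
  payment-svc:                 # illustrative service of your own
    build: ./payment-svc
    environment:
      # inside the compose network, address Hydra by service name
      OTEL_EXPORTER_OTLP_ENDPOINT: http://hydra:4317
    depends_on:
      - hydra
```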

Instrumenting Your App

Point your OTEL SDK at localhost:4317. Done.

# Python
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://localhost:4317"

// Node
process.env.OTEL_EXPORTER_OTLP_ENDPOINT = "http://localhost:4317"

// Java - system property
-Dotel.exporter.otlp.endpoint=http://localhost:4317

Or use one of the auto-instrumentation agents that already exist for your language.
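
The variables above are the standard OpenTelemetry SDK environment variables, so any OTEL SDK picks them up without code changes beyond setting them before startup. A minimal Python sketch (the `OTEL_SERVICE_NAME` value is illustrative; it is what would label the service in Hydra's UI):

```python
import os

# Standard OpenTelemetry environment variables; any OTEL SDK reads these.
# The endpoint is Hydra's default OTLP port from this document.
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://localhost:4317"
os.environ["OTEL_SERVICE_NAME"] = "payment-svc"  # illustrative name

# Set these before the SDK (or auto-instrumentation agent) initializes.
assert os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"].endswith(":4317")
```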

The UI

Open localhost:8080:

┌─────────────────────────────────────────────────────────────┐
│  HYDRA                                    [Services ▼] [⚙]  │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  LIVE FEED                                                  │
│  ─────────────────────────────────────────────────────────  │
│  14:32:01 [api-gateway]     POST /checkout 200 142ms   [→]  │
│  14:32:01 [payment-svc]     Processing payment...      [→]  │
│  14:32:02 [payment-svc]     ERROR: Connection refused  [→]  │
│  14:32:02 [api-gateway]     POST /checkout 500 1.2s    [→]  │
│                                                             │
│  ─────────────────────────────────────────────────────────  │
│  [→] = Click to see full trace + related logs               │
│                                                             │
├─────────────────────────────────────────────────────────────┤
│  SERVICES (auto-discovered)                                 │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐        │
│  │ api-gateway  │ │ payment-svc  │ │ inventory    │        │
│  │ 142 req/min  │ │ 89 req/min   │ │ 34 req/min   │        │
│  │ 2 errors     │ │ 5 errors     │ │ 0 errors     │        │
│  └──────────────┘ └──────────────┘ └──────────────┘        │
└─────────────────────────────────────────────────────────────┘

Click any request → see the full trace with logs inline:

┌─────────────────────────────────────────────────────────────┐
│  TRACE: POST /checkout                                      │
│  ─────────────────────────────────────────────────────────  │
│                                                             │
│  api-gateway ████████████████████████████████████░░ 1.2s    │
│    ├─ LOG: Received checkout request user_id=123            │
│    │                                                        │
│    └─ payment-svc █████████████████████████████░░░ 1.1s     │
│         ├─ LOG: Processing payment amount=99.00             │
│         ├─ LOG: Connecting to payment gateway...            │
│         ├─ LOG: ERROR: Connection refused                   │
│         └─ LOG: Retrying... (attempt 2/3)                   │
│                                                             │
│  [Copy trace ID]  [Export as JSON]  [Show raw spans]        │
└─────────────────────────────────────────────────────────────┘

Architecture

┌─────────────────────────────────────────┐
│              Your Services              │
│  (with OTEL SDK pointing to :4317)      │
└──────────────────┬──────────────────────┘
                   │ OTLP (gRPC/HTTP)
                   ▼
┌─────────────────────────────────────────┐
│               HYDRA                     │
│  ┌───────────┐  ┌───────────────────┐   │
│  │  OTLP     │  │  In-Memory Store  │   │
│  │  Receiver │─▶│  - Traces         │   │
│  │  :4317    │  │  - Logs           │   │
│  └───────────┘  │  - Metrics        │   │
│                 │  - Correlations   │   │
│  ┌───────────┐  └─────────┬─────────┘   │
│  │  Web UI   │◀───────────┘             │
│  │  :8080    │  (WebSocket for live)    │
│  └───────────┘                          │
└─────────────────────────────────────────┘

All in one process. All in memory. Simple.


Technology Choices

  • Language: Rust
  • OTLP: opentelemetry-rust + tonic
  • Web server: axum
  • UI: Embedded SPA (Svelte or Solid, compiled into binary)
  • Storage: In-memory (bounded ring buffers)
  • WebSocket: Live streaming updates to UI

Data Model

Retention

  • Keep last N traces (default: 1000)
  • Keep last N logs per service (default: 10000)
  • Keep last 1 hour of metrics
  • Ring buffer eviction (oldest out)

Correlation

Every piece of data is linked:

Trace
  └─ Spans[]
       └─ Logs[] (by trace_id + span_id + time window)
       └─ Metrics[] (by service + time window)

Correlation happens at ingest, not query time.


MVP Scope

Phase 1: Core

  • OTLP gRPC receiver (traces + logs)
  • In-memory storage with ring buffers
  • Basic web UI (live feed, trace view)
  • Auto-correlation of logs to traces
  • Single binary, single container

Phase 2: Polish

  • OTLP HTTP receiver
  • Metrics support
  • Service dependency graph
  • Search/filter in UI
  • Dark mode (obviously)

Phase 3: DX

  • hydra init - generates a docker-compose snippet
  • Framework detection (show relevant info per framework)
  • "Why is this slow?" button (highlights slow spans)
  • Export trace to share with teammate

Success Criteria

The project succeeds if:

  • A developer can go from zero to seeing traces in under 60 seconds
  • No documentation is needed for basic usage
  • It uses less than 200MB of memory for typical local dev
  • Developers actually prefer it over print statements

What This Is Not

  • Not a Jaeger replacement (Jaeger is for prod)
  • Not a Grafana replacement (Grafana is for dashboards)
  • Not a log aggregator (use Loki for that)
  • Not trying to scale (scaling is a non-goal)

This is observability for localhost. Nothing more. Nothing less.