Priority: P2
Perspective: Microservices / Distributed Systems Developer
Why
"I have 20 pods running the same service. I want to see which pod has the worst GC, which is leaking memory, and compare them side-by-side."
Design
# Discover all Argus-enabled pods via K8s API or mDNS
argus cluster scan --namespace=production
# Aggregated health view
argus cluster health
╭─ Cluster Health ─────────────────────────────────────────────╮
│ Namespace: production    Pods: 20/20 healthy                 │
│                                                              │
│ Pod             Heap%   GC OH   CPU    Leak?    VThreads     │
│ order-svc-abc   72%     2.1%    34%    No       1,234        │
│ order-svc-def   89%     8.3%    67%    ⚠ Yes    2,456        │
│ order-svc-ghi   45%     0.8%    12%    No       890          │
│ ...                                                          │
╰──────────────────────────────────────────────────────────────╯
# Compare two specific pods
argus cluster compare order-svc-abc order-svc-def
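A minimal sketch of what the side-by-side comparison could compute, assuming each pod's metrics have already been collected into a name-to-value mapping. The function name `compare_pods` and the metric keys (`heap_pct`, `gc_overhead_pct`, `cpu_pct`) are hypothetical; the values are taken from the mock health table above.

```python
def compare_pods(a: dict[str, float], b: dict[str, float]) -> list[tuple[str, float, float, float]]:
    """Return (metric, value_a, value_b, delta) rows for metrics both pods report.

    delta is value_b - value_a, so a positive delta means pod b is higher.
    """
    rows = []
    for name in sorted(a.keys() & b.keys()):
        rows.append((name, a[name], b[name], b[name] - a[name]))
    return rows


# Hypothetical metric keys; values from the mock table (order-svc-abc vs order-svc-def).
abc = {"heap_pct": 72.0, "gc_overhead_pct": 2.1, "cpu_pct": 34.0}
dfe = {"heap_pct": 89.0, "gc_overhead_pct": 8.3, "cpu_pct": 67.0}

for name, left, right, delta in compare_pods(abc, dfe):
    print(f"{name:<18}{left:>8.1f}{right:>8.1f}{delta:>+8.1f}")
```

Only metrics present on both pods are compared here; how the real command should treat a metric missing on one side is an open design choice.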
Implementation
- K8s API: list pods with the argus.io/enabled=true label, then query each pod's /prometheus endpoint
- Non-K8s: manual target list via config file or mDNS discovery
- Aggregate metrics: min/max/avg/p99 across pods
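The scrape-and-aggregate steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not Argus's implementation: the metric names are hypothetical, the parser handles only unlabelled samples in the Prometheus text format, and p99 uses the nearest-rank method. Pod discovery and the HTTP request to /prometheus are left out so the block stays self-contained.

```python
import math
from statistics import mean


def parse_prometheus_text(body: str) -> dict[str, float]:
    """Parse unlabelled samples ("name value") from Prometheus text output.

    Comment lines (# HELP, # TYPE) and labelled samples are skipped.
    """
    samples = {}
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            samples[parts[0]] = float(parts[1])
        except ValueError:
            continue  # ignore malformed values
    return samples


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def aggregate(per_pod: dict[str, dict[str, float]]) -> dict[str, dict[str, float]]:
    """Compute min/max/avg/p99 for each metric across all pods."""
    by_metric: dict[str, list[float]] = {}
    for samples in per_pod.values():
        for name, value in samples.items():
            by_metric.setdefault(name, []).append(value)
    return {
        name: {
            "min": min(vals),
            "max": max(vals),
            "avg": mean(vals),
            "p99": percentile(vals, 99),
        }
        for name, vals in by_metric.items()
    }


# Hypothetical metric name; in practice each body would come from a pod's /prometheus endpoint.
bodies = {
    "order-svc-abc": "# TYPE argus_heap_used_percent gauge\nargus_heap_used_percent 72\n",
    "order-svc-def": "argus_heap_used_percent 89\n",
    "order-svc-ghi": "argus_heap_used_percent 45\n",
}
per_pod = {pod: parse_prometheus_text(body) for pod, body in bodies.items()}
summary = aggregate(per_pod)
```

With only a handful of pods, nearest-rank p99 coincides with the maximum, so the percentile becomes informative only for larger fleets.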
Impact
Fills a major gap — no JVM CLI tool offers multi-instance aggregated diagnostics. This is the "production-scale" story.