Add real-time observability dashboard at /observe #27
oelor wants to merge 4 commits into grp06:main
When the gateway is busy processing cron jobs (~60% of the time), the initial WebSocket handshake times out and the Studio shows "No agents available" with no recovery. Add exponential backoff retry (2s → 30s, up to 20 attempts) that automatically reconnects when the gateway becomes available. Auth errors are not retried. Manual disconnect suppresses auto-retry.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
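The retry policy described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function and constant names are hypothetical, and the doubling factor is an assumption, since the message only states the 2s starting delay, the 30s cap, and the 20-attempt limit.

```typescript
// Hypothetical names; the PR's actual implementation may differ.
const INITIAL_DELAY_MS = 2000; // first retry after 2s (from the commit message)
const MAX_DELAY_MS = 30000; // delays are capped at 30s (from the commit message)
const MAX_ATTEMPTS = 20; // give up after 20 attempts (from the commit message)

/**
 * Delay before retry number `attempt` (0-based).
 * Assumes the delay doubles each time; the growth factor is not stated in the PR.
 */
function retryDelayMs(attempt: number): number {
  return Math.min(INITIAL_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
}

interface RetryState {
  attempt: number; // retries already made
  manualDisconnect: boolean; // user explicitly disconnected
  authError: boolean; // last failure was a credential problem
}

/** Decide whether another automatic reconnect should be scheduled. */
function shouldRetry(state: RetryState): boolean {
  if (state.manualDisconnect) return false; // manual disconnect suppresses auto-retry
  if (state.authError) return false; // auth errors are not retried
  return state.attempt < MAX_ATTEMPTS;
}
```

Under these assumptions the delays run 2s, 4s, 8s, 16s and then stay at 30s for the remaining attempts.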
New standalone page that subscribes to ALL gateway events (including cron jobs and isolated sessions) without session-key filtering. Provides live activity feed, session status cards with origin badges, and intervention alerts for errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 727321ff67
    if (!didAutoConnect.current) return;
    if (wasManualDisconnectRef.current) return;
    if (!gatewayUrl.trim()) return;
    if (isAuthError(error)) return;
Detect credential failures before auto-retrying reconnects
useGatewayConnection intends to suppress retries for auth problems via isAuthError(error), but failed handshakes currently surface as the generic message "Gateway closed (4008): connect failed", so this guard never matches real invalid-token/forbidden cases. I traced this to GatewayBrowserClient.sendConnect, which catches the connect RPC failure and always closes the socket with reason "connect failed" (see src/lib/gateway/openclaw/GatewayBrowserClient.ts around the catch at lines 505-510). As a result, the new retry loop runs up to 20 times even for credential errors instead of stopping immediately.
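One possible direction for addressing this, sketched under assumptions: keep the underlying failure text in the close reason so a classifier has something to match. The helpers below (looksLikeAuthError, closeReasonFor) and the patterns they match are hypothetical and are not taken from the repository; the real isAuthError may use different criteria.

```typescript
// Hypothetical sketch: none of these helpers come from the PR or the repository.
const AUTH_PATTERNS: RegExp[] = [/unauthorized/i, /forbidden/i, /invalid token/i];

/** Assumed classifier: treats a message as an auth failure if it matches a known pattern. */
function looksLikeAuthError(message: string): boolean {
  return AUTH_PATTERNS.some((pattern) => pattern.test(message));
}

/** Build a close reason that keeps the underlying failure visible to callers. */
function closeReasonFor(err: unknown): string {
  const detail = err instanceof Error ? err.message : String(err);
  let reason = `connect failed: ${detail}`;
  // WebSocket close reasons may be at most 123 bytes of UTF-8, so trim until it fits.
  const encoder = new TextEncoder();
  while (encoder.encode(reason).length > 123) {
    reason = reason.slice(0, -1);
  }
  return reason;
}
```

With a reason built this way, a guard like the one quoted above could stop after the first credential failure, while the generic "connect failed" message would still be retried.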
…context
- Activity feed now shows plain English descriptions instead of raw protocol events: "Reading LEARNINGS.md", "Running: git status", "Spawning subagent"
- Added LiveOutputPanel showing streaming text from the active session
- Session cards now display current activity, tool name, streaming preview
- Header shows high-level summary of what's running across all sessions
- Fixed UNKNOWN origin badges: infers cron/interactive from session keys
- Studio link opens in new tab to avoid navigating away from /observe
- Filtered out noisy heartbeat/presence/delta events from feed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
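The plain-English feed entries quoted in this commit could be produced by a mapping along these lines. This is only a sketch: the ToolCall shape, the tool names ("read", "exec", "spawn"), and the argument keys are assumptions made for illustration, not the gateway's actual protocol.

```typescript
// Hypothetical event shape; the real gateway payload is not shown in the PR.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

/** Turn a raw tool call into a short human-readable feed entry. */
function describeToolCall(call: ToolCall): string {
  switch (call.name) {
    case "read": {
      // Assumed argument key "path"; show only the file name, not the full path.
      const path = call.args["path"];
      if (typeof path !== "string") return "Reading file";
      return `Reading ${path.split("/").pop() ?? path}`;
    }
    case "exec": {
      // Assumed argument key "command".
      const command = call.args["command"];
      return typeof command === "string" ? `Running: ${command}` : "Running command";
    }
    case "spawn":
      return "Spawning subagent";
    default:
      // Fall back to the raw tool name for anything unrecognized.
      return `Using tool: ${call.name}`;
  }
}
```

Keeping a generic fallback means unfamiliar tools still appear in the feed rather than being dropped.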
- Added /api/observe/context endpoint reading filesystem data: recent memory summaries, active initiatives, task queue
- Left panel: session cards + cron schedule with next run times
- Center: live output when active, last session preview when idle
- Right panel: strategic focus areas from INITIATIVES.md + recent memory showing topics, actions, and tools from hourly summaries
- Cron jobs loaded via gateway cron.list, auto-refresh every 30s
- Session previews via sessions.preview for recent activity context
- Human-readable activity feed with tool call descriptions
- Live streaming text panel for active sessions
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- /observe route providing real-time visibility into all gateway events (including cron jobs and isolated sessions)
Technical Details
- useGatewayConnection, createRafBatcher, classifyGatewayEventKind, parseAgentIdFromSessionKey
- sessions.list with includeGlobal: true
Test plan
- tsc --noEmit: clean
- eslint .: clean
- vitest run: all 233 tests pass
- /observe and verify live events appear during cron job execution
🤖 Generated with Claude Code