feat: phased decision report for orchestrator explainability by WellDunDun · Pull Request #48 · selftune-dev/selftune

WellDunDun · 2026-03-14T17:22:56Z

Summary

Adds formatOrchestrateReport() — a 5-phase human-readable decision report showing sync sources, status breakdown, per-skill decisions with reasons, evolution results with validation pass-rate changes, and watch outcomes including alerts/rollbacks
Enriches JSON stdout with a decisions array containing per-skill action, reason, deployment status, validation scores, watch alerts, and rollback status
Adds mode banners (DRY RUN / REVIEW / AUTONOMOUS) with rerun hints
13 new tests covering all report phases, mode banners, empty states, rollback tags, and rerun hints (29 total, all passing)

Supersedes #45 — absorbs its formatOrchestrateReport concept but fixes two issues: PR #45 introduced a summary.autoApprove field that doesn't exist (the real field is approvalMode), and referenced the deprecated --auto-approve flag in mode banners.

What users can now see

Why each skill was considered (status, pass rate, missed queries)
Why it was skipped (healthy, filtered, capped, no agent CLI, no SKILL.md)
Whether evolution passed validation (with before/after pass rates)
Whether it was deployed
Whether watch detected regression, and whether rollback occurred
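
As a rough illustration of the per-skill entries described in the Summary, the sketch below models one element of the decisions array and renders it as a single report line. The field names (`skill`, `deployed`, `validation`, `alerts`, `rolledBack`) and the `describeDecision` helper are hypothetical; the PR only states that each entry carries an action, a reason, deployment status, validation scores, watch alerts, and rollback status.

```typescript
// Hypothetical shape: field names are illustrative, not taken from the selftune source.
type Action = "evolve" | "watch" | "skip";

interface SkillDecision {
  skill: string;
  action: Action;
  reason: string;
  deployed?: boolean;
  validation?: { before: number; after: number }; // pass rates in [0, 1]
  alerts?: string[];
  rolledBack?: boolean;
}

// Format a pass rate such as 0.8 as "80%".
function pct(rate: number): string {
  return `${Math.round(rate * 100)}%`;
}

// Render one decision as a single human-readable line.
function describeDecision(d: SkillDecision): string {
  const parts = [`${d.skill}: ${d.action.toUpperCase()} (${d.reason})`];
  if (d.validation) {
    parts.push(`validation ${pct(d.validation.before)} -> ${pct(d.validation.after)}`);
  }
  if (d.deployed !== undefined) parts.push(d.deployed ? "deployed" : "not deployed");
  if (d.alerts && d.alerts.length > 0) parts.push(`alerts: ${d.alerts.join("; ")}`);
  if (d.rolledBack) parts.push("[ROLLED BACK]");
  return parts.join(" | ");
}
```

Keeping the reason next to the action in one record is what lets the same data feed both the stderr report and the JSON stdout.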

Test plan

All 29 orchestrate tests pass (1331 total across repo)
formatOrchestrateReport tested for all 3 modes, all 5 phases, empty states, rollback tags, rerun hints
Manual: selftune orchestrate --dry-run produces clear phased report on stderr
Manual: JSON stdout includes decisions array with per-skill reasoning

🤖 Generated with Claude Code

Deletes Conductor worktree branches (custom/prefix/router-*), selftune evolve test branches, and orphaned worktree-agent-* branches. Also prunes stale remote tracking refs. Run with `make clean-branches`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…w candidates Extends the composability analysis with positive interaction detection (synergy scores), ordered skill sequence extraction from usage timestamps, and automatic workflow candidate flagging. Backwards compatible — v1 function and tests unchanged, CLI falls back to v1 when no usage log exists. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Implements multi-skill workflow support: discovers workflow patterns from existing telemetry, displays them via `selftune workflows`, and codifies them to SKILL.md via `selftune workflows save`. Includes 48 tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…9613 Add workflow discovery and CLI command (v0.3)

BUG-1: Remove false-positive git hook checks, fix hook key names to PascalCase
BUG-2: Auto-derive expectations from SKILL.md when none provided
BUG-3: Add --help output to grade command documenting --session-id
BUG-4: Prefer skills_invoked over skills_triggered in session matching
BUG-5: Add pre-flight validation and human-readable errors to evolve
BUG-6: Distinguish real Skill tool calls from SKILL.md browsing reads
IMP-1: Confirmed templates/ in package.json files array
IMP-2: Auto-install agent files during init
IMP-3: Show UNGRADED instead of CRITICAL when no graded sessions exist
IMP-4: Use portable npx selftune hook <name> instead of absolute paths
IMP-5: Add selftune auto-grade command
IMP-6: Mandate AskUserQuestion in evolve workflows
IMP-7: Add selftune quickstart command
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Hook files guard execution behind import.meta.main, so dynamically importing them was a no-op. Spawn as subprocess instead so stdin payloads are processed and hooks write telemetry logs correctly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
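
A minimal sketch of the subprocess approach described above, using Node's `spawnSync` to pipe a JSON payload to a script on stdin. The `runHook` helper, the inline stand-in hook, and the `tool_name` payload field are illustrative assumptions, not selftune's actual hook runner.

```typescript
import { spawnSync } from "node:child_process";

// Runs a script as a child process and feeds it a JSON payload on stdin.
// `command` and `args` are placeholders; the real hook dispatch may differ.
function runHook(
  command: string,
  args: string[],
  payload: unknown,
): { status: number | null; stdout: string } {
  const result = spawnSync(command, args, {
    input: JSON.stringify(payload),
    encoding: "utf8",
  });
  return { status: result.status, stdout: result.stdout };
}

// Stand-in for a hook file: reads all of stdin, then echoes the tool name it received.
const fakeHook = `
let data = "";
process.stdin.on("data", (chunk) => (data += chunk));
process.stdin.on("end", () => process.stdout.write(JSON.parse(data).tool_name));
`;

const outcome = runHook(process.execPath, ["-e", fakeHook], { tool_name: "Skill" });
console.log(outcome.status, outcome.stdout);
```

Because the child is the entry point of its own process, a guard on `import.meta.main` inside a real hook file would pass, which is exactly what a dynamic `import()` from the parent cannot achieve.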

## Walkthrough

This PR adds new CLI commands (auto-grade, quickstart, hook), implements expectation auto-derivation from SKILL.md, distinguishes actual skill invocations from passive SKILL.md reads, adds an UNGRADED skill status, migrates hook checks to Claude Code settings.json, and enhances evolve CLI pre-flight validation, verbose logging, and post-run/error messaging.

## Changes

|Cohort / File(s)|Summary|
|---|---|
|**CLI Routing & Init** `cli/selftune/index.ts`, `cli/selftune/init.ts`|Adds `auto-grade`, `quickstart`, and `hook` command routing; `installAgentFiles()` to deploy bundled agent files; expands Claude Code hook detection (4 keys) and supports nested hook entries.|
|**Evolution CLI** `cli/selftune/evolution/evolve.ts`|Pre-flight checks for `--skill-path`/SKILL.md and eval-set paths, verbose config dump, enhanced post-run deployment/diagnostic messaging, and improved top-level error/troubleshooting output.|
|**Automated Grading** `cli/selftune/grading/auto-grade.ts` (new), `cli/selftune/grading/grade-session.ts`|New auto-grade CLI; adds `deriveExpectationsFromSkill()` (exported) to extract up to 5 expectations from SKILL.md and uses it when explicit evals are missing; session selection now prefers `skills_invoked`.|
|**Hook / Skill Eval** `cli/selftune/hooks/skill-eval.ts`, `cli/selftune/observability.ts`|Adds `hasSkillToolInvocation()` and updates `processToolUse` to set `triggered` based on detection; observability now checks Claude Code `settings.json` (flat and nested formats) instead of repo .git/hooks.|
|**Telemetry & Transcript** `cli/selftune/types.ts`, `cli/selftune/utils/transcript.ts`, `cli/selftune/ingestors/claude-replay.ts`|Adds `skills_invoked` to telemetry/transcript types and populates it from tool_use entries; ingestion prefers `skills_invoked` and sets `triggered` accordingly (fallback to previous behavior).|
|**Status / Badge** `cli/selftune/status.ts`, `cli/selftune/badge/badge-data.ts`|Introduces new SkillStatus value `UNGRADED` and extends BadgeData.status to include `UNGRADED`; status logic and display updated to treat skills with records but no triggered (graded) sessions as UNGRADED.|
|**Quickstart Onboarding** `cli/selftune/quickstart.ts` (new)|Adds three-step quickstart flow (init, replay, status) with `suggestSkillsToEvolve()` ranking ungraded/critical/warning skills.|
|**Eval set behavior** `cli/selftune/eval/hooks-to-evals.ts`|`buildEvalSet` now filters positives/negatives to only include records where `triggered === true`.|
|**Templates & Docs** `templates/*-settings.json`, `skill/Workflows/*.md`|Replaces hardcoded path-based hook invocations with `npx selftune hook <name>`; updates Evolve workflow docs to require AskUserQuestion for user inputs.|
|**Tests & Fixtures** `tests/*` (multiple)|Adds/updates tests for derived expectations, skills_invoked precedence, triggered=true/false cases, observability hook formats, UNGRADED state, and eval positives filtering; updates SKILL.md fixtures.|

## Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant CLI as auto-grade CLI
    participant Telemetry as Telemetry Log
    participant SkillMD as SKILL.md
    participant PreGates as Pre-Gates
    participant Agent as LLM Agent
    participant Output as JSON Output

    User->>CLI: auto-grade --skill X [--session-id Y]
    CLI->>Telemetry: Load telemetry log
    Telemetry-->>CLI: Sessions list
    CLI->>CLI: Resolve session (by ID or latest using skills_invoked)
    CLI->>SkillMD: Derive expectations (if needed)
    SkillMD-->>CLI: Expectations (derived or none)
    CLI->>PreGates: Run pre-gates on expectations
    alt All resolved
        PreGates-->>CLI: All expectations resolved
    else Some unresolved
        PreGates-->>CLI: Unresolved list
        CLI->>Agent: gradeViaAgent(unresolved)
        Agent-->>CLI: Graded expectations
    end
    CLI->>CLI: Compile results, metrics, summary
    CLI->>Output: Write GradingResult JSON
    Output-->>User: Summary + file path
```

```mermaid
sequenceDiagram
    participant User
    participant CLI as quickstart CLI
    participant Config as Config Check
    participant Replay as Replay Ingestor
    participant Telemetry as Telemetry Log
    participant Status as Status Engine
    participant Skills as Skills Scorer

    User->>CLI: quickstart
    CLI->>Config: Check config/marker
    alt Config missing
        CLI->>CLI: runInit()
    end
    CLI->>Replay: Check replay marker
    alt Marker missing
        Replay->>CLI: Ingest transcripts (parseTranscript -> skills_invoked)
        CLI->>Telemetry: Update logs/marker
    end
    CLI->>Status: Compute skill statuses (uses triggered & pass rates)
    Status-->>CLI: Formatted status
    CLI->>Skills: suggestSkillsToEvolve()
    Skills-->>CLI: Top ranked skills
    CLI->>User: Display suggestions
```

```mermaid
sequenceDiagram
    participant Transcript as File
    participant Parser as parseTranscript
    participant Detector as Invocation Detector
    participant Telemetry as Session Record
    participant Ingestor as claude-replay

    Parser->>Transcript: Read JSONL lines
    loop each message
        Parser->>Detector: Check tool_use entries
        alt tool_use.toolName == "Skill"
            Detector-->>Parser: Extract skill name
            Parser->>Parser: Add to skills_invoked[]
        else
            Detector-->>Parser: ignore
        end
    end
    Parser->>Telemetry: Emit TranscriptMetrics (skills_invoked)
    Ingestor->>Telemetry: Create SkillUsageRecord (use skills_invoked if present)
```

## Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

## Possibly related PRs

- **#23**: Overlaps evolve CLI changes in `cli/selftune/evolution/evolve.ts` (pre-flight, logging, messaging).
- **#13**: Related telemetry/ingestion changes around `skills_invoked` and session selection behavior.
- **#17**: Touches the same ingestor surface (`cli/selftune/ingestors/claude-replay.ts`) and skill-invocation/trigger semantics.
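
The invocation-detection step in the walkthrough (count a skill only when the Skill tool is actually called, not when SKILL.md is merely read) might look roughly like the sketch below. The `ToolUseEntry` shape, the `skill` input key, and the `collectSkillsInvoked` name are assumptions made for illustration; they are not the signatures of `hasSkillToolInvocation()` or `parseTranscript`.

```typescript
// Simplified, assumed entry shape; the real transcript schema may differ.
interface ToolUseEntry {
  toolName: string;
  input: Record<string, unknown>;
}

// Collect the names of skills that were invoked through the Skill tool.
// A Read of some/path/SKILL.md is deliberately ignored: browsing is not an invocation.
function collectSkillsInvoked(entries: ToolUseEntry[]): string[] {
  const invoked = new Set<string>();
  for (const entry of entries) {
    if (entry.toolName !== "Skill") continue;
    const name = entry.input["skill"]; // assumed key holding the skill name
    if (typeof name === "string" && name.length > 0) invoked.add(name);
  }
  return Array.from(invoked);
}

const sample: ToolUseEntry[] = [
  { toolName: "Read", input: { file_path: "skills/pdf/SKILL.md" } },
  { toolName: "Skill", input: { skill: "pdf" } },
  { toolName: "Skill", input: { skill: "pdf" } },
];
console.log(collectSkillsInvoked(sample));
```

A record would then be marked `triggered` only when its skill appears in this list, which keeps passive SKILL.md reads out of the pass-rate denominator.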

* Update selftune workflow docs and skill versioning * Improve selftune skill portability and setup docs * Clarify workflow doc edge cases * Fix OpenClaw doctor validation and workflow docs * Polish composability and setup docs

* Fix BUG-7, BUG-8, BUG-9 from demo findings
  BUG-7: Add try/catch + array validation around eval-set file loading in evolve() so parse errors surface as user-facing messages instead of silent exit.
  BUG-8: Add cold-start bootstrap. When extractFailurePatterns returns empty but the eval set has positive entries, treat those positives as missed queries so evolve can work on skills with zero usage history.
  BUG-9: Add --out flag to evals CLI parseArgs as alias for --output.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix evolve CI regressions
* Isolate blog proof fixture mutations
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
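
The BUG-7 and BUG-8 fixes can be sketched as two small functions. This is a simplified illustration: the `EvalEntry` fields (`query`, `should_trigger`) and both function names are assumptions, and failure patterns are reduced to plain strings.

```typescript
// Assumed eval entry shape; the real eval-set schema may carry more fields.
interface EvalEntry {
  query: string;
  should_trigger: boolean;
}

// BUG-7 idea: turn parse and shape problems into readable errors instead of a silent exit.
function parseEvalSet(raw: string): EvalEntry[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    throw new Error(`Eval set is not valid JSON: ${detail}`);
  }
  if (!Array.isArray(parsed)) {
    throw new Error("Eval set must be a JSON array of entries");
  }
  return parsed as EvalEntry[];
}

// BUG-8 idea: with no usage-derived failures, fall back to the eval set's positives.
function missedQueries(failurePatterns: string[], evalSet: EvalEntry[]): string[] {
  if (failurePatterns.length > 0) return failurePatterns;
  return evalSet.filter((e) => e.should_trigger).map((e) => e.query);
}
```

With this fallback, a skill that has never been used still yields a non-empty list of missed queries as long as its eval set contains at least one positive entry.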

* Fix dashboard export and layout * Improve telemetry normalization groundwork * Add test runner state * Separate and extract telemetry contract * Fix telemetry CI lint issues * Fix remaining CI regressions * Detect zero-trigger monitoring regressions * Stabilize dashboard report route tests * Address telemetry review feedback

* Fix telemetry follow-up edge cases * Fix rollback payload and Codex prompt attribution * Tighten Codex rollout prompt tracking

* Prepare 0.2.1 release * Update README install path * Use trusted publishing for npm

* feat: consume @selftune/telemetry-contract as workspace package
  Replace relative path imports of telemetry-contract with the published @selftune/telemetry-contract workspace package. Adds workspace config to package.json and expands tsconfig includes to cover packages/*.
  Closes SEL-10
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(telemetry-contract): add versioning, metadata, and golden fixtures
  Add version 1.0.0 and package metadata (description, author, license, repository) to the telemetry-contract package. Create golden fixture file with one valid example per record kind and a test suite that validates all fixtures against the contract validator.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Make selftune source-truth driven
* Harden live dashboard loading
* Audit cleanup: test split, docs, lint fixes
  - Add make test-fast / test-slow targets (5s vs 80s, 16x faster dev loop)
  - Add bun run test:fast / test:slow scripts in package.json
  - Reposition README as "Claude Code first", update competitive comparison
  - Bump PRD.md version to 0.2.1
  - Add CHANGELOG unreleased section (source-truth, telemetry-contract, test split)
  - Fix pre-existing lint: types.ts formatting, golden.test.ts import order
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: add sync flags and hook dispatch to integration guide
  - Document all selftune sync flags (--since, --dry-run, --force, etc.)
  - Add selftune hook dispatch command with all 6 hook names
  - Verified init, activation rules, and source-truth sections already current
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Harden LLM calls and fix pre-existing test failures
  Add exponential backoff retry to callViaAgent for transient subprocess failures. Cap JSONL health-check validation at 500 lines to prevent timeouts on large log files. Use exported DEFAULT_WINDOW_SESSIONS constant in dashboard data collection instead of telemetry.length.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
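
The exponential backoff retry mentioned in the last commit above can be sketched generically as follows. This is not the `callViaAgent` implementation; `withRetry`, `backoffDelay`, and the default attempt count and base delay are hypothetical choices for illustration.

```typescript
// Delay before retry number `attempt` (0-based): base, 2*base, 4*base, ...
function backoffDelay(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * Math.pow(2, attempt);
}

// Run `fn`, retrying on any thrown error up to `maxAttempts` total attempts.
// The last error is rethrown once attempts are exhausted.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        const delay = backoffDelay(attempt, baseDelayMs);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

A production version would usually retry only on errors known to be transient and add jitter to the delay; both are omitted here to keep the sketch short.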

* feat: add SQLite materialization layer for dashboard queries
  Add a local SQLite database (via bun:sqlite) as an indexed materialized view store so the dashboard/report UX no longer depends on recomputing everything from raw JSONL logs on every request.
  New module at cli/selftune/localdb/ with:
  - schema.ts: 10 tables + 19 indexes mirroring canonical telemetry and local log shapes
  - db.ts: openDb() lifecycle with WAL mode, meta key-value helpers
  - materialize.ts: full rebuild and incremental materialization from JSONL source-of-truth logs
  - queries.ts: getOverviewPayload(), getSkillReportPayload(), getSkillsList() query helpers
  Raw JSONL logs remain authoritative; the DB is a disposable cache that can always be rebuilt. No new npm dependencies (bun:sqlite only).
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve Biome lint and format errors
  Auto-fix import ordering, formatting, and replace non-null assertions with optional chaining in tests.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add selftune orchestrate command for autonomous core loop
  Introduces `selftune orchestrate`, a single entry point that chains sync → status → evolve → watch into one coordinated run. Defaults to dry-run mode with explicit --auto-approve for deployments.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address PR review - lint errors, logic bug, and type completeness
  - Replace string concatenation with template literals (Biome lint)
  - Add guard in evolve loop for agent-missing skip mutations
  - Replace non-null assertion with `as string` cast
  - Remove unused EvolutionAuditEntry import
  - Complete DoctorResult mock with required fields
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* style: apply Biome formatting and import sorting
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Bumps [oven-sh/setup-bun](https://github.com/oven-sh/setup-bun) from 2.1.2 to 2.1.3. - [Release notes](https://github.com/oven-sh/setup-bun/releases) - [Commits](oven-sh/setup-bun@3d26778...ecf28dd) --- updated-dependencies: - dependency-name: oven-sh/setup-bun dependency-version: 2.1.3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [actions/setup-node](https://github.com/actions/setup-node) from 6.2.0 to 6.3.0. - [Release notes](https://github.com/actions/setup-node/releases) - [Commits](actions/setup-node@6044e13...53b8394) --- updated-dependencies: - dependency-name: actions/setup-node dependency-version: 6.3.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.32.4 to 4.32.6. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](github/codeql-action@89a39a4...0d579ff) --- updated-dependencies: - dependency-name: github/codeql-action dependency-version: 4.32.6 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Improve sync progress and tighten query filtering * Fix biome formatting errors in sync.ts and query-filter.test.ts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add generic scheduling command and reposition OpenClaw cron as optional
  The primary automation story is now agent-agnostic. `selftune schedule` generates ready-to-use snippets for system cron, macOS launchd, and Linux systemd timers. `selftune cron` is repositioned as an optional OpenClaw integration rather than the main automation path.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit review - centralize schedule data, fix generators and formatting
  Derive SCHEDULE_ENTRIES from DEFAULT_CRON_JOBS (single source of truth), generate launchd/systemd configs for all 4 entries instead of sync-only, fix biome formatting, and add markdown language tag.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use StartCalendarInterval for fixed-time launchd and shell wrappers for chained commands
  - launchd: use StartCalendarInterval (Hour/Minute/Weekday) for fixed-time schedules instead of approximating with StartInterval
  - launchd/systemd: use /bin/sh -c wrapper for commands with && chains so prerequisite steps (like sync) are not silently dropped
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
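
The shell-wrapper fix in the last commit above comes down to how a command string is turned into an argument vector. A minimal sketch, assuming a hypothetical `toProgramArguments` helper and a naive whitespace split (no quoting support):

```typescript
// Build the argv for a scheduled job.
// launchd and systemd execute a program directly rather than through a shell,
// so "a && b" must be handed to /bin/sh -c for the chain to be honored.
function toProgramArguments(command: string): string[] {
  if (command.includes("&&")) {
    return ["/bin/sh", "-c", command];
  }
  // Naive split on whitespace; quoted arguments are not handled in this sketch.
  return command.split(/\s+/).filter((part) => part.length > 0);
}

console.log(toProgramArguments("selftune sync"));
console.log(toProgramArguments("selftune sync && selftune status"));
```

Without the wrapper, the first program would simply receive `&&` and everything after it as literal arguments, which is how a prerequisite step can end up skipped.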

* feat: add local dashboard SPA with React + Vite
  Introduces a minimal React SPA at apps/local-dashboard/ with two routes: overview (KPIs, skill health grid, evolution feed) and per-skill drilldown (pass rate, invocation breakdown, evaluation records). Consumes existing dashboard-server API endpoints with SSE live updates, explicit loading/error/empty states, and design tokens matching the current dashboard.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit review feedback for local dashboard
  Extract shared utils (deriveStatus, formatRate, timeAgo), add SSE exponential backoff with max retries, filter ungraded skills from avg pass rate, fix stuck loading state for undefined skillName, use word-boundary regex for evolution filtering, add focus-visible styles, add typecheck script, and add Vite env types.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address second round CodeRabbit review feedback
  Cancel pending SSE reconnect timers on cleanup, add stale-request guard to useSkillReport, remove redundant decodeURIComponent (React Router already decodes), quote font names in CSS for stylelint, and format deriveStatus signature for Biome.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: align local dashboard SPA with SQLite v2 data architecture
  Migrate SPA from old JSONL-reading /api/data endpoints to new SQLite-backed /api/v2/* endpoints. Add v2 server routes for overview and per-skill reports. Replace SSE with 15s polling. Rewrite types to match materialized query shapes from queries.ts.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit review feedback on local dashboard SPA
  - Add language identifier to HANDOFF.md fenced code block (MD040)
  - Prevent overlapping polls in useOverview with in-flight guard and sequential setTimeout
  - Broaden empty-state check in useSkillReport to include evolution/proposals
  - Fix Sessions KPI to use counts.sessions instead of counts.telemetry
  - Wrap materializeIncremental in try/catch to preserve last good snapshot on failure
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: sort imports to satisfy Biome organizeImports lint
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit nitpicks - cross-platform dev script, stricter types, CSS compat
  - Use concurrently for cross-platform dev script instead of shell backgrounding
  - Tighten Sidebar counts prop to Partial<Record<SkillHealthStatus, number>>
  - Replace color-mix() with rgba fallback for broader browser support
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add UNKNOWN status filter and extract header height CSS variable
  - Add UNKNOWN to STATUS_OPTIONS so all SkillHealthStatus values are filterable
  - Extract hardcoded 56px header height to --header-h CSS variable
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: hoist sidebar collapse state to layout and add UNKNOWN filter style
  - Lift collapsed state from Sidebar to Overview so grid columns resize properly
  - Add .sidebar-collapsed grid rules at all breakpoints
  - Fix mobile: collapsed sidebar no longer creates dead-end (shows inline)
  - Add .filter-pill.active.filter-unknown CSS rule for UNKNOWN status
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: serve SPA as default dashboard, legacy at /legacy/
  - Dashboard server now serves built SPA from apps/local-dashboard/dist/ at /
  - Legacy dashboard moved to /legacy/ route
  - SPA fallback for client-side routes (e.g. /skills/:name)
  - Static asset serving with content-hashed caching for /assets/*
  - Path traversal protection on static file serving
  - Add build:dashboard script to root package.json
  - Include apps/local-dashboard/dist/ in published files
  - Falls back to legacy dashboard if SPA build not found
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add shadcn theming with dark/light toggle and selftune branding
  Migrate dashboard to shadcn theme system with proper light/dark support. Dark mode uses selftune site colors (navy/cream/copper), light mode uses standard shadcn defaults. Add ThemeProvider with localStorage persistence, sun/moon toggle in site header, and SVG logo with currentColor for both themes.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: path traversal check and 404 for missing skills
  Use path.relative() + isAbsolute() instead of startsWith() for the SPA static asset path check to prevent directory traversal bypass. Return 404 from /api/v2/skills/:name when the skill has no usage data.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: biome formatting - semicolons, import order, line length
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit review - dedupe polling, fix stale closures, harden theme/config
  - Lift useOverview to DashboardShell, pass as prop to Overview (no double polling)
  - Fix stale closure in drag handler by deriving indices from prev state
  - Validate localStorage theme values, use undefined context default
  - Add relative positioning to theme toggle button for MoonIcon overlay
  - Fix falsy check hiding zero values in chart tooltip
  - Fix invalid Tailwind selectors in dropdown-menu and toggle-group
  - Use ESM-safe fileURLToPath instead of __dirname in vite.config
  - Switch manualChunks to function form for Base UI subpath matching
  - Align pass-rate threshold with deriveStatus in SkillReport
  - Use local theme provider in sonner instead of next-themes
  - Add missing React import in skeleton, remove unused Separator import
  - Include vite.config.ts in tsconfig for typecheck coverage
  - Fix inconsistent JSX formatting in select scroll buttons
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit review round 2 - shared sorting, DnD fixes, Tailwind v4 migration
  - Extract sortByPassRateAndChecks to utils.ts, dedupe sorting in App + Overview
  - Derive DnD dataIds from row model (not raw data), guard against -1 indexOf
  - Hide pagination when table is empty instead of showing "Page 1 of 0"
  - Fix ActivityTimeline default tab to prefer non-empty dataset
  - Import ReactNode directly instead of undeclared React namespace
  - Quote CSS attribute selector in chart style injection
  - Use stable composite keys for tooltip and legend items
  - Remove unnecessary "use client" directive from dropdown-menu (Vite SPA)
  - Migrate outline-none to outline-hidden for Tailwind v4 accessibility
  - Fix toggle-group orientation selectors to match data-orientation attribute
  - Add missing CSSProperties import in sonner.tsx
  - Add dark mode variant for SkillReport row highlight
  - Format vite.config.ts with Biome
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add evidence viewer, evolution timeline, and enhanced skill report
  Add EvidenceViewer, EvolutionTimeline, and InfoTip components. Enhance SkillReport with richer data display, expand dashboard server API endpoints, and update documentation and architecture docs.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit review round 3 - DnD/sort conflict, theme listener, formatting
  - Disable DnD reorder when table sorting is active (skill-health-grid)
  - Listen for OS theme preference changes when system theme is active
  - Apply Biome formatting to sortByPassRateAndChecks
  - Remove unused useEffect import from Overview
  - Deduplicate confidence filter in SkillReport
  - Materialize session IDs once in dashboard-server to avoid repeated subqueries
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: show selftune version in sidebar footer
  Pass version from API response through to AppSidebar and display it dynamically instead of hardcoded "dashboard v0.1".
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: biome formatting in dashboard-server - line length wrapping
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit review round 4 - dedupe formatRate, STATUS_CONFIG, cleanup
  - Remove duplicate formatRate from app-sidebar, import from @/utils
  - Extract STATUS_CONFIG to shared @/constants module, import in both skill-health-grid and SkillReport
  - Remove misleading '' fallback from sessionPlaceholders since the ternary guards already skip queries when empty
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: remove redundant items prop from Select to avoid duplication
  The SelectItem children already define the options; the items prop was duplicating them unnecessarily.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: add sortableKeyboardCoordinates to KeyboardSensor for proper keyboard DnD
  Without this, keyboard navigation moves by pixels instead of jumping between sortable items.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: Linear-style dashboard UX - collapsible sidebar, direct skill links, scope grouping
  - Simplify sidebar: remove status filters, keep logo + search + skills list
  - Add collapsible scope groups (Project/Global) using base-ui Collapsible
  - Surface skill_scope from DB query through API to dashboard types
  - Replace skill drawer with direct Link navigation to skill report
  - Add Scope column to skills table with filter dropdown
  - Slim down site header: remove breadcrumbs, reduce to sidebar trigger + theme toggle
  - Add side-by-side grid layout: skills table left, activity panel right
  - Gitignore pnpm-lock.yaml alongside bun.lock
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit review - accessibility, semantics, state reset
  - Remove bun.lock from .gitignore to maintain build reproducibility
  - Preserve unexpected scope values in sidebar (don't drop unrecognized scopes)
  - Add aria-label to skill search input for screen reader accessibility
  - Switch status filter from checkbox to radio-group semantics (mutually exclusive)
  - Reset selectedProposal when navigating between skills via useEffect on name
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add TanStack Query and optimize SQL queries for dashboard performance
  Migrate data fetching from manual polling/dedup hooks to TanStack Query for instant cached navigation, background refetch, and request dedup. Optimize SQL: replace NOT IN subqueries with LEFT JOIN, move JS dedup to GROUP BY, add LIMIT 200 to unbounded evidence queries.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: track root bun.lock for reproducible installs
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit review - collapsible sync, drag handle dedup, a11y, not-found heuristic
  - Make sidebar Collapsible controlled so it auto-opens when active skill changes (Comment #1)
  - Consolidate useSortable to single call per row via React context, use setActivatorNodeRef on drag handle button (Comment #2)
  - Remove capitalize CSS transform on free-form scope values (Comment #3)
  - Broaden isNotFound heuristic to check invocations, prompts, sessions in addition to evals/evolution/proposals (Comment #4)
  - Move Tooltip outside TabsTrigger to avoid nested interactive elements, use Base UI render prop for composition (Comment #5)
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit nitpicks - version pinning, changelog clarity, shared query helper
  - Use caret range for recharts version (^2.15.4) for consistency
  - Clarify changelog: SSE was removed, polling via refetchInterval is primary
  - Extract getPendingProposals() shared helper in queries.ts, used by both getOverviewPayload() and dashboard-server skill report endpoint
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit round 3 - deps, async fs, type safety, deterministic query
  - Move @tailwindcss/vite, tailwindcss, shadcn to devDependencies
  - Fix trailing space in version display when version is empty
  - Type caught error as unknown in refreshV2Data
  - Replace sync fs (readFileSync/statSync) with Bun.file() for hot-path asset serving
  - Return 404 for missing /assets/* files instead of falling through to SPA
  - Add details and eval_set fields to SkillReportPayload.evidence type
  - Fix nondeterministic GROUP BY with ROW_NUMBER() CTE in getPendingProposals
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve Biome lint and format errors in CI
  - Replace non-null assertion with type cast in useSkillReport (noNonNullAssertion)
  - Break long import line in dashboard-server.ts to satisfy Biome formatter
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit round 4 - CTE subqueries, type alignment, scope index
  - Replace dynamic bind-parameter expansion with CTE subquery for session lookups
  - Add skill_name to OverviewPayload.pending_proposals type to match runtime shape
  - Add composite index on skill_usage(skill_name, skill_scope, timestamp) for scope lookups
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit round 5 - startup guard, 404 heuristic, deterministic tiebreaker
  - Guard initial v2 materialization with try/catch to avoid full server crash
  - Include evidence in not-found check so evidence-only skills aren't 404'd
  - Add ea.id DESC tiebreaker to ROW_NUMBER() for deterministic pending proposals
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit round 6 - db guard, refresh throttle, deferred 404
  - Guard openDb() in try/catch so DB bootstrap failure doesn't crash server
  - Make db nullable, return 503 from /api/v2/* when store is unavailable
  - Throttle failed refresh attempts with separate lastV2RefreshAttemptAt timestamp
  - Move skill 404 check after enrichment queries (evolution, proposals, invocations)
  - Use optional chaining for db.close() on shutdown
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
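
The path traversal fix in the commits above (path.relative() + isAbsolute() instead of startsWith()) can be illustrated with a small standalone check. The `resolveInsideRoot` name and the example paths are hypothetical; only the technique comes from the commit message.

```typescript
import { isAbsolute, relative, resolve, sep } from "node:path";

// Resolve a request path against a static root.
// Returns the absolute file path, or null if the request escapes the root.
function resolveInsideRoot(root: string, requestPath: string): string | null {
  const absoluteRoot = resolve(root);
  // Strip leading slashes so "/assets/app.js" is treated as relative to the root.
  const target = resolve(absoluteRoot, requestPath.replace(/^\/+/, ""));
  const rel = relative(absoluteRoot, target);
  // Escapes if the relative path climbs out ("..") or lands on another drive (absolute).
  const escapes = rel === ".." || rel.startsWith(".." + sep) || isAbsolute(rel);
  return escapes ? null : target;
}

console.log(resolveInsideRoot("/srv/dist", "/assets/app.js"));
console.log(resolveInsideRoot("/srv/dist", "../dist-private/key"));
```

The second request shows why a prefix test is not enough: the resolved target `/srv/dist-private/key` starts with the string `/srv/dist`, so a `startsWith()` check would accept it, while the relative-path check rejects it.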

* Promote product planning docs * Add execution plans for product gaps and evals * Prepare SPA dashboard release path * Remove legacy dashboard runtime * Refresh execution plans after dashboard cutover * Build dashboard SPA in CI and publish * Refresh README for SPA release path * Address dashboard release review comments * Fix biome lint errors in dashboard tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Make autonomous loop the default scheduler path * Document orchestrate as the autonomous loop * Document autonomy-first setup path * Harden autonomous scheduler install paths * Clarify sync force usage in README --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Orchestrate output now explains each decision clearly so users can trust the autonomous loop. Adds formatOrchestrateReport() with 5-phase human report (sync, status, decisions, evolution, watch) and enriched JSON with per-skill decisions array. Supersedes PR #45. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
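As a rough illustration of the per-skill decisions array and how one phase of the report might render it, here is a minimal sketch. It is not the PR's formatOrchestrateReport(): the SkillDecision shape, the formatDecisions helper, the phase heading, and the tag wording are all assumptions, since the real types are not shown on this page.

```typescript
// Hypothetical shape of one entry in the `decisions` array (field names assumed).
interface SkillDecision {
  skill: string;
  action: "evolve" | "skip";
  reason: string;
  deployed?: boolean;
  rolledBack?: boolean;
}

// Render the decisions phase as plain text, one line per skill.
function formatDecisions(decisions: SkillDecision[]): string {
  const header = "Phase 3: Decisions";
  if (decisions.length === 0) return `${header}\n  (no skills considered)`;
  const lines = decisions.map((d) => {
    // Collect optional status tags such as deployment and rollback.
    const tags = [
      d.deployed ? "deployed" : "",
      d.rolledBack ? "ROLLED BACK" : "",
    ].filter(Boolean);
    const suffix = tags.length > 0 ? ` [${tags.join(", ")}]` : "";
    return `  ${d.action.toUpperCase().padEnd(6)} ${d.skill}: ${d.reason}${suffix}`;
  });
  return [header, ...lines].join("\n");
}
```

Keeping the structured array as the source of truth and deriving the human-readable text from it means the JSON on stdout and the report on stderr cannot drift apart.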

coderabbitai · 2026-03-14T17:23:06Z

Important

Review skipped

Too many files!

This PR contains 223 files, which is 73 over the limit of 150.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: d7fd61fa-649a-455b-bd0c-56e790b81884

📥 Commits

Reviewing files that changed from the base of the PR and between 88efd12 and 918eeb2.

⛔ Files ignored due to path filters (6)

apps/local-dashboard/bun.lock is excluded by !**/*.lock
apps/local-dashboard/package-lock.json is excluded by !**/package-lock.json
apps/local-dashboard/public/favicon.png is excluded by !**/*.png
apps/local-dashboard/public/logo.png is excluded by !**/*.png
apps/local-dashboard/public/logo.svg is excluded by !**/*.svg
bun.lock is excluded by !**/*.lock

📒 Files selected for processing (223)

.github/workflows/auto-bump-cli-version.yml
.github/workflows/ci.yml
.github/workflows/codeql.yml
.github/workflows/publish.yml
.github/workflows/scorecard.yml
.gitignore
ARCHITECTURE.md
CHANGELOG.md
Makefile
PRD.md
README.md
ROADMAP.md
apps/local-dashboard/.gitignore
apps/local-dashboard/HANDOFF.md
apps/local-dashboard/components.json
apps/local-dashboard/index.html
apps/local-dashboard/package.json
apps/local-dashboard/src/App.tsx
apps/local-dashboard/src/api.ts
apps/local-dashboard/src/components/ActivityTimeline.tsx
apps/local-dashboard/src/components/EvidenceViewer.tsx
apps/local-dashboard/src/components/EvolutionTimeline.tsx
apps/local-dashboard/src/components/InfoTip.tsx
apps/local-dashboard/src/components/app-sidebar.tsx
apps/local-dashboard/src/components/section-cards.tsx
apps/local-dashboard/src/components/site-header.tsx
apps/local-dashboard/src/components/skill-health-grid.tsx
apps/local-dashboard/src/components/theme-provider.tsx
apps/local-dashboard/src/components/theme-toggle.tsx
apps/local-dashboard/src/components/ui/avatar.tsx
apps/local-dashboard/src/components/ui/badge.tsx
apps/local-dashboard/src/components/ui/breadcrumb.tsx
apps/local-dashboard/src/components/ui/button.tsx
apps/local-dashboard/src/components/ui/card.tsx
apps/local-dashboard/src/components/ui/chart.tsx
apps/local-dashboard/src/components/ui/checkbox.tsx
apps/local-dashboard/src/components/ui/collapsible.tsx
apps/local-dashboard/src/components/ui/drawer.tsx
apps/local-dashboard/src/components/ui/dropdown-menu.tsx
apps/local-dashboard/src/components/ui/input.tsx
apps/local-dashboard/src/components/ui/label.tsx
apps/local-dashboard/src/components/ui/select.tsx
apps/local-dashboard/src/components/ui/separator.tsx
apps/local-dashboard/src/components/ui/sheet.tsx
apps/local-dashboard/src/components/ui/sidebar.tsx
apps/local-dashboard/src/components/ui/skeleton.tsx
apps/local-dashboard/src/components/ui/sonner.tsx
apps/local-dashboard/src/components/ui/table.tsx
apps/local-dashboard/src/components/ui/tabs.tsx
apps/local-dashboard/src/components/ui/toggle-group.tsx
apps/local-dashboard/src/components/ui/toggle.tsx
apps/local-dashboard/src/components/ui/tooltip.tsx
apps/local-dashboard/src/constants.tsx
apps/local-dashboard/src/hooks/use-mobile.ts
apps/local-dashboard/src/hooks/useOverview.ts
apps/local-dashboard/src/hooks/useSkillReport.ts
apps/local-dashboard/src/lib/utils.ts
apps/local-dashboard/src/main.tsx
apps/local-dashboard/src/pages/Overview.tsx
apps/local-dashboard/src/pages/SkillReport.tsx
apps/local-dashboard/src/styles.css
apps/local-dashboard/src/types.ts
apps/local-dashboard/src/utils.ts
apps/local-dashboard/src/vite-env.d.ts
apps/local-dashboard/tsconfig.json
apps/local-dashboard/vite.config.ts
biome.json
cli/selftune/badge/badge-data.ts
cli/selftune/badge/badge.ts
cli/selftune/canonical-export.ts
cli/selftune/constants.ts
cli/selftune/cron/setup.ts
cli/selftune/dashboard-contract.ts
cli/selftune/dashboard-server.ts
cli/selftune/dashboard.ts
cli/selftune/eval/baseline.ts
cli/selftune/eval/composability-v2.ts
cli/selftune/eval/hooks-to-evals.ts
cli/selftune/evolution/evidence.ts
cli/selftune/evolution/evolve-body.ts
cli/selftune/evolution/evolve.ts
cli/selftune/evolution/extract-patterns.ts
cli/selftune/grading/auto-grade.ts
cli/selftune/grading/grade-session.ts
cli/selftune/grading/results.ts
cli/selftune/hooks/prompt-log.ts
cli/selftune/hooks/session-stop.ts
cli/selftune/hooks/skill-eval.ts
cli/selftune/index.ts
cli/selftune/ingestors/claude-replay.ts
cli/selftune/ingestors/codex-rollout.ts
cli/selftune/ingestors/codex-wrapper.ts
cli/selftune/ingestors/openclaw-ingest.ts
cli/selftune/ingestors/opencode-ingest.ts
cli/selftune/init.ts
cli/selftune/last.ts
cli/selftune/localdb/db.ts
cli/selftune/localdb/materialize.ts
cli/selftune/localdb/queries.ts
cli/selftune/localdb/schema.ts
cli/selftune/monitoring/watch.ts
cli/selftune/normalization.ts
cli/selftune/observability.ts
cli/selftune/orchestrate.ts
cli/selftune/quickstart.ts
cli/selftune/repair/skill-usage.ts
cli/selftune/schedule.ts
cli/selftune/status.ts
cli/selftune/sync.ts
cli/selftune/types.ts
cli/selftune/utils/canonical-log.ts
cli/selftune/utils/hooks.ts
cli/selftune/utils/html.ts
cli/selftune/utils/llm-call.ts
cli/selftune/utils/math.ts
cli/selftune/utils/query-filter.ts
cli/selftune/utils/skill-discovery.ts
cli/selftune/utils/skill-log.ts
cli/selftune/utils/skill-usage-confidence.ts
cli/selftune/utils/transcript.ts
cli/selftune/workflows/discover.ts
cli/selftune/workflows/skill-md-writer.ts
cli/selftune/workflows/workflows.ts
dashboard/index.html
docs/design-docs/composability-v2.md
docs/design-docs/sandbox-claude-code.md
docs/design-docs/sandbox-test-harness.md
docs/design-docs/workflow-support.md
docs/escalation-policy.md
docs/exec-plans/active/grader-prompt-evals.md
docs/exec-plans/active/local-sqlite-materialization.md
docs/exec-plans/active/mcp-tool-descriptions.md
docs/exec-plans/active/multi-agent-sandbox.md
docs/exec-plans/active/product-reset-and-shipping.md
docs/exec-plans/active/telemetry-normalization.md
docs/exec-plans/completed/dashboard-spa-cutover.md
docs/exec-plans/reference/telemetry-field-map.md
docs/exec-plans/tech-debt-tracker.md
docs/integration-guide.md
package.json
packages/telemetry-contract/README.md
packages/telemetry-contract/fixtures/golden.json
packages/telemetry-contract/fixtures/golden.test.ts
packages/telemetry-contract/index.ts
packages/telemetry-contract/package.json
packages/telemetry-contract/src/index.ts
packages/telemetry-contract/src/types.ts
packages/telemetry-contract/src/validators.ts
skill/SKILL.md
skill/Workflows/Composability.md
skill/Workflows/Cron.md
skill/Workflows/Dashboard.md
skill/Workflows/Doctor.md
skill/Workflows/Evolve.md
skill/Workflows/EvolveBody.md
skill/Workflows/Initialize.md
skill/Workflows/Orchestrate.md
skill/Workflows/Schedule.md
skill/Workflows/Sync.md
skill/Workflows/Watch.md
skill/Workflows/Workflows.md
skill/assets/activation-rules-default.json
skill/assets/multi-skill-settings.json
skill/assets/single-skill-settings.json
skill/references/logs.md
skill/references/setup-patterns.md
skill/references/version-history.md
skill/settings_snippet.json
templates/multi-skill-settings.json
templates/single-skill-settings.json
test-results/.last-run.json
tests/blog-proof/fixtures/seo-audit/SKILL.md
tests/blog-proof/fixtures/seo-audit/SKILL.md.bak
tests/blog-proof/seo-audit-evolve.test.ts
tests/canonical-export.test.ts
tests/contribute/bundle.test.ts
tests/cron/setup.test.ts
tests/dashboard/badge-routes.test.ts
tests/dashboard/dashboard-server.test.ts
tests/dashboard/dashboard.test.ts
tests/eval/composability-v2.test.ts
tests/eval/hooks-to-evals.test.ts
tests/evolution/evidence.test.ts
tests/evolution/evolve-body.test.ts
tests/evolution/evolve.test.ts
tests/evolution/extract-patterns.test.ts
tests/grading/grade-session-flow.test.ts
tests/grading/grade-session.test.ts
tests/grading/results.test.ts
tests/hooks/prompt-log.test.ts
tests/hooks/session-stop.test.ts
tests/hooks/skill-eval.test.ts
tests/ingestors/claude-replay.test.ts
tests/ingestors/codex-rollout.test.ts
tests/ingestors/codex-wrapper.test.ts
tests/ingestors/openclaw-ingest.test.ts
tests/ingestors/opencode-ingest.test.ts
tests/init/init.test.ts
tests/localdb/localdb.test.ts
tests/monitoring/integration.test.ts
tests/monitoring/watch.test.ts
tests/normalization/normalization.test.ts
tests/observability.test.ts
tests/orchestrate.test.ts
tests/repair/skill-usage.test.ts
tests/sandbox/docker/run-openclaw-tests.ts
tests/sandbox/fixtures/openclaw/cron/jobs.json
tests/sandbox/run-sandbox.ts
tests/schedule/schedule.test.ts
tests/status/status.test.ts
tests/sync.test.ts
tests/telemetry-contract/validators.test.ts
tests/utils/canonical-log.test.ts
tests/utils/html.test.ts
tests/utils/llm-call.test.ts
tests/utils/query-filter.test.ts
tests/utils/skill-discovery.test.ts
tests/utils/skill-log.test.ts
tests/utils/transcript.test.ts
tests/workflows/discover.test.ts
tests/workflows/skill-md-writer.test.ts
tests/workflows/workflows.test.ts
tsconfig.json

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


WellDunDun and others added 30 commits March 8, 2026 15:17

fix: address PR review feedback for workflows

d393b6b

fix: stabilize biome config for CI lint

91c2d35

fix: address new PR review threads

ab6ac9c

Merge pull request #30 from WellDunDun/custom/prefix/router-1773047389613

7a48067

Add workflow discovery and CLI command (v0.3)

Address review comments

0d4265b

Fix lint CI failures

23ef652

Trigger CI rerun

ff1987f

Fix lint command resolution

eafab07

Address remaining review comments

967557a

Fix grade-session test isolation

89e06ce

Fix telemetry normalization edge cases (#34)

0d753ce

* Fix telemetry follow-up edge cases
* Fix rollback payload and Codex prompt attribution
* Tighten Codex rollout prompt tracking

Update npm package metadata (#35)

2601ac4

Prepare 0.2.1 release (#36)

0f176b3

* Prepare 0.2.1 release
* Update README install path
* Use trusted publishing for npm

WellDunDun and others added 3 commits March 14, 2026 07:36

WellDunDun wants to merge 33 commits into master from custom/prefix/router-1773508846626
