feat: phased decision report for orchestrator explainability #48
Open
WellDunDun wants to merge 33 commits into master
Deletes Conductor worktree branches (custom/prefix/router-*), selftune evolve test branches, and orphaned worktree-agent-* branches. Also prunes stale remote tracking refs. Run with `make clean-branches`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…w candidates Extends the composability analysis with positive interaction detection (synergy scores), ordered skill sequence extraction from usage timestamps, and automatic workflow candidate flagging. Backwards compatible — v1 function and tests unchanged, CLI falls back to v1 when no usage log exists. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements multi-skill workflow support: discovers workflow patterns from existing telemetry, displays them via `selftune workflows`, and codifies them to SKILL.md via `selftune workflows save`. Includes 48 tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…9613 Add workflow discovery and CLI command (v0.3)
- BUG-1: Remove false-positive git hook checks, fix hook key names to PascalCase
- BUG-2: Auto-derive expectations from SKILL.md when none provided
- BUG-3: Add --help output to grade command documenting --session-id
- BUG-4: Prefer skills_invoked over skills_triggered in session matching
- BUG-5: Add pre-flight validation and human-readable errors to evolve
- BUG-6: Distinguish real Skill tool calls from SKILL.md browsing reads
- IMP-1: Confirmed templates/ in package.json files array
- IMP-2: Auto-install agent files during init
- IMP-3: Show UNGRADED instead of CRITICAL when no graded sessions exist
- IMP-4: Use portable npx selftune hook <name> instead of absolute paths
- IMP-5: Add selftune auto-grade command
- IMP-6: Mandate AskUserQuestion in evolve workflows
- IMP-7: Add selftune quickstart command

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Hook files guard execution behind import.meta.main, so dynamically importing them was a no-op. Spawn as subprocess instead so stdin payloads are processed and hooks write telemetry logs correctly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Walkthrough
This PR adds new CLI commands (auto-grade, quickstart, hook), implements expectation auto-derivation from SKILL.md, distinguishes actual skill invocations from passive SKILL.md reads, adds an UNGRADED skill status, migrates hook checks to Claude Code settings.json, and enhances evolve CLI pre-flight validation, verbose logging, and post-run/error messaging.
## Changes
|Cohort / File(s)|Summary|
|---|---|
|**CLI Routing & Init** <br> `cli/selftune/index.ts`, `cli/selftune/init.ts`|Adds `auto-grade`, `quickstart`, and `hook` command routing; `installAgentFiles()` to deploy bundled agent files; expands Claude Code hook detection (4 keys) and supports nested hook entries.|
|**Evolution CLI** <br> `cli/selftune/evolution/evolve.ts`|Pre-flight checks for `--skill-path`/SKILL.md and eval-set paths, verbose config dump, enhanced post-run deployment/diagnostic messaging, and improved top-level error/troubleshooting output.|
|**Automated Grading** <br> `cli/selftune/grading/auto-grade.ts` (new), `cli/selftune/grading/grade-session.ts`|New auto-grade CLI; adds `deriveExpectationsFromSkill()` (exported) to extract up to 5 expectations from SKILL.md and uses it when explicit evals are missing; session selection now prefers `skills_invoked`.|
|**Hook / Skill Eval** <br> `cli/selftune/hooks/skill-eval.ts`, `cli/selftune/observability.ts`|Adds `hasSkillToolInvocation()` and updates `processToolUse` to set `triggered` based on detection; observability now checks Claude Code `settings.json` (flat and nested formats) instead of repo .git/hooks.|
|**Telemetry & Transcript** <br> `cli/selftune/types.ts`, `cli/selftune/utils/transcript.ts`, `cli/selftune/ingestors/claude-replay.ts`|Adds `skills_invoked` to telemetry/transcript types and populates it from tool_use entries; ingestion prefers `skills_invoked` and sets `triggered` accordingly (fallback to previous behavior).|
|**Status / Badge** <br> `cli/selftune/status.ts`, `cli/selftune/badge/badge-data.ts`|Introduces new SkillStatus value `UNGRADED` and extends BadgeData.status to include `UNGRADED`; status logic and display updated to treat skills with records but no triggered (graded) sessions as UNGRADED.|
|**Quickstart Onboarding** <br> `cli/selftune/quickstart.ts` (new)|Adds three-step quickstart flow (init, replay, status) with `suggestSkillsToEvolve()` ranking ungraded/critical/warning skills.|
|**Eval set behavior** <br> `cli/selftune/eval/hooks-to-evals.ts`|`buildEvalSet` now filters positives/negatives to only include records where `triggered === true`.|
|**Templates & Docs** <br> `templates/*-settings.json`, `skill/Workflows/*.md`|Replaces hardcoded path-based hook invocations with `npx selftune hook <name>`; updates Evolve workflow docs to require AskUserQuestion for user inputs.|
|**Tests & Fixtures** <br> `tests/*` (multiple)|Adds/updates tests for derived expectations, skills_invoked precedence, triggered=true/false cases, observability hook formats, UNGRADED state, and eval positives filtering; updates SKILL.md fixtures.|
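The `UNGRADED` rule from the Status / Badge row can be sketched as below. This is a minimal illustration, not the code in `cli/selftune/status.ts`: the record shape, the `HEALTHY` status, the 0.8 and 0.5 thresholds, and the function name `classifySkill` are all assumptions made for the example.

```typescript
// Only UNGRADED, CRITICAL, and WARNING are status names taken from the PR text.
// HEALTHY, the record shape, and both thresholds are assumed for illustration.
type SkillStatus = "HEALTHY" | "WARNING" | "CRITICAL" | "UNGRADED";

interface SkillUsage {
  triggered: boolean; // true when the skill was actually invoked in the session
  passed: boolean; // grading outcome; only meaningful when triggered is true
}

function classifySkill(records: SkillUsage[]): SkillStatus {
  const graded = records.filter((r) => r.triggered);
  // No triggered sessions means there is no pass rate to judge,
  // so the skill is reported as UNGRADED rather than CRITICAL.
  if (graded.length === 0) return "UNGRADED";

  const passRate = graded.filter((r) => r.passed).length / graded.length;
  if (passRate >= 0.8) return "HEALTHY"; // assumed threshold
  if (passRate >= 0.5) return "WARNING"; // assumed threshold
  return "CRITICAL";
}
```

Filtering on `triggered` first mirrors the `buildEvalSet` change in the table: passive SKILL.md reads never count toward a pass rate.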
## Sequence Diagram(s)
```mermaid
sequenceDiagram
participant User
participant CLI as auto-grade<br/>CLI
participant Telemetry as Telemetry<br/>Log
participant SkillMD as SKILL.md
participant PreGates as Pre-Gates
participant Agent as LLM<br/>Agent
participant Output as JSON<br/>Output
User->>CLI: auto-grade --skill X [--session-id Y]
CLI->>Telemetry: Load telemetry log
Telemetry-->>CLI: Sessions list
CLI->>CLI: Resolve session (by ID or latest using skills_invoked)
CLI->>SkillMD: Derive expectations (if needed)
SkillMD-->>CLI: Expectations (derived or none)
CLI->>PreGates: Run pre-gates on expectations
alt All resolved
PreGates-->>CLI: All expectations resolved
else Some unresolved
PreGates-->>CLI: Unresolved list
CLI->>Agent: gradeViaAgent(unresolved)
Agent-->>CLI: Graded expectations
end
CLI->>CLI: Compile results, metrics, summary
CLI->>Output: Write GradingResult JSON
Output-->>User: Summary + file path
```
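The "Resolve session" step in the diagram above might look roughly like the sketch below. Only `skills_invoked` and `skills_triggered` are field names from the PR; `session_id`, `timestamp`, and the helper names are hypothetical, and the real selection logic in `grade-session.ts` may differ.

```typescript
// Hypothetical session shape; only skills_invoked and skills_triggered
// are names that appear in the PR description.
interface TelemetrySession {
  session_id: string;
  timestamp: string; // ISO 8601
  skills_invoked?: string[];
  skills_triggered?: string[];
}

// skills_invoked wins whenever the field is present (even if empty);
// skills_triggered is only a fallback for older records.
function skillsFor(s: TelemetrySession): string[] {
  if (s.skills_invoked !== undefined) return s.skills_invoked;
  return s.skills_triggered ?? [];
}

function resolveSession(
  sessions: TelemetrySession[],
  skill: string,
  sessionId?: string,
): TelemetrySession | undefined {
  // An explicit --session-id takes precedence over any matching heuristic.
  if (sessionId !== undefined) {
    return sessions.find((s) => s.session_id === sessionId);
  }
  // Otherwise pick the most recent session that used the skill.
  let latest: TelemetrySession | undefined;
  for (const s of sessions) {
    if (!skillsFor(s).includes(skill)) continue;
    if (latest === undefined || Date.parse(s.timestamp) > Date.parse(latest.timestamp)) {
      latest = s;
    }
  }
  return latest;
}
```

Treating a present-but-empty `skills_invoked` as authoritative is a design choice in this sketch: it keeps a session that merely browsed SKILL.md from being graded as if the skill had run.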
```mermaid
sequenceDiagram
participant User
participant CLI as quickstart<br/>CLI
participant Config as Config<br/>Check
participant Replay as Replay<br/>Ingestor
participant Telemetry as Telemetry<br/>Log
participant Status as Status<br/>Engine
participant Skills as Skills<br/>Scorer
User->>CLI: quickstart
CLI->>Config: Check config/marker
alt Config missing
CLI->>CLI: runInit()
end
CLI->>Replay: Check replay marker
alt Marker missing
Replay->>CLI: Ingest transcripts (parseTranscript -> skills_invoked)
CLI->>Telemetry: Update logs/marker
end
CLI->>Status: Compute skill statuses (uses triggered & pass rates)
Status-->>CLI: Formatted status
CLI->>Skills: suggestSkillsToEvolve()
Skills-->>CLI: Top ranked skills
CLI->>User: Display suggestions
```
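The final ranking step in the quickstart flow can be illustrated with a short sketch. The PR only says that `suggestSkillsToEvolve()` ranks ungraded/critical/warning skills; the priority order used here (the order in which the PR lists them), the input shape, and the `limit` parameter are assumptions.

```typescript
// Hypothetical input shape for the sketch.
interface SkillSummary {
  skill: string;
  status: string;
}

// Assumed priority: lower number is suggested first. Statuses that are
// not listed (for example a healthy skill) are never suggested.
const EVOLVE_PRIORITY = new Map<string, number>([
  ["UNGRADED", 0],
  ["CRITICAL", 1],
  ["WARNING", 2],
]);

function suggestSkillsToEvolve(skills: SkillSummary[], limit = 3): string[] {
  return skills
    .filter((s) => EVOLVE_PRIORITY.has(s.status))
    .sort((a, b) => (EVOLVE_PRIORITY.get(a.status) ?? 99) - (EVOLVE_PRIORITY.get(b.status) ?? 99))
    .slice(0, limit)
    .map((s) => s.skill);
}
```

Because `filter` returns a new array, the in-place `sort` does not reorder the caller's list.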
```mermaid
sequenceDiagram
participant Transcript as File
participant Parser as parseTranscript
participant Detector as Invocation<br/>Detector
participant Telemetry as Session<br/>Record
participant Ingestor as claude-replay
Parser->>Transcript: Read JSONL lines
loop each message
Parser->>Detector: Check tool_use entries
alt tool_use.toolName == "Skill"
Detector-->>Parser: Extract skill name
Parser->>Parser: Add to skills_invoked[]
else
Detector-->>Parser: ignore
end
end
Parser->>Telemetry: Emit TranscriptMetrics (skills_invoked)
Ingestor->>Telemetry: Create SkillUsageRecord (use skills_invoked if present)
```
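The parsing loop in the diagram above can be sketched as follows. The transcript shape is a deliberately simplified assumption (one JSON object per line with an optional `tool_use` array, and a `skill` key inside `input`); real Claude Code transcripts carry more structure, and `extractSkillsInvoked` is a hypothetical name rather than the PR's `parseTranscript`.

```typescript
// Simplified, assumed transcript shape for illustration only.
interface ToolUseEntry {
  toolName: string;
  input?: { skill?: string; file_path?: string };
}

interface TranscriptLine {
  tool_use?: ToolUseEntry[];
}

function extractSkillsInvoked(jsonl: string): string[] {
  const invoked: string[] = [];
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;

    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // skip malformed lines instead of failing the whole ingest
    }
    if (typeof parsed !== "object" || parsed === null) continue;

    const toolUses = (parsed as TranscriptLine).tool_use;
    if (!Array.isArray(toolUses)) continue;

    for (const entry of toolUses) {
      const skill = entry.input?.skill;
      // Only a real Skill tool call counts; reading SKILL.md via another
      // tool (for example Read) is ignored.
      if (entry.toolName === "Skill" && typeof skill === "string" && !invoked.includes(skill)) {
        invoked.push(skill);
      }
    }
  }
  return invoked;
}
```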
## Estimated code review effort
🎯 4 (Complex) | ⏱️ ~60 minutes
## Possibly related PRs
- **#23**: Overlaps evolve CLI changes in `cli/selftune/evolution/evolve.ts` (pre-flight, logging, messaging).
- **#13**: Related telemetry/ingestion changes around `skills_invoked` and session selection behavior.
- **#17**: Touches the same ingestor surface (`cli/selftune/ingestors/claude-replay.ts`) and skill-invocation/trigger semantics.
* Update selftune workflow docs and skill versioning * Improve selftune skill portability and setup docs * Clarify workflow doc edge cases * Fix OpenClaw doctor validation and workflow docs * Polish composability and setup docs
* Fix BUG-7, BUG-8, BUG-9 from demo findings
  - BUG-7: Add try/catch + array validation around eval-set file loading in evolve() so parse errors surface as user-facing messages instead of silent exit.
  - BUG-8: Add cold-start bootstrap: when extractFailurePatterns returns empty but the eval set has positive entries, treat those positives as missed queries so evolve can work on skills with zero usage history.
  - BUG-9: Add --out flag to evals CLI parseArgs as alias for --output.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix evolve CI regressions
* Isolate blog proof fixture mutations
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
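The BUG-7 and BUG-8 fixes described in this commit can be sketched together. This is an illustration under stated assumptions, not the code in `evolve.ts`: the `EvalEntry` shape, the `should_trigger` field, the plain-string failure patterns, and both function names are hypothetical.

```typescript
// Assumed eval-set entry shape.
interface EvalEntry {
  query: string;
  should_trigger: boolean;
}

type EvalSetResult =
  | { ok: true; entries: EvalEntry[] }
  | { ok: false; error: string };

// BUG-7 idea: surface parse problems as readable messages instead of exiting silently.
function parseEvalSet(raw: string): EvalSetResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `Eval set is not valid JSON: ${detail}` };
  }
  if (!Array.isArray(data)) {
    return { ok: false, error: "Eval set must be a JSON array of entries." };
  }
  return { ok: true, entries: data as EvalEntry[] };
}

// BUG-8 idea: with no failure patterns from usage history (cold start),
// fall back to the positive eval entries as the queries to improve on.
function missedQueriesFor(failurePatterns: string[], evalSet: EvalEntry[]): string[] {
  if (failurePatterns.length > 0) return failurePatterns;
  return evalSet.filter((e) => e.should_trigger).map((e) => e.query);
}
```

Returning a tagged result rather than throwing lets the CLI decide how to print the error and which exit code to use.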
* Fix dashboard export and layout * Improve telemetry normalization groundwork * Add test runner state * Separate and extract telemetry contract * Fix telemetry CI lint issues * Fix remaining CI regressions * Detect zero-trigger monitoring regressions * Stabilize dashboard report route tests * Address telemetry review feedback
* Fix telemetry follow-up edge cases * Fix rollback payload and Codex prompt attribution * Tighten Codex rollout prompt tracking
* Prepare 0.2.1 release * Update README install path * Use trusted publishing for npm
* feat: consume @selftune/telemetry-contract as workspace package
  Replace relative path imports of telemetry-contract with the published @selftune/telemetry-contract workspace package. Adds workspace config to package.json and expands tsconfig includes to cover packages/*. Closes SEL-10
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(telemetry-contract): add versioning, metadata, and golden fixtures
  Add version 1.0.0 and package metadata (description, author, license, repository) to the telemetry-contract package. Create golden fixture file with one valid example per record kind and a test suite that validates all fixtures against the contract validator.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Make selftune source-truth driven
* Harden live dashboard loading
* Audit cleanup: test split, docs, lint fixes
  - Add make test-fast / test-slow targets (5s vs 80s, 16x faster dev loop)
  - Add bun run test:fast / test:slow scripts in package.json
  - Reposition README as "Claude Code first", update competitive comparison
  - Bump PRD.md version to 0.2.1
  - Add CHANGELOG unreleased section (source-truth, telemetry-contract, test split)
  - Fix pre-existing lint: types.ts formatting, golden.test.ts import order
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: add sync flags and hook dispatch to integration guide
  - Document all selftune sync flags (--since, --dry-run, --force, etc.)
  - Add selftune hook dispatch command with all 6 hook names
  - Verified init, activation rules, and source-truth sections already current
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Harden LLM calls and fix pre-existing test failures
  Add exponential backoff retry to callViaAgent for transient subprocess failures. Cap JSONL health-check validation at 500 lines to prevent timeouts on large log files. Use exported DEFAULT_WINDOW_SESSIONS constant in dashboard data collection instead of telemetry.length.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
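The exponential backoff retry mentioned for `callViaAgent` follows a well-known pattern, sketched below. The helper names, the 250 ms base delay, the 8 s cap, and the retry count are assumptions; the sketch also retries on every error, whereas the commit speaks specifically of transient subprocess failures.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs = 250, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Runs fn, retrying up to maxRetries times with growing pauses in between.
// Assumption: every thrown error is treated as transient.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseMs = 250,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxRetries) break; // out of retries, rethrow below
      const delay = backoffDelayMs(attempt, baseMs);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

With the defaults, the waits between attempts are 250 ms, 500 ms, and 1000 ms, so a call that keeps failing gives up after about 1.75 s of waiting plus the time spent in `fn` itself.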
* feat: add SQLite materialization layer for dashboard queries
  Add a local SQLite database (via bun:sqlite) as an indexed materialized view store so the dashboard/report UX no longer depends on recomputing everything from raw JSONL logs on every request. New module at cli/selftune/localdb/ with:
  - schema.ts: 10 tables + 19 indexes mirroring canonical telemetry and local log shapes
  - db.ts: openDb() lifecycle with WAL mode, meta key-value helpers
  - materialize.ts: full rebuild and incremental materialization from JSONL source-of-truth logs
  - queries.ts: getOverviewPayload(), getSkillReportPayload(), getSkillsList() query helpers
  Raw JSONL logs remain authoritative; the DB is a disposable cache that can always be rebuilt. No new npm dependencies (bun:sqlite only).
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve Biome lint and format errors
  Auto-fix import ordering, formatting, and replace non-null assertions with optional chaining in tests.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add selftune orchestrate command for autonomous core loop
  Introduces `selftune orchestrate`, a single entry point that chains sync → status → evolve → watch into one coordinated run. Defaults to dry-run mode with explicit --auto-approve for deployments.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address PR review (lint errors, logic bug, and type completeness)
  - Replace string concatenation with template literals (Biome lint)
  - Add guard in evolve loop for agent-missing skip mutations
  - Replace non-null assertion with `as string` cast
  - Remove unused EvolutionAuditEntry import
  - Complete DoctorResult mock with required fields
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* style: apply Biome formatting and import sorting
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Bumps [oven-sh/setup-bun](https://github.com/oven-sh/setup-bun) from 2.1.2 to 2.1.3. - [Release notes](https://github.com/oven-sh/setup-bun/releases) - [Commits](oven-sh/setup-bun@3d26778...ecf28dd) --- updated-dependencies: - dependency-name: oven-sh/setup-bun dependency-version: 2.1.3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 6.2.0 to 6.3.0. - [Release notes](https://github.com/actions/setup-node/releases) - [Commits](actions/setup-node@6044e13...53b8394) --- updated-dependencies: - dependency-name: actions/setup-node dependency-version: 6.3.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.32.4 to 4.32.6. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](github/codeql-action@89a39a4...0d579ff) --- updated-dependencies: - dependency-name: github/codeql-action dependency-version: 4.32.6 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Improve sync progress and tighten query filtering * Fix biome formatting errors in sync.ts and query-filter.test.ts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add generic scheduling command and reposition OpenClaw cron as optional
  The primary automation story is now agent-agnostic. `selftune schedule` generates ready-to-use snippets for system cron, macOS launchd, and Linux systemd timers. `selftune cron` is repositioned as an optional OpenClaw integration rather than the main automation path.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit review (centralize schedule data, fix generators and formatting)
  Derive SCHEDULE_ENTRIES from DEFAULT_CRON_JOBS (single source of truth), generate launchd/systemd configs for all 4 entries instead of sync-only, fix biome formatting, and add markdown language tag.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: use StartCalendarInterval for fixed-time launchd and shell wrappers for chained commands
  - launchd: use StartCalendarInterval (Hour/Minute/Weekday) for fixed-time schedules instead of approximating with StartInterval
  - launchd/systemd: use /bin/sh -c wrapper for commands with && chains so prerequisite steps (like sync) are not silently dropped
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
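The two launchd fixes in the last commit above can be illustrated with a small sketch. `StartCalendarInterval` and its `Minute`, `Hour`, and `Weekday` keys are real launchd plist keys; the function names, the restriction to simple fixed-time cron expressions, and the whitespace-based argument splitting are assumptions of this example, not the PR's generator.

```typescript
interface CalendarInterval {
  Minute?: number;
  Hour?: number;
  Weekday?: number; // 0 and 7 both mean Sunday in launchd, as in cron
}

// Converts a 5-field cron expression with a fixed minute and hour into
// StartCalendarInterval keys. Returns null for anything this sketch cannot
// represent (steps, ranges, lists, day-of-month or month constraints).
function toCalendarInterval(cron: string): CalendarInterval | null {
  const fields = cron.trim().split(/\s+/);
  if (fields.length !== 5) return null;
  const [minute = "", hour = "", dayOfMonth = "", month = "", weekday = ""] = fields;

  const isNumber = (field: string): boolean => /^\d+$/.test(field);
  if (!isNumber(minute) || !isNumber(hour)) return null;
  if (dayOfMonth !== "*" || month !== "*") return null;

  const interval: CalendarInterval = { Minute: Number(minute), Hour: Number(hour) };
  if (weekday !== "*") {
    if (!isNumber(weekday)) return null;
    interval.Weekday = Number(weekday);
  }
  return interval;
}

// Chained commands need a shell; otherwise everything after the first
// program name would be passed as literal arguments and "&&" would not chain.
function programArguments(command: string): string[] {
  if (command.includes("&&")) return ["/bin/sh", "-c", command];
  return command.trim().split(/\s+/);
}
```

A caller would fall back to `StartInterval` (or refuse to generate a plist) when `toCalendarInterval` returns null.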
* feat: add local dashboard SPA with React + Vite Introduces a minimal React SPA at apps/local-dashboard/ with two routes: overview (KPIs, skill health grid, evolution feed) and per-skill drilldown (pass rate, invocation breakdown, evaluation records). Consumes existing dashboard-server API endpoints with SSE live updates, explicit loading/ error/empty states, and design tokens matching the current dashboard. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review feedback for local dashboard Extract shared utils (deriveStatus, formatRate, timeAgo), add SSE exponential backoff with max retries, filter ungraded skills from avg pass rate, fix stuck loading state for undefined skillName, use word-boundary regex for evolution filtering, add focus-visible styles, add typecheck script, and add Vite env types. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address second round CodeRabbit review feedback Cancel pending SSE reconnect timers on cleanup, add stale-request guard to useSkillReport, remove redundant decodeURIComponent (React Router already decodes), quote font names in CSS for stylelint, and format deriveStatus signature for Biome. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: align local dashboard SPA with SQLite v2 data architecture Migrate SPA from old JSONL-reading /api/data endpoints to new SQLite-backed /api/v2/* endpoints. Add v2 server routes for overview and per-skill reports. Replace SSE with 15s polling. Rewrite types to match materialized query shapes from queries.ts. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review feedback on local dashboard SPA - Add language identifier to HANDOFF.md fenced code block (MD040) - Prevent overlapping polls in useOverview with in-flight guard and sequential setTimeout - Broaden empty-state check in useSkillReport to include evolution/proposals - Fix Sessions KPI to use counts.sessions instead of counts.telemetry - Wrap materializeIncremental in try/catch to preserve last good snapshot on failure Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: sort imports to satisfy Biome organizeImports lint Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit nitpicks — cross-platform dev script, stricter types, CSS compat - Use concurrently for cross-platform dev script instead of shell backgrounding - Tighten Sidebar counts prop to Partial<Record<SkillHealthStatus, number>> - Replace color-mix() with rgba fallback for broader browser support Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add UNKNOWN status filter and extract header height CSS variable - Add UNKNOWN to STATUS_OPTIONS so all SkillHealthStatus values are filterable - Extract hardcoded 56px header height to --header-h CSS variable Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: hoist sidebar collapse state to layout and add UNKNOWN filter style - Lift collapsed state from Sidebar to Overview so grid columns resize properly - Add .sidebar-collapsed grid rules at all breakpoints - Fix mobile: collapsed sidebar no longer creates dead-end (shows inline) - Add .filter-pill.active.filter-unknown CSS rule for UNKNOWN status Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: serve SPA as default dashboard, legacy at /legacy/ - Dashboard server now serves built SPA from apps/local-dashboard/dist/ at / - Legacy dashboard moved to /legacy/ route - SPA fallback for client-side routes (e.g. 
/skills/:name) - Static asset serving with content-hashed caching for /assets/* - Path traversal protection on static file serving - Add build:dashboard script to root package.json - Include apps/local-dashboard/dist/ in published files - Falls back to legacy dashboard if SPA build not found Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add shadcn theming with dark/light toggle and selftune branding Migrate dashboard to shadcn theme system with proper light/dark support. Dark mode uses selftune site colors (navy/cream/copper), light mode uses standard shadcn defaults. Add ThemeProvider with localStorage persistence, sun/moon toggle in site header, and SVG logo with currentColor for both themes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: path traversal check and 404 for missing skills Use path.relative() + isAbsolute() instead of startsWith() for the SPA static asset path check to prevent directory traversal bypass. Return 404 from /api/v2/skills/:name when the skill has no usage data. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: biome formatting — semicolons, import order, line length Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review — dedupe polling, fix stale closures, harden theme/config - Lift useOverview to DashboardShell, pass as prop to Overview (no double polling) - Fix stale closure in drag handler by deriving indices from prev state - Validate localStorage theme values, use undefined context default - Add relative positioning to theme toggle button for MoonIcon overlay - Fix falsy check hiding zero values in chart tooltip - Fix invalid Tailwind selectors in dropdown-menu and toggle-group - Use ESM-safe fileURLToPath instead of __dirname in vite.config - Switch manualChunks to function form for Base UI subpath matching - Align pass-rate threshold with deriveStatus in SkillReport - Use local theme provider in sonner instead of next-themes - Add missing React import in skeleton, remove unused Separator import - Include vite.config.ts in tsconfig for typecheck coverage - Fix inconsistent JSX formatting in select scroll buttons Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review round 2 — shared sorting, DnD fixes, Tailwind v4 migration - Extract sortByPassRateAndChecks to utils.ts, dedupe sorting in App + Overview - Derive DnD dataIds from row model (not raw data), guard against -1 indexOf - Hide pagination when table is empty instead of showing "Page 1 of 0" - Fix ActivityTimeline default tab to prefer non-empty dataset - Import ReactNode directly instead of undeclared React namespace - Quote CSS attribute selector in chart style injection - Use stable composite keys for tooltip and legend items - Remove unnecessary "use client" directive from dropdown-menu (Vite SPA) - Migrate outline-none to outline-hidden for Tailwind v4 accessibility - Fix toggle-group orientation selectors to match data-orientation attribute - Add missing CSSProperties 
import in sonner.tsx - Add dark mode variant for SkillReport row highlight - Format vite.config.ts with Biome Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add evidence viewer, evolution timeline, and enhanced skill report Add EvidenceViewer, EvolutionTimeline, and InfoTip components. Enhance SkillReport with richer data display, expand dashboard server API endpoints, and update documentation and architecture docs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review round 3 — DnD/sort conflict, theme listener, formatting - Disable DnD reorder when table sorting is active (skill-health-grid) - Listen for OS theme preference changes when system theme is active - Apply Biome formatting to sortByPassRateAndChecks - Remove unused useEffect import from Overview - Deduplicate confidence filter in SkillReport - Materialize session IDs once in dashboard-server to avoid repeated subqueries Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: show selftune version in sidebar footer Pass version from API response through to AppSidebar and display it dynamically instead of hardcoded "dashboard v0.1". Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: biome formatting in dashboard-server — line length wrapping Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review round 4 — dedupe formatRate, STATUS_CONFIG, cleanup - Remove duplicate formatRate from app-sidebar, import from @/utils - Extract STATUS_CONFIG to shared @/constants module, import in both skill-health-grid and SkillReport - Remove misleading '' fallback from sessionPlaceholders since the ternary guards already skip queries when empty Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove redundant items prop from Select to avoid duplication The SelectItem children already define the options; the items prop was duplicating them unnecessarily. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add sortableKeyboardCoordinates to KeyboardSensor for proper keyboard DnD Without this, keyboard navigation moves by pixels instead of jumping between sortable items. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: Linear-style dashboard UX — collapsible sidebar, direct skill links, scope grouping - Simplify sidebar: remove status filters, keep logo + search + skills list - Add collapsible scope groups (Project/Global) using base-ui Collapsible - Surface skill_scope from DB query through API to dashboard types - Replace skill drawer with direct Link navigation to skill report - Add Scope column to skills table with filter dropdown - Slim down site header: remove breadcrumbs, reduce to sidebar trigger + theme toggle - Add side-by-side grid layout: skills table left, activity panel right - Gitignore pnpm-lock.yaml alongside bun.lock Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review — accessibility, semantics, state reset - Remove bun.lock from .gitignore to maintain build reproducibility - Preserve unexpected scope values in sidebar (don't drop unrecognized scopes) - Add aria-label to skill search input for screen reader accessibility - Switch status filter from checkbox to radio-group semantics (mutually exclusive) - Reset selectedProposal when navigating between skills via useEffect on name Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add TanStack Query and optimize SQL queries for dashboard performance Migrate data fetching from manual polling/dedup hooks to TanStack Query for instant cached navigation, background refetch, and request dedup. Optimize SQL: replace NOT IN subqueries with LEFT JOIN, move JS dedup to GROUP BY, add LIMIT 200 to unbounded evidence queries. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: track root bun.lock for reproducible installs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit review — collapsible sync, drag handle dedup, a11y, not-found heuristic - Make sidebar Collapsible controlled so it auto-opens when active skill changes (Comment #1) - Consolidate useSortable to single call per row via React context, use setActivatorNodeRef on drag handle button (Comment #2) - Remove capitalize CSS transform on free-form scope values (Comment #3) - Broaden isNotFound heuristic to check invocations, prompts, sessions in addition to evals/evolution/proposals (Comment #4) - Move Tooltip outside TabsTrigger to avoid nested interactive elements, use Base UI render prop for composition (Comment #5) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit nitpicks — version pinning, changelog clarity, shared query helper - Use caret range for recharts version (^2.15.4) for consistency - Clarify changelog: SSE was removed, polling via refetchInterval is primary - Extract getPendingProposals() shared helper in queries.ts, used by both getOverviewPayload() and dashboard-server skill report endpoint Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit round 3 — deps, async fs, type safety, deterministic query - Move @tailwindcss/vite, tailwindcss, shadcn to devDependencies - Fix trailing space in version display when version is empty - Type caught error as unknown in refreshV2Data - Replace sync fs (readFileSync/statSync) with Bun.file() for hot-path asset serving - Return 404 for missing /assets/* files instead of falling through to SPA - Add details and eval_set fields to SkillReportPayload.evidence type - Fix nondeterministic GROUP BY with ROW_NUMBER() CTE in getPendingProposals Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve Biome lint and format errors in CI - Replace non-null assertion 
with type cast in useSkillReport (noNonNullAssertion) - Break long import line in dashboard-server.ts to satisfy Biome formatter Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit round 4 — CTE subqueries, type alignment, scope index - Replace dynamic bind-parameter expansion with CTE subquery for session lookups - Add skill_name to OverviewPayload.pending_proposals type to match runtime shape - Add composite index on skill_usage(skill_name, skill_scope, timestamp) for scope lookups Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit round 5 — startup guard, 404 heuristic, deterministic tiebreaker - Guard initial v2 materialization with try/catch to avoid full server crash - Include evidence in not-found check so evidence-only skills aren't 404'd - Add ea.id DESC tiebreaker to ROW_NUMBER() for deterministic pending proposals Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address CodeRabbit round 6 — db guard, refresh throttle, deferred 404 - Guard openDb() in try/catch so DB bootstrap failure doesn't crash server - Make db nullable, return 503 from /api/v2/* when store is unavailable - Throttle failed refresh attempts with separate lastV2RefreshAttemptAt timestamp - Move skill 404 check after enrichment queries (evolution, proposals, invocations) - Use optional chaining for db.close() on shutdown Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* Promote product planning docs * Add execution plans for product gaps and evals * Prepare SPA dashboard release path * Remove legacy dashboard runtime * Refresh execution plans after dashboard cutover * Build dashboard SPA in CI and publish * Refresh README for SPA release path * Address dashboard release review comments * Fix biome lint errors in dashboard tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Make autonomous loop the default scheduler path * Document orchestrate as the autonomous loop * Document autonomy-first setup path * Harden autonomous scheduler install paths * Clarify sync force usage in README --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Orchestrate output now explains each decision clearly so users can trust the autonomous loop. Adds formatOrchestrateReport() with 5-phase human report (sync, status, decisions, evolution, watch) and enriched JSON with per-skill decisions array. Supersedes PR #45. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
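To make the five-phase report concrete, here is a rough sketch of what such a formatter could look like. The phase names and the kinds of per-skill information (action, reason, deployment status, validation scores, alerts, rollback) come from the PR description; every property name, the `action` values, the output layout, and the function name `formatPhasedReport` are hypothetical and do not reflect the actual `formatOrchestrateReport()` signature.

```typescript
// Hypothetical shapes for the sketch.
interface SkillDecision {
  skill: string;
  action: "evolve" | "skip" | "watch"; // assumed action names
  reason: string;
  deployed?: boolean;
  validation?: { before: number; after: number }; // pass rates in [0, 1]
  alert?: string;
  rolledBack?: boolean;
}

interface OrchestrateResult {
  syncedSources: string[];
  statusCounts: Record<string, number>;
  decisions: SkillDecision[];
}

function pct(rate: number): string {
  return `${Math.round(rate * 100)}%`;
}

function formatPhasedReport(result: OrchestrateResult): string {
  const lines: string[] = [];

  lines.push("Phase 1: Sync");
  lines.push(`  sources: ${result.syncedSources.join(", ") || "none"}`);

  lines.push("Phase 2: Status");
  for (const [label, count] of Object.entries(result.statusCounts)) {
    lines.push(`  ${label}: ${count}`);
  }

  lines.push("Phase 3: Decisions");
  for (const d of result.decisions) {
    lines.push(`  ${d.skill}: ${d.action} (${d.reason})`);
  }

  lines.push("Phase 4: Evolution");
  for (const d of result.decisions) {
    if (d.validation === undefined) continue;
    const tag = d.deployed ? "deployed" : "not deployed";
    lines.push(`  ${d.skill}: ${pct(d.validation.before)} -> ${pct(d.validation.after)} [${tag}]`);
  }

  lines.push("Phase 5: Watch");
  for (const d of result.decisions) {
    if (d.alert === undefined && !d.rolledBack) continue;
    lines.push(`  ${d.skill}: ${d.alert ?? "no alert"}${d.rolledBack ? " [ROLLED BACK]" : ""}`);
  }

  return lines.join("\n");
}
```

Since the PR sends the human report to stderr and keeps JSON for machine consumers, a caller would likely use `console.error(formatPhasedReport(result))` and leave stdout for the JSON payload.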
Important: Review skipped. Too many files! This PR contains 223 files, which is 73 over the limit of 150.
Configuration used: Path: .coderabbit.yaml; Review profile: ASSERTIVE; Plan: Pro
## Summary
- `formatOrchestrateReport()`: a 5-phase human-readable decision report showing sync sources, status breakdown, per-skill decisions with reasons, evolution results with validation pass-rate changes, and watch outcomes including alerts/rollbacks
- `decisions` array containing per-skill action, reason, deployment status, validation scores, watch alerts, and rollback status

Supersedes #45: absorbs its `formatOrchestrateReport` concept but fixes two issues. PR #45 introduced a `summary.autoApprove` field that doesn't exist (the real field is `approvalMode`), and it referenced the deprecated `--auto-approve` flag in mode banners.

## What users can now see

## Test plan
- `formatOrchestrateReport` tested for all 3 modes, all 5 phases, empty states, rollback tags, rerun hints
- `selftune orchestrate --dry-run` produces clear phased report on stderr
- `decisions` array with per-skill reasoning

🤖 Generated with Claude Code