feat: QA checklist skill with full-stack UI#273

Open
backnotprop wants to merge 16 commits into main from feat/skills-structure

@backnotprop
Owner

Summary

  • New QA checklist feature — AI agents generate structured checklists for manual developer verification of code changes, with an interactive UI for reviewing items, adding notes/screenshots, and submitting results back to the agent
  • PR/MR integration — Optional linking to GitHub, GitLab, or Azure DevOps PRs with provider icons, automation toggles (post results to PR, auto-approve if all pass), and provider-specific CLI commands in output
  • Shared server infra — Extracted startServer() into packages/server/serve.ts and moved shared theme CSS into packages/ui/styles/theme.css to deduplicate across plan review, code review, and checklist servers

What's included

| Layer | Files | Description |
| --- | --- | --- |
| Skill | .agents/skills/checklist/SKILL.md | Agent instructions for generating checklists with PR detection |
| Types | packages/shared/checklist-types.ts | Shared Checklist, ChecklistPR, ChecklistSubmission types |
| Server | packages/server/checklist.ts, serve.ts | Validation, session management, feedback formatting |
| Editor | packages/checklist-editor/ | Full React UI: expandable items, progress bar, annotation panel, category grouping |
| Harnesses | apps/hook/, apps/opencode-plugin/, apps/pi-extension/ | Integration for Claude Code, OpenCode, and Pi |
| Dev | apps/checklist/ | Standalone Vite dev server for checklist editor |

Test plan

  • bun run build passes
  • Run plannotator checklist --file /tmp/checklist.json with sample data — UI opens, items expand/collapse, notes work, submit returns formatted output
  • Verify PR badge renders correctly for GitHub, GitLab, and Azure DevOps providers
  • Verify automations checkboxes appear in annotation panel when PR is linked
  • Test dark and light mode parity
  • Test keyboard navigation (j/k, p/f/s, n, Enter)

🤖 Generated with Claude Code

backnotprop and others added 5 commits March 10, 2026 23:02
Extract startServer() into serve.ts to deduplicate port detection,
retry logic, and remote session handling across plan/review/annotate
servers. Move theme CSS variables from review-editor into shared
packages/ui/styles/theme.css. Add shared timeFormat utility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
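
The retry logic mentioned in this commit can be pictured with a minimal sketch. This is not the code in serve.ts: the `listen` callback, the `startWithRetry` name, and the choice to walk sequential ports are all assumptions made for illustration. The real helper may retry the same port or pick ports differently.

```typescript
// Hypothetical shape: `listen` binds a port and throws if the port is taken.
type Listen<T> = (port: number) => T;

interface StartResult<T> {
  server: T;
  port: number;
}

// Try firstPort, firstPort + 1, ... until one binds or attempts run out.
function startWithRetry<T>(
  listen: Listen<T>,
  firstPort: number,
  maxAttempts = 5,
): StartResult<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const port = firstPort + attempt;
    try {
      return { server: listen(port), port };
    } catch (err) {
      lastError = err; // bind failed; fall through to the next port
    }
  }
  throw new Error(
    `No free port in ${firstPort}..${firstPort + maxAttempts - 1}: ${String(lastError)}`,
  );
}
```

Passing the bind step in as a callback keeps the retry loop independent of any one server, which is roughly the kind of sharing across plan, review, and annotate servers that the commit describes.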
Add a new checklist feature that lets AI agents generate QA checklists
for manual developer verification of code changes. Includes:

- Checklist editor UI with expandable items, inline notes, progress bar,
  category grouping with compact collapse, and annotation panel
- PR/MR integration supporting GitHub, GitLab, and Azure DevOps with
  provider-specific icons, automation toggles, and CLI command output
- Server-side validation, formatting, and session management
- Harness integrations for Claude Code, OpenCode, and Pi
- Skill instructions (.agents/skills/checklist/SKILL.md) with PR
  detection and --file flag for large JSON payloads

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The @tailwindcss/vite plugin resolves tailwindcss from the CSS file's
directory, not the app's. CI failed because tailwindcss wasn't listed
in checklist-editor's package.json (only hoisted locally).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
These are build artifacts generated by copying from apps/checklist/dist,
same pattern as plannotator.html and review-editor.html which were
already gitignored.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move the capabilities/features summary to the top of the Codex and
Claude Code plugin READMEs so all features are visible immediately.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Owner Author

@backnotprop left a comment

I found several correctness gaps in the checklist feature:

  • P1: Refuse PR auto-approval while checklist items are still pending (packages/server/checklist.ts:296-301)
    If the user submits a partial checklist with approveIfAllPass enabled, this branch still emits PR-approval instructions because it only checks failed === 0 && skipped === 0. Pending items are excluded from both counts, so an incomplete QA run can incorrectly approve the PR.

  • P2: Surface a URL for remote checklist sessions (apps/hook/server/index.ts:344-346)
    In PLANNOTATOR_REMOTE/SSH sessions, handleChecklistServerReady() is effectively a no-op, and unlike plan/review/annotate mode there is no writeRemoteShareLink() fallback here. The new plannotator checklist command then waits on waitForDecision() without opening a browser or telling the user where the UI is, so remote/devcontainer users cannot reach the checklist page from this flow.

  • P2: Restore saved results when reopening checklist files (apps/hook/server/index.ts:321-323)
    Saved checklist files contain { checklist, results, globalNotes, ... }, but this branch immediately unwraps them to .checklist and drops the recorded answers. Because the command output explicitly tells users to reopen a saved run with plannotator checklist --file ..., reopening currently starts from a blank checklist and loses the previous verification work.

  • P2: Include the OpenCode agent switch in checklist submissions (packages/checklist-editor/App.tsx:338-345)
    The checklist header reuses the review Settings dialog, so OpenCode users can pick a target agent here, but the checklist POST body never sends that choice. As a result, checklist feedback always goes back to the current agent even when the UI says it should switch to build, review, or another configured agent.

  • P2: Keep the Pi checklist server compatible with the shared UI API (apps/pi-extension/server.ts:766-770)
    The shared checklist UI always uses /api/upload, /api/image, and /api/draft, and it posts array-valued globalNotes plus optional automations. This Node fallback only accepts a much narrower /api/feedback payload and does not implement those extra routes, so in Pi mode screenshot uploads and draft recovery silently fail, global comments are dropped, and the PR automation checkboxes have no effect.

  • P2: Escape checklist description HTML before rendering it (packages/checklist-editor/components/ChecklistItem.tsx:188-194)
    If a checklist is loaded from an untrusted JSON/file (for example via plannotator checklist --file), raw HTML in description is passed straight into dangerouslySetInnerHTML after only regex replacements for code and bold text. That allows injected elements with event handlers such as <img onerror=...> to execute script inside the Plannotator page.
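
One way to address the last finding is to parse the description into typed segments and let React render them as elements, so raw HTML is never interpreted. The sketch below assumes the description uses backticks for code and double asterisks for bold; the comment only says "code and bold text", so that syntax, the `Segment` type, and the `parseDescription` name are hypothetical.

```typescript
type Segment =
  | { kind: "text"; value: string }
  | { kind: "code"; value: string }
  | { kind: "bold"; value: string };

// Split a description into plain text, `code`, and **bold** segments.
// Anything that is not code or bold stays as literal text, including HTML.
function parseDescription(input: string): Segment[] {
  const pattern = /`([^`]+)`|\*\*([^*]+)\*\*/g;
  const segments: Segment[] = [];
  let cursor = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(input)) !== null) {
    if (match.index > cursor) {
      segments.push({ kind: "text", value: input.slice(cursor, match.index) });
    }
    if (match[1] !== undefined) {
      segments.push({ kind: "code", value: match[1] });
    } else {
      segments.push({ kind: "bold", value: match[2] ?? "" });
    }
    cursor = match.index + match[0].length;
  }
  if (cursor < input.length) {
    segments.push({ kind: "text", value: input.slice(cursor) });
  }
  return segments;
}
```

A component can then map `code` to `<code>`, `bold` to `<strong>`, and `text` to a plain string child. React escapes string children, so an injected `<img onerror=...>` is displayed as text rather than executed.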

backnotprop and others added 11 commits March 11, 2026 12:44
- Prevent auto-approval on incomplete checklists (pending items now counted)
- Fix XSS in description rendering by replacing dangerouslySetInnerHTML with React elements
- Preserve saved results when reopening checklists via --file flag
- Fix Pi extension notes type mismatch (string vs string[]) and add automations support
- Add project scoping to OpenCode checklist server
- Add remote/SSH share link for checklist sessions
- Update CLAUDE.md with checklist server API, project structure, and build commands

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
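
The auto-approval fix in this commit comes down to treating unanswered items as pending and refusing approval while any remain. The following is a minimal sketch of that rule, not the code in packages/server/checklist.ts: the type and function names are hypothetical, and it assumes results are keyed by item id with missing keys meaning "pending".

```typescript
type ItemStatus = "passed" | "failed" | "skipped" | "pending";

interface StatusCounts {
  passed: number;
  failed: number;
  skipped: number;
  pending: number;
}

// Count every checklist item; an item with no recorded result is pending.
function countStatuses(
  itemIds: string[],
  results: Record<string, ItemStatus | undefined>,
): StatusCounts {
  const counts: StatusCounts = { passed: 0, failed: 0, skipped: 0, pending: 0 };
  for (const id of itemIds) {
    const status: ItemStatus = results[id] ?? "pending";
    counts[status] += 1;
  }
  return counts;
}

// Approve only when the option is on and nothing failed, was skipped, or is pending.
function shouldAutoApprove(counts: StatusCounts, approveIfAllPass: boolean): boolean {
  return (
    approveIfAllPass &&
    counts.failed === 0 &&
    counts.skipped === 0 &&
    counts.pending === 0
  );
}
```

Counting from the full list of item ids, rather than from the submitted results alone, is what keeps a partial submission from looking like a clean run.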
…ecklist

- Add image validation, serving, upload, and draft persistence helpers
  (Node-compatible duplicates of packages/server/image.ts and draft.ts)
- Add /api/image, /api/upload, /api/draft routes to all four Pi servers
  (plan, review, annotate, checklist) — previously silently failed
- Fix 6 checklist divergences from canonical:
  - Add PR field validation to validateChecklist
  - Align formatChecklistFeedback automation output with canonical
  - Fix saveChecklistResults globalNotes type (string → string[] | string)
  - Add initialResults/initialGlobalNotes support to startChecklistServer
  - Add onReady callback to startChecklistServer
  - Add draft cleanup on checklist submission
- Remove unused formatChecklistFeedback import from OpenCode plugin

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
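
The globalNotes change above accepts both a single string and an array. A small normalizer is one way to handle that at the server boundary; `normalizeGlobalNotes` is a hypothetical helper, and trimming and dropping empty notes is an assumption rather than documented behavior.

```typescript
// Accept the legacy string form or the array form and always return an array.
function normalizeGlobalNotes(notes: string | string[] | undefined): string[] {
  if (notes === undefined) return [];
  const list = Array.isArray(notes) ? notes : [notes];
  return list.map((note) => note.trim()).filter((note) => note.length > 0);
}
```

Normalizing once on input means the formatting and saving code only ever has to deal with `string[]`.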
Adds a coverage visualization toggle to the checklist UI. When the agent
provides fileDiffs and diffMap data, users can switch between the standard
checklist view and a diagnostic coverage map showing a file tree with
colored waffle cells (red=failed, yellow=skipped, green=passed, gray=pending).

- Extract ToolstripButton from AnnotationToolstrip for reuse
- Add fileDiffs/diffMap to checklist data model and validation
- Coverage view renders inside document boundary with glassmorphic styling
- Stacked/side-by-side layout toggle within coverage view
- Compact icon-only status buttons in side-by-side mode
- Keyboard shortcut (v) to toggle between views
- Update checklist skill with fileDiffs/diffMap guidance

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Enrich fileDiffs from Record<string, number> to support FileDiffInfo
objects ({hunks, lines, status}) alongside the legacy number format.
Coverage uses hunks; PR Balance uses lines + status to render a U-shaped
bar chart (modified descending left, new ascending right) with
center-of-mass indicator and collapsible squarified treemap bins.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
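
Supporting both the legacy number format and FileDiffInfo objects usually means normalizing on read. The sketch below makes several assumptions that the commit message does not confirm: that a legacy number is a hunk count, that `status` takes the values "modified" and "new", and that missing line counts default to 0. The function names are hypothetical.

```typescript
type FileStatus = "modified" | "new"; // assumed values

interface FileDiffInfo {
  hunks: number;
  lines: number;
  status: FileStatus;
}

type FileDiffs = Record<string, number | FileDiffInfo>;

// Upgrade one entry to the object form.
function normalizeFileDiff(entry: number | FileDiffInfo): FileDiffInfo {
  if (typeof entry === "number") {
    // Assumption: the legacy number is a hunk count with no line or status data.
    return { hunks: entry, lines: 0, status: "modified" };
  }
  return entry;
}

// Upgrade a whole fileDiffs map so consumers see a single shape.
function normalizeFileDiffs(diffs: FileDiffs): Record<string, FileDiffInfo> {
  const normalized: Record<string, FileDiffInfo> = {};
  for (const [path, entry] of Object.entries(diffs)) {
    normalized[path] = normalizeFileDiff(entry);
  }
  return normalized;
}
```

With a single shape, the coverage view can read `hunks` and the PR Balance view can read `lines` and `status` without branching on the input format.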
…header

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Clicking pass/fail/skip (or pressing p/f/s) now collapses the expanded
item so the reviewer flows naturally to the next check. PR Balance card
gets a solid inner surface matching coverage view styling. Demo data
expanded to 30 files with heavy new-code weighting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
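
The collapse-on-verdict behavior described in this commit can be sketched as a pure state update. The `ViewState` shape and function names are hypothetical, and the editor's real state is likely richer; only the p/f/s keys and the collapse rule come from the commit message.

```typescript
type Verdict = "passed" | "failed" | "skipped";

interface ViewState {
  expandedId: string | null; // assumed: at most one item is expanded
  results: Record<string, Verdict>;
}

const KEY_TO_VERDICT = new Map<string, Verdict>([
  ["p", "passed"],
  ["f", "failed"],
  ["s", "skipped"],
]);

// Record the verdict and collapse the item if it was the expanded one.
function applyVerdict(state: ViewState, itemId: string, verdict: Verdict): ViewState {
  return {
    results: { ...state.results, [itemId]: verdict },
    expandedId: state.expandedId === itemId ? null : state.expandedId,
  };
}

// Keyboard path: p/f/s map to a verdict; other keys leave state untouched.
function handleKey(state: ViewState, key: string, focusedId: string): ViewState {
  const verdict = KEY_TO_VERDICT.get(key);
  return verdict === undefined ? state : applyVerdict(state, focusedId, verdict);
}
```

Routing both the click handler and the keyboard shortcut through the same update keeps the two paths from drifting apart.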
Resolved two conflicts:
- packages/server/annotate.ts: added /api/doc route from main before
  existing feedback handler
- packages/ui/components/AnnotationToolstrip.tsx: kept extracted
  ToolstripButton from this branch, dropped main's inline duplicate

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reuses PLAN_WIDTH_OPTIONS from the plan editor but stores under a
separate cookie (plannotator-checklist-width). Settings dialog shows
the Display tab in checklist mode with a width picker. Document
container applies the selected maxWidth via inline style.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
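
Cookie-based storage of the width setting can be illustrated with two small helpers. Only the cookie name comes from the commit message; the helper names, the one-year lifetime, the cookie attributes, and the sample value "wide" are assumptions, since the contents of PLAN_WIDTH_OPTIONS are not shown here.

```typescript
const WIDTH_COOKIE = "plannotator-checklist-width";

// Build a cookie string suitable for assigning to document.cookie.
function serializeCookie(
  name: string,
  value: string,
  maxAgeSeconds = 60 * 60 * 24 * 365,
): string {
  return `${name}=${encodeURIComponent(value)}; path=/; max-age=${maxAgeSeconds}; SameSite=Lax`;
}

// Look up one cookie in a "k1=v1; k2=v2" string such as document.cookie.
function readCookie(cookieString: string, name: string): string | null {
  for (const part of cookieString.split(";")) {
    const [rawKey, ...rest] = part.trim().split("=");
    if (rawKey === name) {
      return decodeURIComponent(rest.join("="));
    }
  }
  return null;
}
```

In the browser this would be used as `document.cookie = serializeCookie(WIDTH_COOKIE, "wide")` and `readCookie(document.cookie, WIDTH_COOKIE)`. Keeping the checklist width under its own cookie name lets it differ from the plan editor's width even though both share the same option list.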
View mode (checklist vs coverage) and coverage layout (stacked vs
side-by-side) now survive page reloads using cookie-based storage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>