backnotprop · Mar 11, 2026
diff --git a/.agents/skills/checklist/SKILL.md b/.agents/skills/checklist/SKILL.md
@@ -0,0 +1,171 @@
+---
+name: checklist
+description: >
+  Generate a QA checklist for manual developer verification of code changes.
+  Use when the user wants to verify completed work, review a diff for quality,
+  create acceptance criteria checks, or run through QA steps before shipping.
+  Triggers on requests like "create a checklist", "what should I test",
+  "verify my changes", "QA this", or "pre-flight check".
+disable-model-invocation: true
+---
+
+# QA Checklist
+
+You are a senior QA engineer. Your job is to analyze the current code changes and produce a **QA checklist** — a structured list of verification tasks the developer needs to manually review before the work is considered done.
+
+This is not a code review. Code reviews catch style issues and logic bugs in the diff itself. A QA checklist catches the things that only a human can verify by actually running, clicking, testing, and observing the software. You're producing the verification plan that bridges "the code looks right" to "the software actually works."
+
+## Principles
+
+**Focus on what humans must verify.** If an automated test already covers something with meaningful assertions, it doesn't need a checklist item. But "tests exist" is not enough — test coverage that only asserts existence or happy-path behavior still leaves gaps that need human eyes.
+
+**Be specific, not vague.** "Test the login flow" is useless. "Verify that login with an expired JWT returns a 401 with `{error: 'token_expired'}` body, not a 500 with a stack trace" tells the developer exactly what to check, what to expect, and what failure looks like.
+
+**Every item is a mini test case.** Each checklist entry should have enough context that a developer unfamiliar with the change could pick it up and verify it. The description explains the change and the risk. The steps walk through the exact verification procedure. The expected outcome is clear.
+
+**Fewer good items beat many shallow ones.** Aim for 5–15 items. If you're producing more than 15, you're generating busywork — prioritize the items where human verification actually matters. If you're producing fewer than 5, look harder at edge cases, integration points, and deployment concerns.
+
+## Workflow
+
+### 1. Gather Context
+
+Start by understanding what changed and why.
+
+```bash
+git diff HEAD
+```
+
+If that's empty, try the branch diff:
+
+```bash
+git diff main...HEAD
+```
+
+As you read the diff, build a mental model:
+
+- **What kind of change is this?** New feature, bug fix, refactor, dependency update, config/infra change. This determines which categories of verification matter most.
+- **Which files changed and what do they do?** UI components need visual verification. API routes need functional testing. Database migrations need data integrity checks. Config files need deployment verification.
+- **Do tests exist for this code?** Look for test files related to the changed code. Tests that meaningfully cover the changed behavior reduce the need for manual verification — but tests that only cover the happy path or assert existence still leave gaps.
+
+As you read the diff, count the number of diff hunks (`@@` markers) per file. You'll use these counts in step 3 to populate `fileDiffs` and `diffMap`.
+
+Also collect line counts and new/modified status for the PR Balance visualization:
+```bash
+# Line counts per file (added + removed)
+git diff --stat HEAD | sed '$d'
+# New files (status "new"), everything else is "modified"
+git diff --diff-filter=A --name-only HEAD
+```
+
+### 2. Decide What Needs Manual Verification
+
+Think about each change through the lens of what could go wrong that a human needs to catch. Consider categories like:
+
+- **Visual** — Does it look right? Layout, responsiveness, dark mode, animations, color contrast. Only relevant when UI files changed.
+- **Functional** — Does the feature work end-to-end? Happy path and primary error paths. Always relevant for new features and bug fixes.
+- **Edge cases** — Empty input, huge input, special characters, concurrent access, timezone issues. Focus on cases the diff suggests are likely, not every theoretical scenario.
+- **Integration** — Does this break callers or consumers? API contract changes, event format changes, shared state mutations.
+- **Security** — Auth checks on new endpoints, input sanitization, secrets exposure, CORS changes.
+- **Data** — Database migrations, schema changes, backwards compatibility, data format changes.
+- **Performance** — Only when the diff touches hot paths, adds queries, or changes data structures.
+- **Deployment** — New environment variables, feature flags, migration ordering, new dependencies.
+- **Developer experience** — Error messages, documentation, CLI help text, logging.
+
+These are suggestions, not a fixed list. Use whatever category label best describes the type of verification. If the change involves "api-contract" or "accessibility" or "offline-behavior," use that.
+
+### 3. Generate the Checklist JSON
+
+Produce a JSON object with this structure:
+
+```json
+{
+  "title": "Short title for the checklist",
+  "summary": "One paragraph explaining what changed and why manual verification matters.",
+  "pr": {
+    "number": 142,
+    "url": "https://github.com/org/repo/pull/142",
+    "title": "feat: add OAuth2 support",
+    "branch": "feat/oauth2",
+    "provider": "github"
+  },
+  "fileDiffs": {
+    "src/middleware/auth.ts": { "hunks": 5, "lines": 320, "status": "modified" },
+    "src/pages/login.tsx": { "hunks": 3, "lines": 180, "status": "modified" },
+    "src/lib/api-client.ts": { "hunks": 4, "lines": 250, "status": "new" }
+  },
+  "items": [
+    {
+      "id": "category-N",
+      "category": "free-form category label",
+      "check": "Imperative verb phrase — the headline",
+      "description": "Markdown narrative explaining what changed in the code, what could go wrong, what the expected behavior is, and how the developer knows the test passes.",
+      "steps": [
+        "Step 1: Do this specific thing",
+        "Step 2: Observe this specific result",
+        "Step 3: Confirm this specific expectation"
+      ],
+      "reason": "Why this needs human eyes — what makes it not fully automatable.",
+      "files": ["src/middleware/auth.ts", "src/pages/login.tsx"],
+      "diffMap": { "src/middleware/auth.ts": 3, "src/pages/login.tsx": 2 },
+      "critical": false
+    }
+  ]
+}
+```
+
+**Field guidance:**
+
+- **`pr`** (optional): Include when the checklist is associated with a pull/merge request. The UI displays a PR badge in the header and enables automation options (post results as a PR comment, auto-approve if all checks pass). Detect the provider from the git remote:
+  - `github.com` → `"provider": "github"`
+  - `gitlab.com` or self-hosted GitLab → `"provider": "gitlab"`
+  - `dev.azure.com` or `visualstudio.com` → `"provider": "azure-devops"`
+
+  To detect if a PR exists for the current branch:
+  ```bash
+  # GitHub
+  gh pr view --json number,url,title,headRefName 2>/dev/null
+  # GitLab
+  glab mr view --output json 2>/dev/null
+  # Azure DevOps
+  az repos pr list --source-branch "$(git branch --show-current)" --output json 2>/dev/null
+  ```
+  If the command succeeds, populate the `pr` field. If it fails (no PR exists, CLI not installed), omit it entirely. Do not error on missing CLIs — the `pr` field is optional.
+
+- **`id`**: Prefix with a short category tag and number: `func-1`, `sec-2`, `visual-1`. This makes items easy to reference in feedback.
+- **`category`**: Free-form string. Pick the label that best describes the verification type. Common ones: `visual`, `functional`, `edge-case`, `integration`, `security`, `data`, `performance`, `deployment`, `devex`.
+- **`check`**: The headline. Always starts with a verb: Verify, Confirm, Check, Test, Ensure, Open, Navigate, Run. This is what appears as the checklist item label.
+- **`description`**: The heart of the item. Write this as a markdown narrative that tells the full story:
+  - What changed in the code (reference specific files/functions)
+  - What could go wrong as a result
+  - What the expected behavior should be
+  - How the developer knows the test passes vs fails
+- **`steps`**: Required. Ordered instructions for conducting the verification. Be concrete — "Open browser devtools" not "check the network." Each step should be a single clear action.
+- **`reason`**: One sentence explaining why automation can't fully cover this. "CSS grid rendering varies across browsers" is good. "Because it changed" is not.
+- **`files`**: File paths from the diff that this item relates to. Helps the developer trace your reasoning. Optional when `diffMap` is provided (derivable from its keys).
+- **`diffMap`**: Object mapping file paths to the number of diff hunks in that file that this check exercises. Paths must be keys in `fileDiffs`. Multiple items can cover the same hunks — that's expected (many-to-many). Example: `{ "src/middleware/auth.ts": 3, "src/pages/login.tsx": 2 }`.
+- **`fileDiffs`** (on the top-level checklist, not per-item): Object mapping each changed file's relative path to its diff metadata. Each value is `{ "hunks": N, "lines": N, "status": "new" | "modified" }`. `hunks` = count of `@@` markers in the diff. `lines` = total lines changed (from `git diff --stat`). `status` = `"new"` for added files, `"modified"` for everything else. This enables coverage visualization (hunks) and PR Balance (lines + status). Legacy format (plain number = hunk count) is still accepted but won't enable PR Balance.
+- **`critical`**: Reserve for items where failure means data loss, security vulnerability, or broken deployment. Typically 0–3 items per checklist.
+
+### 4. Launch the Checklist UI
+
+Write your JSON to a temporary file and pass it via `--file`:
+
+```bash
+cat > /tmp/checklist.json << 'CHECKLIST_EOF'
+<your-json-here>
+CHECKLIST_EOF
+plannotator checklist --file /tmp/checklist.json
+```
+
+This avoids shell quoting issues with large or complex JSON. The UI opens for the developer to work through each item — marking them as passed, failed, or skipped with notes and screenshot evidence. Wait for the output — it contains the developer's results.
+
+### 5. Respond to Results
+
+When the checklist results come back:
+
+- **All passed**: The verification is complete. Acknowledge it and move on.
+- **Items failed**: Read the developer's notes carefully. Fix the issue if you can. If the current behavior is actually correct, explain why.
+- **Items skipped**: Note the reason. If items were skipped as "not applicable," your checklist may have been too broad for this change — take that as feedback.
+- **Questions attached**: Answer them directly, with references to the relevant code.
+
+$ARGUMENTS
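Reviewer note on the SKILL.md above: step 1 asks for per-file hunk counts, and the field guidance requires every `diffMap` path to be a key in `fileDiffs`. The TypeScript below is a minimal sketch of both checks, assuming the input is standard unified diff text such as the output of `git diff HEAD`. The names `summarizeDiff` and `findDiffMapProblems` are hypothetical and not part of plannotator. The path regex is a simplification that ignores quoted and renamed paths. Flagging a `diffMap` count larger than the file's hunk total is an inference from the field description, not a rule the skill states.

```typescript
type FileStatus = "new" | "modified";

interface FileDiff {
  hunks: number;
  lines: number;
  status: FileStatus;
}

// Hypothetical item shape: only the fields this sketch needs.
interface ChecklistItem {
  id: string;
  diffMap?: Record<string, number>;
}

// Walk unified diff text and tally hunks and changed lines per file.
function summarizeDiff(diffText: string): Record<string, FileDiff> {
  const fileDiffs: Record<string, FileDiff> = {};
  let current: FileDiff | null = null;
  let inHunk = false;

  for (const line of diffText.split("\n")) {
    const header = /^diff --git a\/(.+) b\/(.+)$/.exec(line);
    if (header) {
      // A new file section starts; key it by the "b/" path.
      current = { hunks: 0, lines: 0, status: "modified" };
      fileDiffs[String(header[2])] = current;
      inHunk = false;
    } else if (current === null) {
      continue; // text before the first file header
    } else if (!inHunk && line.startsWith("new file mode")) {
      current.status = "new";
    } else if (line.startsWith("@@")) {
      current.hunks += 1;
      inHunk = true;
    } else if (inHunk && (line.startsWith("+") || line.startsWith("-"))) {
      // Only count +/- lines inside a hunk, so "---"/"+++" headers are skipped.
      current.lines += 1;
    }
  }
  return fileDiffs;
}

// Report diffMap entries that point at unknown files or claim too many hunks.
function findDiffMapProblems(
  fileDiffs: Record<string, FileDiff>,
  items: ChecklistItem[],
): string[] {
  const problems: string[] = [];
  for (const item of items) {
    for (const [path, count] of Object.entries(item.diffMap ?? {})) {
      const file = fileDiffs[path];
      if (file === undefined) {
        problems.push(`${item.id}: ${path} is not a key in fileDiffs`);
      } else if (count > file.hunks) {
        problems.push(`${item.id}: ${path} claims ${count} hunks, file has ${file.hunks}`);
      }
    }
  }
  return problems;
}

// Made-up sample diff, for illustration only.
const sample = [
  "diff --git a/src/a.ts b/src/a.ts",
  "index 111..222 100644",
  "--- a/src/a.ts",
  "+++ b/src/a.ts",
  "@@ -1,2 +1,2 @@",
  "-old",
  "+new",
  " ctx",
  "@@ -10,1 +10,2 @@",
  "+added",
  "diff --git a/src/b.ts b/src/b.ts",
  "new file mode 100644",
  "--- /dev/null",
  "+++ b/src/b.ts",
  "@@ -0,0 +1,1 @@",
  "+hello",
].join("\n");

const fileDiffs = summarizeDiff(sample);
console.log(JSON.stringify(fileDiffs, null, 2));
```

Running a check like this before `plannotator checklist --file` would catch a `diffMap` that references a path missing from `fileDiffs`, which the skill says is not allowed.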
diff --git a/.gitignore b/.gitignore
@@ -19,13 +19,15 @@ dist-ssr
 # VS Code extension package
 *.vsix
 
-# OpenCode plugin build artifacts (generated from hook/review dist)
+# OpenCode plugin build artifacts (generated from hook/review/checklist dist)
 apps/opencode-plugin/plannotator.html
 apps/opencode-plugin/review-editor.html
+apps/opencode-plugin/checklist.html
 
-# Pi extension build artifacts (generated from hook/review dist)
+# Pi extension build artifacts (generated from hook/review/checklist dist)
 apps/pi-extension/plannotator.html
 apps/pi-extension/review-editor.html
+apps/pi-extension/checklist.html
 
 # Editor directories and files
 .vscode/*

diff --git a/CLAUDE.md b/CLAUDE.md
@@ -44,6 +44,8 @@ plannotator/
 │   │   ├── draft.ts              # Annotation draft persistence (~/.plannotator/drafts/)
 │   │   ├── integrations.ts       # Obsidian, Bear integrations
 │   │   ├── ide.ts                # VS Code diff integration (openEditorDiff)
+│   │   ├── checklist.ts          # startChecklistServer(), formatChecklistFeedback()
+│   │   ├── serve.ts              # Shared Bun server startup (startServer)
 │   │   ├── editor-annotations.ts  # VS Code editor annotation endpoints
 │   │   └── project.ts            # Project name detection for tags
 │   ├── ui/                       # Shared React components
@@ -53,13 +55,19 @@ plannotator/
 │   │   ├── utils/                # parser.ts, sharing.ts, storage.ts, planSave.ts, agentSwitch.ts, planDiffEngine.ts
 │   │   ├── hooks/                # useSharing.ts, usePlanDiff.ts, useSidebar.ts, useLinkedDoc.ts, useAnnotationDraft.ts, useCodeAnnotationDraft.ts
 │   │   └── types.ts
-│   ├── shared/                   # Cross-package types (EditorAnnotation)
+│   ├── shared/                   # Cross-package types (EditorAnnotation, checklist-types)
 │   ├── editor/                   # Plan review App.tsx
-│   └── review-editor/            # Code review UI
-│       ├── App.tsx               # Main review app
-│       ├── components/           # DiffViewer, FileTree, ReviewPanel
-│       ├── demoData.ts           # Demo diff for standalone mode
-│       └── index.css             # Review-specific styles
+│   ├── review-editor/            # Code review UI
+│   │   ├── App.tsx               # Main review app
+│   │   ├── components/           # DiffViewer, FileTree, ReviewPanel
+│   │   ├── demoData.ts           # Demo diff for standalone mode
+│   │   └── index.css             # Review-specific styles
+│   └── checklist-editor/         # QA checklist UI
+│       ├── App.tsx               # Main checklist app
+│       ├── components/           # ChecklistItem, ChecklistGroup, ChecklistHeader, etc.
+│       ├── hooks/                # useChecklistState, useChecklistProgress, useChecklistDraft
+│       └── index.css             # Checklist-specific styles
+├── .agents/skills/checklist/      # QA checklist skill (SKILL.md)
 ├── .claude-plugin/marketplace.json  # For marketplace install
 └── legacy/                       # Old pre-monorepo code (reference only)
 ```
@@ -195,6 +203,16 @@ Send Annotations → feedback sent to agent session
 | `/api/upload`         | POST   | Upload image, returns `{ path, originalName }` |
 | `/api/draft`          | GET/POST/DELETE | Auto-save annotation drafts to survive server crashes |
 
+### Checklist Server (`packages/server/checklist.ts`)
+
+| Endpoint              | Method | Purpose                                    |
+| --------------------- | ------ | ------------------------------------------ |
+| `/api/checklist`      | GET    | Returns `{ checklist, origin, mode, initialResults?, initialGlobalNotes? }` |
+| `/api/feedback`       | POST   | Submit results (body: results, globalNotes, automations, agentSwitch) |
+| `/api/image`          | GET    | Serve image by path query param            |
+| `/api/upload`         | POST   | Upload image, returns `{ path, originalName }` |
+| `/api/draft`          | GET/POST/DELETE | Auto-save checklist drafts to survive server crashes |
+
 All servers use random ports locally or fixed port (`19432`) in remote mode.
 
 ### Paste Service (`apps/paste-service/`)
@@ -358,6 +376,7 @@ bun install
 # Run any app
 bun run dev:hook       # Hook server (plan review)
 bun run dev:review     # Review editor (code review)
+bun run dev:checklist  # Checklist editor (QA checklist)
 bun run dev:portal     # Portal editor
 bun run dev:marketing  # Marketing site
 bun run dev:vscode     # VS Code extension (watch mode)
@@ -368,7 +387,8 @@ bun run dev:vscode     # VS Code extension (watch mode)
 ```bash
 bun run build:hook       # Single-file HTML for hook server
 bun run build:review     # Code review editor
-bun run build:opencode   # OpenCode plugin (copies HTML from hook + review)
+bun run build:checklist  # QA checklist editor
+bun run build:opencode   # OpenCode plugin (copies HTML from hook + review + checklist)
 bun run build:portal     # Static build for share.plannotator.ai
 bun run build:marketing  # Static build for plannotator.ai
 bun run build:vscode     # VS Code extension bundle

diff --git a/apps/checklist/index.html b/apps/checklist/index.html
@@ -0,0 +1,18 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <title>QA Checklist</title>
+
+    <!-- Fonts -->
+    <link rel="preconnect" href="https://fonts.googleapis.com">
+    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
+    <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&family=JetBrains+Mono:wght@400;500;600&display=swap" rel="stylesheet">
+
+  </head>
+  <body class="min-h-screen antialiased">
+    <div id="root" class="h-full"></div>
+    <script type="module" src="/index.tsx"></script>
+  </body>
+</html>
diff --git a/apps/checklist/index.tsx b/apps/checklist/index.tsx
@@ -0,0 +1,16 @@
+import React from 'react';
+import ReactDOM from 'react-dom/client';
+import App from '@plannotator/checklist-editor';
+import '@plannotator/checklist-editor/styles';
+
+const rootElement = document.getElementById('root');
+if (!rootElement) {
+  throw new Error("Could not find root element to mount to");
+}
+
+const root = ReactDOM.createRoot(rootElement);
+root.render(
+  <React.StrictMode>
+    <App />
+  </React.StrictMode>
+);
diff --git a/apps/checklist/package.json b/apps/checklist/package.json
@@ -0,0 +1,27 @@
+{
+  "name": "@plannotator/checklist",
+  "private": true,
+  "version": "0.0.1",
+  "type": "module",
+  "scripts": {
+    "dev": "vite",
+    "build": "vite build"
+  },
+  "dependencies": {
+    "@plannotator/checklist-editor": "workspace:*",
+    "@plannotator/server": "workspace:*",
+    "@plannotator/shared": "workspace:*",
+    "@plannotator/ui": "workspace:*",
+    "react": "^19.2.3",
+    "react-dom": "^19.2.3",
+    "tailwindcss": "^4.1.18",
+    "@tailwindcss/vite": "^4.1.18"
+  },
+  "devDependencies": {
+    "@vitejs/plugin-react": "^5.0.0",
+    "typescript": "~5.8.2",
+    "vite": "^6.2.0",
+    "vite-plugin-singlefile": "^2.0.3",
+    "@types/node": "^22.14.0"
+  }
+}
diff --git a/apps/checklist/vite.config.ts b/apps/checklist/vite.config.ts
@@ -0,0 +1,37 @@
+import path from 'path';
+import { defineConfig } from 'vite';
+import react from '@vitejs/plugin-react';
+import { viteSingleFile } from 'vite-plugin-singlefile';
+import tailwindcss from '@tailwindcss/vite';
+import pkg from '../../package.json';
+
+export default defineConfig({
+  server: {
+    port: 3002,
+    host: '0.0.0.0',
+  },
+  define: {
+    __APP_VERSION__: JSON.stringify(pkg.version),
+  },
+  plugins: [react(), tailwindcss(), viteSingleFile()],
+  resolve: {
+    alias: {
+      '@': path.resolve(__dirname, '.'),
+      '@plannotator/ui': path.resolve(__dirname, '../../packages/ui'),
+      '@plannotator/shared': path.resolve(__dirname, '../../packages/shared'),
+      '@plannotator/checklist-editor/styles': path.resolve(__dirname, '../../packages/checklist-editor/index.css'),
+      '@plannotator/checklist-editor': path.resolve(__dirname, '../../packages/checklist-editor/App.tsx'),
+    }
+  },
+  build: {
+    target: 'esnext',
+    assetsInlineLimit: 100000000,
+    chunkSizeWarningLimit: 100000000,
+    cssCodeSplit: false,
+    rollupOptions: {
+      output: {
+        inlineDynamicImports: true,
+      },
+    },
+  },
+});