From 1e06b6a5c6601afcf8b822d3820d3cf5a3c2319f Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Mon, 16 Mar 2026 10:59:13 -0500 Subject: [PATCH 01/22] fix: dynamic base branch detection across all SKILL templates (v0.3.10) (#81) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat: add {{BASE_BRANCH_DETECT}} resolver to gen-skill-docs DRY placeholder for dynamic base branch detection across PR-targeting skills. Detects via gh pr view (existing PR base) → gh repo view (repo default) → fallback to main. Co-Authored-By: Claude Opus 4.6 (1M context) * fix: ship skill detects base branch instead of hardcoding main Replaces ~14 hardcoded 'main' references with dynamic detection via {{BASE_BRANCH_DETECT}}. Fixes stacked branches and Conductor workspaces targeting non-main branches. Adds --base to gh pr create. Co-Authored-By: Claude Opus 4.6 (1M context) * fix: review, qa, plan-ceo-review detect base branch dynamically Same pattern as ship: replaces hardcoded 'main' with {{BASE_BRANCH_DETECT}}. Also cleans up qa bash-isms (REPORT_DIR variable, port chaining). Co-Authored-By: Claude Opus 4.6 (1M context) * fix: retro detects default branch instead of hardcoding origin/main Retro queries commit history (not PR targets), so uses simpler detection: gh repo view defaultBranchRef. Replaces ~11 origin/main refs with origin/. Co-Authored-By: Claude Opus 4.6 (1M context) * docs: add explicit cross-step references in gstack-upgrade template Bash blocks are self-contained, but cross-block variable references (INSTALL_DIR from Step 2) were implicit. Adds prose making them explicit. Co-Authored-By: Claude Opus 4.6 (1M context) * docs+test: SKILL authoring guidance + regression tests Adds "Writing SKILL templates" section to CLAUDE.md explaining that templates are prompts, not scripts. Adds validation test catching hardcoded 'main' in git commands, and resolver content test. Co-Authored-By: Claude Opus 4.6 (1M context) * docs: update ARCHITECTURE + CONTRIBUTING for new placeholders Add {{BASE_BRANCH_DETECT}} to ARCHITECTURE.md placeholder list. Cross-reference CLAUDE.md template authoring guidance from CONTRIBUTING.md. Co-Authored-By: Claude Opus 4.6 (1M context) * chore: bump version and changelog (v0.3.10) Co-Authored-By: Claude Opus 4.6 * fix: add missing blank line between resolver functions Co-Authored-By: Claude Opus 4.6 (1M context) * test: add 3 E2E smoke tests for base branch detection - /review: verifies Step 0 detection + git diff against detected base - /ship: truncated dry-run (Steps 0-1 only, no push/PR), asserts no destructive actions - /retro: verifies default branch detection for git log queries Covers the {{BASE_BRANCH_DETECT}} resolver path (review), the ship template's dual abort check, and retro's inline detection pattern. Co-Authored-By: Claude Opus 4.6 (1M context) * chore: bump version and changelog (v0.4.2) Co-Authored-By: Claude Opus 4.6 (1M context) --------- Co-authored-by: Claude Opus 4.6 (1M context) --- ARCHITECTURE.md | 2 + CHANGELOG.md | 14 +++ CLAUDE.md | 17 ++++ CONTRIBUTING.md | 2 + VERSION | 2 +- gstack-upgrade/SKILL.md | 8 +- gstack-upgrade/SKILL.md.tmpl | 8 +- plan-ceo-review/SKILL.md | 21 +++- plan-ceo-review/SKILL.md.tmpl | 4 +- qa/SKILL.md | 22 +++- qa/SKILL.md.tmpl | 5 +- retro/SKILL.md | 34 ++++--- retro/SKILL.md.tmpl | 34 ++++--- review/SKILL.md | 33 ++++-- review/SKILL.md.tmpl | 16 +-- scripts/gen-skill-docs.ts | 22 ++++ ship/SKILL.md | 49 ++++++--- ship/SKILL.md.tmpl | 32 +++--- test/gen-skill-docs.test.ts | 21 ++++ test/skill-e2e.test.ts | 187 ++++++++++++++++++++++++++++++++++ test/skill-validation.test.ts | 58 +++++++++++ 21 files changed, 514 insertions(+), 77 deletions(-) diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md index 5311c2cd..45768d07 100644 --- a/ARCHITECTURE.md +++ b/ARCHITECTURE.md @@ -200,6 +200,8 @@ Templates contain the workflows, tips, and examples that require human judgment. | `{{SNAPSHOT_FLAGS}}` | `snapshot.ts` | Flag reference with examples | | `{{PREAMBLE}}` | `gen-skill-docs.ts` | Startup block: update check, session tracking, contributor mode, AskUserQuestion format | | `{{BROWSE_SETUP}}` | `gen-skill-docs.ts` | Binary discovery + setup instructions | +| `{{BASE_BRANCH_DETECT}}` | `gen-skill-docs.ts` | Dynamic base branch detection for PR-targeting skills (ship, review, qa, plan-ceo-review) | +| `{{QA_METHODOLOGY}}` | `gen-skill-docs.ts` | Shared QA methodology block for /qa and /qa-only | This is structurally sound — if a command exists in code, it appears in docs. If it doesn't exist, it can't appear. diff --git a/CHANGELOG.md b/CHANGELOG.md index 57c2c1a0..3d67a917 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,19 @@ # Changelog +## 0.4.2 — 2026-03-16 + +- **Skills now respect your branch target.** `/ship`, `/review`, `/qa`, and `/plan-ceo-review` detect which branch your PR actually targets instead of assuming `main`. Stacked branches, Conductor workspaces targeting feature branches, and repos using `master` all just work now. +- **`/retro` works on any default branch.** Repos using `master`, `develop`, or other default branch names are detected automatically — no more empty retros because the branch name was wrong. +- **New `{{BASE_BRANCH_DETECT}}` placeholder** for skill authors — drop it into any template and get 3-step branch detection (PR base → repo default → fallback) for free. +- **3 new E2E smoke tests** validate base branch detection works end-to-end across ship, review, and retro skills. + +### For contributors + +- Added "Writing SKILL templates" section to CLAUDE.md — rules for natural language over bash-isms, dynamic branch detection, self-contained code blocks. +- Hardcoded-main regression test scans all `.tmpl` files for git commands with hardcoded `main`. +- QA template cleaned up: removed `REPORT_DIR` shell variable, simplified port detection to prose. +- gstack-upgrade template: explicit cross-step prose for variable references between bash blocks. + ## 0.4.1 — 2026-03-16 - **gstack now notices when it screws up.** Turn on contributor mode (`gstack-config set gstack_contributor true`) and gstack automatically writes up what went wrong — what you were doing, what broke, repro steps. Next time something annoys you, the bug report is already written. Fork gstack and fix it yourself. diff --git a/CLAUDE.md b/CLAUDE.md index e724b826..bc21f606 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -65,6 +65,23 @@ SKILL.md files are **generated** from `.tmpl` templates. To update docs: To add a new browse command: add it to `browse/src/commands.ts` and rebuild. To add a snapshot flag: add it to `SNAPSHOT_FLAGS` in `browse/src/snapshot.ts` and rebuild. +## Writing SKILL templates + +SKILL.md.tmpl files are **prompt templates read by Claude**, not bash scripts. +Each bash code block runs in a separate shell — variables do not persist between blocks. + +Rules: +- **Use natural language for logic and state.** Don't use shell variables to pass + state between code blocks. Instead, tell Claude what to remember and reference + it in prose (e.g., "the base branch detected in Step 0"). +- **Don't hardcode branch names.** Detect `main`/`master`/etc dynamically via + `gh pr view` or `gh repo view`. Use `{{BASE_BRANCH_DETECT}}` for PR-targeting + skills. Use "the base branch" in prose, `` in code block placeholders. +- **Keep bash blocks self-contained.** Each code block should work independently. + If a block needs context from a previous step, restate it in the prose above. +- **Express conditionals as English.** Instead of nested `if/elif/else` in bash, + write numbered decision steps: "1. If X, do Y. 2. Otherwise, do Z." + ## Browser interaction When you need to interact with a browser (QA, dogfooding, cookie setup), use the diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index abe1cf16..b06b837e 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -217,6 +217,8 @@ bun run skill:check bun run dev:skill ``` +For template authoring best practices (natural language over bash-isms, dynamic branch detection, `{{BASE_BRANCH_DETECT}}` usage), see CLAUDE.md's "Writing SKILL templates" section. + To add a browse command, add it to `browse/src/commands.ts`. To add a snapshot flag, add it to `SNAPSHOT_FLAGS` in `browse/src/snapshot.ts`. Then rebuild. ## Conductor workspaces diff --git a/VERSION b/VERSION index 267577d4..2b7c5ae0 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -0.4.1 +0.4.2 diff --git a/gstack-upgrade/SKILL.md b/gstack-upgrade/SKILL.md index 1cf7d548..42f13f69 100644 --- a/gstack-upgrade/SKILL.md +++ b/gstack-upgrade/SKILL.md @@ -94,14 +94,20 @@ fi echo "Install type: $INSTALL_TYPE at $INSTALL_DIR" ``` +The install type and directory path printed above will be used in all subsequent steps. + ### Step 3: Save old version +Use the install directory from Step 2's output below: + ```bash OLD_VERSION=$(cat "$INSTALL_DIR/VERSION" 2>/dev/null || echo "unknown") ``` ### Step 4: Upgrade +Use the install type and directory detected in Step 2: + **For git installs** (global-git, local-git): ```bash cd "$INSTALL_DIR" @@ -125,7 +131,7 @@ rm -rf "$INSTALL_DIR.bak" "$TMP_DIR" ### Step 4.5: Sync local vendored copy -After upgrading the primary install, check if there's also a local copy in the current project that needs updating: +Use the install directory from Step 2. Check if there's also a local vendored copy that needs updating: ```bash _ROOT=$(git rev-parse --show-toplevel 2>/dev/null) diff --git a/gstack-upgrade/SKILL.md.tmpl b/gstack-upgrade/SKILL.md.tmpl index 4a124be1..a199db6c 100644 --- a/gstack-upgrade/SKILL.md.tmpl +++ b/gstack-upgrade/SKILL.md.tmpl @@ -92,14 +92,20 @@ fi echo "Install type: $INSTALL_TYPE at $INSTALL_DIR" ``` +The install type and directory path printed above will be used in all subsequent steps. + ### Step 3: Save old version +Use the install directory from Step 2's output below: + ```bash OLD_VERSION=$(cat "$INSTALL_DIR/VERSION" 2>/dev/null || echo "unknown") ``` ### Step 4: Upgrade +Use the install type and directory detected in Step 2: + **For git installs** (global-git, local-git): ```bash cd "$INSTALL_DIR" @@ -123,7 +129,7 @@ rm -rf "$INSTALL_DIR.bak" "$TMP_DIR" ### Step 4.5: Sync local vendored copy -After upgrading the primary install, check if there's also a local copy in the current project that needs updating: +Use the install directory from Step 2. Check if there's also a local vendored copy that needs updating: ```bash _ROOT=$(git rev-parse --show-toplevel 2>/dev/null) diff --git a/plan-ceo-review/SKILL.md b/plan-ceo-review/SKILL.md index c82753a2..77ca6438 100644 --- a/plan-ceo-review/SKILL.md +++ b/plan-ceo-review/SKILL.md @@ -73,6 +73,25 @@ Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-log Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +## Step 0: Detect base branch + +Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps. + +1. Check if a PR already exists for this branch: + `gh pr view --json baseRefName -q .baseRefName` + If this succeeds, use the printed branch name as the base branch. + +2. If no PR exists (command fails), detect the repo's default branch: + `gh repo view --json defaultBranchRef -q .defaultBranchRef.name` + +3. If both commands fail, fall back to `main`. + +Print the detected base branch name. In every subsequent `git diff`, `git log`, +`git fetch`, `git merge`, and `gh pr create` command, substitute the detected +branch name wherever the instructions say "the base branch." + +--- + # Mega Plan Review Mode ## Philosophy @@ -117,7 +136,7 @@ Before doing anything else, run a system audit. This is not the plan review — Run the following commands: ``` git log --oneline -30 # Recent history -git diff main --stat # What's already changed +git diff --stat # What's already changed git stash list # Any stashed work grep -r "TODO\|FIXME\|HACK\|XXX" --include="*.rb" --include="*.js" -l find . -name "*.rb" -newer Gemfile.lock | head -20 # Recently touched files diff --git a/plan-ceo-review/SKILL.md.tmpl b/plan-ceo-review/SKILL.md.tmpl index ef14a281..9902fafb 100644 --- a/plan-ceo-review/SKILL.md.tmpl +++ b/plan-ceo-review/SKILL.md.tmpl @@ -16,6 +16,8 @@ allowed-tools: {{PREAMBLE}} +{{BASE_BRANCH_DETECT}} + # Mega Plan Review Mode ## Philosophy @@ -60,7 +62,7 @@ Before doing anything else, run a system audit. This is not the plan review — Run the following commands: ``` git log --oneline -30 # Recent history -git diff main --stat # What's already changed +git diff --stat # What's already changed git stash list # Any stashed work grep -r "TODO\|FIXME\|HACK\|XXX" --include="*.rb" --include="*.js" -l find . -name "*.rb" -newer Gemfile.lock | head -20 # Recently touched files diff --git a/qa/SKILL.md b/qa/SKILL.md index c11f8a66..5ea4643a 100644 --- a/qa/SKILL.md +++ b/qa/SKILL.md @@ -77,6 +77,25 @@ Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-log Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +## Step 0: Detect base branch + +Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps. + +1. Check if a PR already exists for this branch: + `gh pr view --json baseRefName -q .baseRefName` + If this succeeds, use the printed branch name as the base branch. + +2. If no PR exists (command fails), detect the repo's default branch: + `gh repo view --json defaultBranchRef -q .defaultBranchRef.name` + +3. If both commands fail, fall back to `main`. + +Print the detected base branch name. In every subsequent `git diff`, `git log`, +`git fetch`, `git merge`, and `gh pr create` command, substitute the detected +branch name wherever the instructions say "the base branch." + +--- + # /qa: Test → Fix → Verify You are a QA engineer AND a bug-fix engineer. Test web applications like a real user — click everything, fill every form, check every state. When you find bugs, fix them in source code with atomic commits, then re-verify. Produce a structured report with before/after evidence. @@ -133,8 +152,7 @@ If `NEEDS_SETUP`: **Create output directories:** ```bash -REPORT_DIR=".gstack/qa-reports" -mkdir -p "$REPORT_DIR/screenshots" +mkdir -p .gstack/qa-reports/screenshots ``` --- diff --git a/qa/SKILL.md.tmpl b/qa/SKILL.md.tmpl index a3e5a9f0..f491999e 100644 --- a/qa/SKILL.md.tmpl +++ b/qa/SKILL.md.tmpl @@ -20,6 +20,8 @@ allowed-tools: {{PREAMBLE}} +{{BASE_BRANCH_DETECT}} + # /qa: Test → Fix → Verify You are a QA engineer AND a bug-fix engineer. Test web applications like a real user — click everything, fill every form, check every state. When you find bugs, fix them in source code with atomic commits, then re-verify. Produce a structured report with before/after evidence. @@ -59,8 +61,7 @@ fi **Create output directories:** ```bash -REPORT_DIR=".gstack/qa-reports" -mkdir -p "$REPORT_DIR/screenshots" +mkdir -p .gstack/qa-reports/screenshots ``` --- diff --git a/retro/SKILL.md b/retro/SKILL.md index 28280c94..ca44f49e 100644 --- a/retro/SKILL.md +++ b/retro/SKILL.md @@ -72,6 +72,16 @@ Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-log Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +## Detect default branch + +Before gathering data, detect the repo's default branch name: +`gh repo view --json defaultBranchRef -q .defaultBranchRef.name` + +If this fails, fall back to `main`. Use the detected name wherever the instructions +say `origin/` below. + +--- + # /retro — Weekly Engineering Retrospective Generates a comprehensive engineering retrospective analyzing commit history, work patterns, and code quality metrics. Team-aware: identifies the user running the command, then analyzes every contributor with per-person praise and growth opportunities. Designed for a senior IC/CTO-level builder using Claude Code as a force multiplier. @@ -106,7 +116,7 @@ Usage: /retro [window] First, fetch origin and identify the current user: ```bash -git fetch origin main --quiet +git fetch origin --quiet # Identify who is running the retro git config user.name git config user.email @@ -118,28 +128,28 @@ Run ALL of these git commands in parallel (they are independent): ```bash # 1. All commits in window with timestamps, subject, hash, AUTHOR, files changed, insertions, deletions -git log origin/main --since="" --format="%H|%aN|%ae|%ai|%s" --shortstat +git log origin/ --since="" --format="%H|%aN|%ae|%ai|%s" --shortstat # 2. Per-commit test vs total LOC breakdown with author # Each commit block starts with COMMIT:|, followed by numstat lines. # Separate test files (matching test/|spec/|__tests__/) from production files. -git log origin/main --since="" --format="COMMIT:%H|%aN" --numstat +git log origin/ --since="" --format="COMMIT:%H|%aN" --numstat # 3. Commit timestamps for session detection and hourly distribution (with author) # Use TZ=America/Los_Angeles for Pacific time conversion -TZ=America/Los_Angeles git log origin/main --since="" --format="%at|%aN|%ai|%s" | sort -n +TZ=America/Los_Angeles git log origin/ --since="" --format="%at|%aN|%ai|%s" | sort -n # 4. Files most frequently changed (hotspot analysis) -git log origin/main --since="" --format="" --name-only | grep -v '^$' | sort | uniq -c | sort -rn +git log origin/ --since="" --format="" --name-only | grep -v '^$' | sort | uniq -c | sort -rn # 5. PR numbers from commit messages (extract #NNN patterns) -git log origin/main --since="" --format="%s" | grep -oE '#[0-9]+' | sed 's/^#//' | sort -n | uniq | sed 's/^/#/' +git log origin/ --since="" --format="%s" | grep -oE '#[0-9]+' | sed 's/^#//' | sort -n | uniq | sed 's/^/#/' # 6. Per-author file hotspots (who touches what) -git log origin/main --since="" --format="AUTHOR:%aN" --name-only +git log origin/ --since="" --format="AUTHOR:%aN" --name-only # 7. Per-author commit counts (quick summary) -git shortlog origin/main --since="" -sn --no-merges +git shortlog origin/ --since="" -sn --no-merges # 8. Greptile triage history (if available) cat ~/.gstack/greptile-history.md 2>/dev/null || true @@ -298,14 +308,14 @@ If the time window is 14 days or more, split into weekly buckets and show trends ### Step 11: Streak Tracking -Count consecutive days with at least 1 commit to origin/main, going back from today. Track both team streak and personal streak: +Count consecutive days with at least 1 commit to origin/, going back from today. Track both team streak and personal streak: ```bash # Team streak: all unique commit dates (Pacific time) — no hard cutoff -TZ=America/Los_Angeles git log origin/main --format="%ad" --date=format:"%Y-%m-%d" | sort -u +TZ=America/Los_Angeles git log origin/ --format="%ad" --date=format:"%Y-%m-%d" | sort -u # Personal streak: only the current user's commits -TZ=America/Los_Angeles git log origin/main --author="" --format="%ad" --date=format:"%Y-%m-%d" | sort -u +TZ=America/Los_Angeles git log origin/ --author="" --format="%ad" --date=format:"%Y-%m-%d" | sort -u ``` Count backward from today — how many consecutive days have at least one commit? This queries the full history so streaks of any length are reported accurately. Display both: @@ -523,7 +533,7 @@ When the user runs `/retro compare` (or `/retro compare 14d`): ## Important Rules - ALL narrative output goes directly to the user in the conversation. The ONLY file written is the `.context/retros/` JSON snapshot. -- Use `origin/main` for all git queries (not local main which may be stale) +- Use `origin/` for all git queries (not local main which may be stale) - Convert all timestamps to Pacific time for display (use `TZ=America/Los_Angeles`) - If the window has zero commits, say so and suggest a different window - Round LOC/hour to nearest 50 diff --git a/retro/SKILL.md.tmpl b/retro/SKILL.md.tmpl index 07e08885..2f39fb5c 100644 --- a/retro/SKILL.md.tmpl +++ b/retro/SKILL.md.tmpl @@ -15,6 +15,16 @@ allowed-tools: {{PREAMBLE}} +## Detect default branch + +Before gathering data, detect the repo's default branch name: +`gh repo view --json defaultBranchRef -q .defaultBranchRef.name` + +If this fails, fall back to `main`. Use the detected name wherever the instructions +say `origin/` below. + +--- + # /retro — Weekly Engineering Retrospective Generates a comprehensive engineering retrospective analyzing commit history, work patterns, and code quality metrics. Team-aware: identifies the user running the command, then analyzes every contributor with per-person praise and growth opportunities. Designed for a senior IC/CTO-level builder using Claude Code as a force multiplier. @@ -49,7 +59,7 @@ Usage: /retro [window] First, fetch origin and identify the current user: ```bash -git fetch origin main --quiet +git fetch origin --quiet # Identify who is running the retro git config user.name git config user.email @@ -61,28 +71,28 @@ Run ALL of these git commands in parallel (they are independent): ```bash # 1. All commits in window with timestamps, subject, hash, AUTHOR, files changed, insertions, deletions -git log origin/main --since="" --format="%H|%aN|%ae|%ai|%s" --shortstat +git log origin/ --since="" --format="%H|%aN|%ae|%ai|%s" --shortstat # 2. Per-commit test vs total LOC breakdown with author # Each commit block starts with COMMIT:|, followed by numstat lines. # Separate test files (matching test/|spec/|__tests__/) from production files. -git log origin/main --since="" --format="COMMIT:%H|%aN" --numstat +git log origin/ --since="" --format="COMMIT:%H|%aN" --numstat # 3. Commit timestamps for session detection and hourly distribution (with author) # Use TZ=America/Los_Angeles for Pacific time conversion -TZ=America/Los_Angeles git log origin/main --since="" --format="%at|%aN|%ai|%s" | sort -n +TZ=America/Los_Angeles git log origin/ --since="" --format="%at|%aN|%ai|%s" | sort -n # 4. Files most frequently changed (hotspot analysis) -git log origin/main --since="" --format="" --name-only | grep -v '^$' | sort | uniq -c | sort -rn +git log origin/ --since="" --format="" --name-only | grep -v '^$' | sort | uniq -c | sort -rn # 5. PR numbers from commit messages (extract #NNN patterns) -git log origin/main --since="" --format="%s" | grep -oE '#[0-9]+' | sed 's/^#//' | sort -n | uniq | sed 's/^/#/' +git log origin/ --since="" --format="%s" | grep -oE '#[0-9]+' | sed 's/^#//' | sort -n | uniq | sed 's/^/#/' # 6. Per-author file hotspots (who touches what) -git log origin/main --since="" --format="AUTHOR:%aN" --name-only +git log origin/ --since="" --format="AUTHOR:%aN" --name-only # 7. Per-author commit counts (quick summary) -git shortlog origin/main --since="" -sn --no-merges +git shortlog origin/ --since="" -sn --no-merges # 8. Greptile triage history (if available) cat ~/.gstack/greptile-history.md 2>/dev/null || true @@ -241,14 +251,14 @@ If the time window is 14 days or more, split into weekly buckets and show trends ### Step 11: Streak Tracking -Count consecutive days with at least 1 commit to origin/main, going back from today. Track both team streak and personal streak: +Count consecutive days with at least 1 commit to origin/, going back from today. Track both team streak and personal streak: ```bash # Team streak: all unique commit dates (Pacific time) — no hard cutoff -TZ=America/Los_Angeles git log origin/main --format="%ad" --date=format:"%Y-%m-%d" | sort -u +TZ=America/Los_Angeles git log origin/ --format="%ad" --date=format:"%Y-%m-%d" | sort -u # Personal streak: only the current user's commits -TZ=America/Los_Angeles git log origin/main --author="" --format="%ad" --date=format:"%Y-%m-%d" | sort -u +TZ=America/Los_Angeles git log origin/ --author="" --format="%ad" --date=format:"%Y-%m-%d" | sort -u ``` Count backward from today — how many consecutive days have at least one commit? This queries the full history so streaks of any length are reported accurately. Display both: @@ -466,7 +476,7 @@ When the user runs `/retro compare` (or `/retro compare 14d`): ## Important Rules - ALL narrative output goes directly to the user in the conversation. The ONLY file written is the `.context/retros/` JSON snapshot. -- Use `origin/main` for all git queries (not local main which may be stale) +- Use `origin/` for all git queries (not local main which may be stale) - Convert all timestamps to Pacific time for display (use `TZ=America/Los_Angeles`) - If the window has zero commits, say so and suggest a different window - Round LOC/hour to nearest 50 diff --git a/review/SKILL.md b/review/SKILL.md index 32c597a3..949a0c65 100644 --- a/review/SKILL.md +++ b/review/SKILL.md @@ -2,7 +2,7 @@ name: review version: 1.0.0 description: | - Pre-landing PR review. Analyzes diff against main for SQL safety, LLM trust + Pre-landing PR review. Analyzes diff against the base branch for SQL safety, LLM trust boundary violations, conditional side effects, and other structural issues. allowed-tools: - Bash @@ -73,17 +73,36 @@ Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-log Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +## Step 0: Detect base branch + +Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps. + +1. Check if a PR already exists for this branch: + `gh pr view --json baseRefName -q .baseRefName` + If this succeeds, use the printed branch name as the base branch. + +2. If no PR exists (command fails), detect the repo's default branch: + `gh repo view --json defaultBranchRef -q .defaultBranchRef.name` + +3. If both commands fail, fall back to `main`. + +Print the detected base branch name. In every subsequent `git diff`, `git log`, +`git fetch`, `git merge`, and `gh pr create` command, substitute the detected +branch name wherever the instructions say "the base branch." + +--- + # Pre-Landing PR Review -You are running the `/review` workflow. Analyze the current branch's diff against main for structural issues that tests don't catch. +You are running the `/review` workflow. Analyze the current branch's diff against the base branch for structural issues that tests don't catch. --- ## Step 1: Check branch 1. Run `git branch --show-current` to get the current branch. -2. If on `main`, output: **"Nothing to review — you're on main or have no changes against main."** and stop. -3. Run `git fetch origin main --quiet && git diff origin/main --stat` to check if there's a diff. If no diff, output the same message and stop. +2. If on the base branch, output: **"Nothing to review — you're on the base branch or have no changes against it."** and stop. +3. Run `git fetch origin --quiet && git diff origin/ --stat` to check if there's a diff. If no diff, output the same message and stop. --- @@ -107,13 +126,13 @@ Read `.claude/skills/review/greptile-triage.md` and follow the fetch, filter, cl ## Step 3: Get the diff -Fetch the latest main to avoid false positives from a stale local main: +Fetch the latest base branch to avoid false positives from stale local state: ```bash -git fetch origin main --quiet +git fetch origin --quiet ``` -Run `git diff origin/main` to get the full diff. This includes both committed and uncommitted changes against the latest main. +Run `git diff origin/` to get the full diff. This includes both committed and uncommitted changes against the latest base branch. --- diff --git a/review/SKILL.md.tmpl b/review/SKILL.md.tmpl index 124a5393..dadd211a 100644 --- a/review/SKILL.md.tmpl +++ b/review/SKILL.md.tmpl @@ -2,7 +2,7 @@ name: review version: 1.0.0 description: | - Pre-landing PR review. Analyzes diff against main for SQL safety, LLM trust + Pre-landing PR review. Analyzes diff against the base branch for SQL safety, LLM trust boundary violations, conditional side effects, and other structural issues. allowed-tools: - Bash @@ -16,17 +16,19 @@ allowed-tools: {{PREAMBLE}} +{{BASE_BRANCH_DETECT}} + # Pre-Landing PR Review -You are running the `/review` workflow. Analyze the current branch's diff against main for structural issues that tests don't catch. +You are running the `/review` workflow. Analyze the current branch's diff against the base branch for structural issues that tests don't catch. --- ## Step 1: Check branch 1. Run `git branch --show-current` to get the current branch. -2. If on `main`, output: **"Nothing to review — you're on main or have no changes against main."** and stop. -3. Run `git fetch origin main --quiet && git diff origin/main --stat` to check if there's a diff. If no diff, output the same message and stop. +2. If on the base branch, output: **"Nothing to review — you're on the base branch or have no changes against it."** and stop. +3. Run `git fetch origin --quiet && git diff origin/ --stat` to check if there's a diff. If no diff, output the same message and stop. --- @@ -50,13 +52,13 @@ Read `.claude/skills/review/greptile-triage.md` and follow the fetch, filter, cl ## Step 3: Get the diff -Fetch the latest main to avoid false positives from a stale local main: +Fetch the latest base branch to avoid false positives from stale local state: ```bash -git fetch origin main --quiet +git fetch origin --quiet ``` -Run `git diff origin/main` to get the full diff. This includes both committed and uncommitted changes against the latest main. +Run `git diff origin/` to get the full diff. This includes both committed and uncommitted changes against the latest base branch. --- diff --git a/scripts/gen-skill-docs.ts b/scripts/gen-skill-docs.ts index bafed642..9c81e968 100644 --- a/scripts/gen-skill-docs.ts +++ b/scripts/gen-skill-docs.ts @@ -174,6 +174,27 @@ If \`NEEDS_SETUP\`: 3. If \`bun\` is not installed: \`curl -fsSL https://bun.sh/install | bash\``; } +function generateBaseBranchDetect(): string { + return `## Step 0: Detect base branch + +Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps. + +1. Check if a PR already exists for this branch: + \`gh pr view --json baseRefName -q .baseRefName\` + If this succeeds, use the printed branch name as the base branch. + +2. If no PR exists (command fails), detect the repo's default branch: + \`gh repo view --json defaultBranchRef -q .defaultBranchRef.name\` + +3. If both commands fail, fall back to \`main\`. + +Print the detected base branch name. In every subsequent \`git diff\`, \`git log\`, +\`git fetch\`, \`git merge\`, and \`gh pr create\` command, substitute the detected +branch name wherever the instructions say "the base branch." + +---`; +} + function generateQAMethodology(): string { return `## Modes @@ -455,6 +476,7 @@ const RESOLVERS: Record string> = { SNAPSHOT_FLAGS: generateSnapshotFlags, PREAMBLE: generatePreamble, BROWSE_SETUP: generateBrowseSetup, + BASE_BRANCH_DETECT: generateBaseBranchDetect, QA_METHODOLOGY: generateQAMethodology, }; diff --git a/ship/SKILL.md b/ship/SKILL.md index e023816d..47a9da11 100644 --- a/ship/SKILL.md +++ b/ship/SKILL.md @@ -2,7 +2,7 @@ name: ship version: 1.0.0 description: | - Ship workflow: merge main, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. + Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. allowed-tools: - Bash - Read @@ -72,12 +72,31 @@ Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-log Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +## Step 0: Detect base branch + +Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps. + +1. Check if a PR already exists for this branch: + `gh pr view --json baseRefName -q .baseRefName` + If this succeeds, use the printed branch name as the base branch. + +2. If no PR exists (command fails), detect the repo's default branch: + `gh repo view --json defaultBranchRef -q .defaultBranchRef.name` + +3. If both commands fail, fall back to `main`. + +Print the detected base branch name. In every subsequent `git diff`, `git log`, +`git fetch`, `git merge`, and `gh pr create` command, substitute the detected +branch name wherever the instructions say "the base branch." + +--- + # Ship: Fully Automated Ship Workflow You are running the `/ship` workflow. This is a **non-interactive, fully automated** workflow. Do NOT ask for confirmation at any step. The user said `/ship` which means DO IT. Run straight through and output the PR URL at the end. **Only stop for:** -- On `main` branch (abort) +- On the base branch (abort) - Merge conflicts that can't be auto-resolved (stop, show conflicts) - Test failures (stop, show failures) - Pre-landing review finds CRITICAL issues and user chooses to fix (not acknowledge or skip) @@ -98,20 +117,20 @@ You are running the `/ship` workflow. This is a **non-interactive, fully automat ## Step 1: Pre-flight -1. Check the current branch. If on `main`, **abort**: "You're on main. Ship from a feature branch." +1. Check the current branch. If on the base branch or the repo's default branch, **abort**: "You're on the base branch. Ship from a feature branch." 2. Run `git status` (never use `-uall`). Uncommitted changes are always included — no need to ask. -3. Run `git diff main...HEAD --stat` and `git log main..HEAD --oneline` to understand what's being shipped. +3. Run `git diff ...HEAD --stat` and `git log ..HEAD --oneline` to understand what's being shipped. --- -## Step 2: Merge origin/main (BEFORE tests) +## Step 2: Merge the base branch (BEFORE tests) -Fetch and merge `origin/main` into the feature branch so tests run against the merged state: +Fetch and merge the base branch into the feature branch so tests run against the merged state: ```bash -git fetch origin main && git merge origin/main --no-edit +git fetch origin && git merge origin/ --no-edit ``` **If there are merge conflicts:** Try to auto-resolve if they are simple (VERSION, schema.rb, CHANGELOG ordering). If conflicts are complex or ambiguous, **STOP** and show them. @@ -149,7 +168,7 @@ Evals are mandatory when prompt-related files change. Skip this step entirely if **1. Check if the diff touches prompt-related files:** ```bash -git diff origin/main --name-only +git diff origin/ --name-only ``` Match against these patterns (from CLAUDE.md): @@ -210,7 +229,7 @@ Review the diff for structural issues that tests don't catch. 1. Read `.claude/skills/review/checklist.md`. If the file cannot be read, **STOP** and report the error. -2. Run `git diff origin/main` to get the full diff (scoped to feature changes against the freshly-fetched remote main). +2. Run `git diff origin/` to get the full diff (scoped to feature changes against the freshly-fetched base branch). 3. Apply the review checklist in two passes: - **Pass 1 (CRITICAL):** SQL & Data Safety, LLM Output Trust Boundary @@ -278,7 +297,7 @@ For each classified comment: 1. Read the current `VERSION` file (4-digit format: `MAJOR.MINOR.PATCH.MICRO`) 2. **Auto-decide the bump level based on the diff:** - - Count lines changed (`git diff origin/main...HEAD --stat | tail -1`) + - Count lines changed (`git diff origin/...HEAD --stat | tail -1`) - **MICRO** (4th digit): < 50 lines changed, trivial tweaks, typos, config - **PATCH** (3rd digit): 50+ lines changed, bug fixes, small-medium features - **MINOR** (2nd digit): **ASK the user** — only for major features or significant architectural changes @@ -297,8 +316,8 @@ For each classified comment: 1. Read `CHANGELOG.md` header to know the format. 2. Auto-generate the entry from **ALL commits on the branch** (not just recent ones): - - Use `git log main..HEAD --oneline` to see every commit being shipped - - Use `git diff main...HEAD` to see the full diff against main + - Use `git log ..HEAD --oneline` to see every commit being shipped + - Use `git diff ...HEAD` to see the full diff against the base branch - The CHANGELOG entry must be comprehensive of ALL changes going into the PR - If existing CHANGELOG entries on the branch already cover some commits, replace them with one unified entry for the new version - Categorize changes into applicable sections: @@ -346,8 +365,8 @@ Read TODOS.md and verify it follows the recommended structure: This step is fully automatic — no user interaction. Use the diff and commit history already gathered in earlier steps: -- `git diff main...HEAD` (full diff against main) -- `git log main..HEAD --oneline` (all commits being shipped) +- `git diff ...HEAD` (full diff against the base branch) +- `git log ..HEAD --oneline` (all commits being shipped) For each TODO item, check if the changes in this PR complete it by: - Matching commit messages against the TODO title and description @@ -422,7 +441,7 @@ git push -u origin Create a pull request using `gh`: ```bash -gh pr create --title ": " --body "$(cat <<'EOF' +gh pr create --base --title ": " --body "$(cat <<'EOF' ## Summary diff --git a/ship/SKILL.md.tmpl b/ship/SKILL.md.tmpl index 06ff5a07..ae5df404 100644 --- a/ship/SKILL.md.tmpl +++ b/ship/SKILL.md.tmpl @@ -2,7 +2,7 @@ name: ship version: 1.0.0 description: | - Ship workflow: merge main, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. + Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. allowed-tools: - Bash - Read @@ -15,12 +15,14 @@ allowed-tools: {{PREAMBLE}} +{{BASE_BRANCH_DETECT}} + # Ship: Fully Automated Ship Workflow You are running the `/ship` workflow. This is a **non-interactive, fully automated** workflow. Do NOT ask for confirmation at any step. The user said `/ship` which means DO IT. Run straight through and output the PR URL at the end. **Only stop for:** -- On `main` branch (abort) +- On the base branch (abort) - Merge conflicts that can't be auto-resolved (stop, show conflicts) - Test failures (stop, show failures) - Pre-landing review finds CRITICAL issues and user chooses to fix (not acknowledge or skip) @@ -41,20 +43,20 @@ You are running the `/ship` workflow. This is a **non-interactive, fully automat ## Step 1: Pre-flight -1. Check the current branch. If on `main`, **abort**: "You're on main. Ship from a feature branch." +1. Check the current branch. If on the base branch or the repo's default branch, **abort**: "You're on the base branch. Ship from a feature branch." 2. Run `git status` (never use `-uall`). Uncommitted changes are always included — no need to ask. -3. Run `git diff main...HEAD --stat` and `git log main..HEAD --oneline` to understand what's being shipped. +3. Run `git diff ...HEAD --stat` and `git log ..HEAD --oneline` to understand what's being shipped. --- -## Step 2: Merge origin/main (BEFORE tests) +## Step 2: Merge the base branch (BEFORE tests) -Fetch and merge `origin/main` into the feature branch so tests run against the merged state: +Fetch and merge the base branch into the feature branch so tests run against the merged state: ```bash -git fetch origin main && git merge origin/main --no-edit +git fetch origin && git merge origin/ --no-edit ``` **If there are merge conflicts:** Try to auto-resolve if they are simple (VERSION, schema.rb, CHANGELOG ordering). If conflicts are complex or ambiguous, **STOP** and show them. @@ -92,7 +94,7 @@ Evals are mandatory when prompt-related files change. Skip this step entirely if **1. Check if the diff touches prompt-related files:** ```bash -git diff origin/main --name-only +git diff origin/ --name-only ``` Match against these patterns (from CLAUDE.md): @@ -153,7 +155,7 @@ Review the diff for structural issues that tests don't catch. 1. Read `.claude/skills/review/checklist.md`. If the file cannot be read, **STOP** and report the error. -2. Run `git diff origin/main` to get the full diff (scoped to feature changes against the freshly-fetched remote main). +2. Run `git diff origin/` to get the full diff (scoped to feature changes against the freshly-fetched base branch). 3. Apply the review checklist in two passes: - **Pass 1 (CRITICAL):** SQL & Data Safety, LLM Output Trust Boundary @@ -221,7 +223,7 @@ For each classified comment: 1. Read the current `VERSION` file (4-digit format: `MAJOR.MINOR.PATCH.MICRO`) 2. **Auto-decide the bump level based on the diff:** - - Count lines changed (`git diff origin/main...HEAD --stat | tail -1`) + - Count lines changed (`git diff origin/...HEAD --stat | tail -1`) - **MICRO** (4th digit): < 50 lines changed, trivial tweaks, typos, config - **PATCH** (3rd digit): 50+ lines changed, bug fixes, small-medium features - **MINOR** (2nd digit): **ASK the user** — only for major features or significant architectural changes @@ -240,8 +242,8 @@ For each classified comment: 1. Read `CHANGELOG.md` header to know the format. 2. Auto-generate the entry from **ALL commits on the branch** (not just recent ones): - - Use `git log main..HEAD --oneline` to see every commit being shipped - - Use `git diff main...HEAD` to see the full diff against main + - Use `git log ..HEAD --oneline` to see every commit being shipped + - Use `git diff ...HEAD` to see the full diff against the base branch - The CHANGELOG entry must be comprehensive of ALL changes going into the PR - If existing CHANGELOG entries on the branch already cover some commits, replace them with one unified entry for the new version - Categorize changes into applicable sections: @@ -289,8 +291,8 @@ Read TODOS.md and verify it follows the recommended structure: This step is fully automatic — no user interaction. Use the diff and commit history already gathered in earlier steps: -- `git diff main...HEAD` (full diff against main) -- `git log main..HEAD --oneline` (all commits being shipped) +- `git diff ...HEAD` (full diff against the base branch) +- `git log ..HEAD --oneline` (all commits being shipped) For each TODO item, check if the changes in this PR complete it by: - Matching commit messages against the TODO title and description @@ -365,7 +367,7 @@ git push -u origin Create a pull request using `gh`: ```bash -gh pr create --title ": " --body "$(cat <<'EOF' +gh pr create --base --title ": " --body "$(cat <<'EOF' ## Summary diff --git a/test/gen-skill-docs.test.ts b/test/gen-skill-docs.test.ts index a2499af6..e77989f0 100644 --- a/test/gen-skill-docs.test.ts +++ b/test/gen-skill-docs.test.ts @@ -203,6 +203,27 @@ describe('gen-skill-docs', () => { }); }); +describe('BASE_BRANCH_DETECT resolver', () => { + // Find a generated SKILL.md that uses the placeholder (ship is guaranteed to) + const shipContent = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8'); + + test('resolver output contains PR base detection command', () => { + expect(shipContent).toContain('gh pr view --json baseRefName'); + }); + + test('resolver output contains repo default branch detection command', () => { + expect(shipContent).toContain('gh repo view --json defaultBranchRef'); + }); + + test('resolver output contains fallback to main', () => { + expect(shipContent).toMatch(/fall\s*back\s+to\s+`main`/i); + }); + + test('resolver output uses "the base branch" phrasing', () => { + expect(shipContent).toContain('the base branch'); + }); +}); + /** * Quality evals — catch description regressions. * diff --git a/test/skill-e2e.test.ts b/test/skill-e2e.test.ts index 8ce1f40f..4978ce53 100644 --- a/test/skill-e2e.test.ts +++ b/test/skill-e2e.test.ts @@ -1344,6 +1344,193 @@ Write your review to ${planDir}/review-output.md`, }, 420_000); }); +// --- Base branch detection smoke tests --- + +describeE2E('Base branch detection', () => { + let baseBranchDir: string; + const run = (cmd: string, args: string[], cwd: string) => + spawnSync(cmd, args, { cwd, stdio: 'pipe', timeout: 5000 }); + + beforeAll(() => { + baseBranchDir = fs.mkdtempSync(path.join(os.tmpdir(), 'skill-e2e-basebranch-')); + }); + + afterAll(() => { + try { fs.rmSync(baseBranchDir, { recursive: true, force: true }); } catch {} + }); + + test('/review detects base branch and diffs against it', async () => { + const dir = path.join(baseBranchDir, 'review-base'); + fs.mkdirSync(dir, { recursive: true }); + + // Create git repo with a feature branch off main + run('git', ['init'], dir); + run('git', ['config', 'user.email', 'test@test.com'], dir); + run('git', ['config', 'user.name', 'Test'], dir); + + fs.writeFileSync(path.join(dir, 'app.rb'), '# clean base\nclass App\nend\n'); + run('git', ['add', 'app.rb'], dir); + run('git', ['commit', '-m', 'initial commit'], dir); + + // Create feature branch with a change + run('git', ['checkout', '-b', 'feature/test-review'], dir); + fs.writeFileSync(path.join(dir, 'app.rb'), '# clean base\nclass App\n def hello; "world"; end\nend\n'); + run('git', ['add', 'app.rb'], dir); + run('git', ['commit', '-m', 'feat: add hello method'], dir); + + // Copy review skill files + fs.copyFileSync(path.join(ROOT, 'review', 'SKILL.md'), path.join(dir, 'review-SKILL.md')); + fs.copyFileSync(path.join(ROOT, 'review', 'checklist.md'), path.join(dir, 'review-checklist.md')); + fs.copyFileSync(path.join(ROOT, 'review', 'greptile-triage.md'), path.join(dir, 'review-greptile-triage.md')); + + const result = await runSkillTest({ + prompt: `You are in a git repo on a feature branch with changes. +Read review-SKILL.md for the review workflow instructions. +Also read review-checklist.md and apply it. + +IMPORTANT: Follow Step 0 to detect the base branch. Since there is no remote, gh commands will fail — fall back to main. +Then run the review against the detected base branch. +Write your findings to ${dir}/review-output.md`, + workingDirectory: dir, + maxTurns: 15, + timeout: 90_000, + testName: 'review-base-branch', + runId, + }); + + logCost('/review base-branch', result); + recordE2E('/review base branch detection', 'Base branch detection', result); + expect(result.exitReason).toBe('success'); + + // Verify the review used "base branch" language (from Step 0) + const toolOutputs = result.toolCalls.map(tc => tc.output || '').join('\n'); + const allOutput = (result.output || '') + toolOutputs; + // The agent should have run git diff against main (the fallback) + const usedGitDiff = result.toolCalls.some(tc => + tc.tool === 'Bash' && typeof tc.input === 'string' && tc.input.includes('git diff') + ); + expect(usedGitDiff).toBe(true); + }, 120_000); + + test('/ship Step 0-1 detects base branch without destructive actions', async () => { + const dir = path.join(baseBranchDir, 'ship-base'); + fs.mkdirSync(dir, { recursive: true }); + + // Create git repo with feature branch + run('git', ['init'], dir); + run('git', ['config', 'user.email', 'test@test.com'], dir); + run('git', ['config', 'user.name', 'Test'], dir); + + fs.writeFileSync(path.join(dir, 'app.ts'), 'console.log("v1");\n'); + run('git', ['add', 'app.ts'], dir); + run('git', ['commit', '-m', 'initial'], dir); + + run('git', ['checkout', '-b', 'feature/ship-test'], dir); + fs.writeFileSync(path.join(dir, 'app.ts'), 'console.log("v2");\n'); + run('git', ['add', 'app.ts'], dir); + run('git', ['commit', '-m', 'feat: update to v2'], dir); + + // Copy ship skill + fs.copyFileSync(path.join(ROOT, 'ship', 'SKILL.md'), path.join(dir, 'ship-SKILL.md')); + + const result = await runSkillTest({ + prompt: `Read ship-SKILL.md for the ship workflow. + +Run ONLY Step 0 (Detect base branch) and Step 1 (Pre-flight) from the ship workflow. +Since there is no remote, gh commands will fail — fall back to main. + +After completing Step 0 and Step 1, STOP. Do NOT proceed to Step 2 or beyond. +Do NOT push, create PRs, or modify VERSION/CHANGELOG. + +Write a summary of what you detected to ${dir}/ship-preflight.md including: +- The detected base branch name +- The current branch name +- The diff stat against the base branch`, + workingDirectory: dir, + maxTurns: 10, + timeout: 60_000, + testName: 'ship-base-branch', + runId, + }); + + logCost('/ship base-branch', result); + recordE2E('/ship base branch detection', 'Base branch detection', result); + expect(result.exitReason).toBe('success'); + + // Verify preflight output was written + const preflightPath = path.join(dir, 'ship-preflight.md'); + if (fs.existsSync(preflightPath)) { + const content = fs.readFileSync(preflightPath, 'utf-8'); + expect(content.length).toBeGreaterThan(20); + // Should mention the branch name + expect(content.toLowerCase()).toMatch(/main|base/); + } + + // Verify no destructive actions — no push, no PR creation + const destructiveTools = result.toolCalls.filter(tc => + tc.tool === 'Bash' && typeof tc.input === 'string' && + (tc.input.includes('git push') || tc.input.includes('gh pr create')) + ); + expect(destructiveTools).toHaveLength(0); + }, 90_000); + + test('/retro detects default branch for git queries', async () => { + const dir = path.join(baseBranchDir, 'retro-base'); + fs.mkdirSync(dir, { recursive: true }); + + // Create git repo with commit history + run('git', ['init'], dir); + run('git', ['config', 'user.email', 'dev@example.com'], dir); + run('git', ['config', 'user.name', 'Dev'], dir); + + fs.writeFileSync(path.join(dir, 'app.ts'), 'console.log("hello");\n'); + run('git', ['add', 'app.ts'], dir); + run('git', ['commit', '-m', 'feat: initial app', '--date', '2026-03-14T09:00:00'], dir); + + fs.writeFileSync(path.join(dir, 'auth.ts'), 'export function login() {}\n'); + run('git', ['add', 'auth.ts'], dir); + run('git', ['commit', '-m', 'feat: add auth', '--date', '2026-03-15T10:00:00'], dir); + + fs.writeFileSync(path.join(dir, 'test.ts'), 'test("it works", () => {});\n'); + run('git', ['add', 'test.ts'], dir); + run('git', ['commit', '-m', 'test: add tests', '--date', '2026-03-16T11:00:00'], dir); + + // Copy retro skill + fs.mkdirSync(path.join(dir, 'retro'), { recursive: true }); + fs.copyFileSync(path.join(ROOT, 'retro', 'SKILL.md'), path.join(dir, 'retro', 'SKILL.md')); + + const result = await runSkillTest({ + prompt: `Read retro/SKILL.md for instructions on how to run a retrospective. + +IMPORTANT: Follow the "Detect default branch" step first. Since there is no remote, gh will fail — fall back to main. +Then use the detected branch name for all git queries. + +Run /retro for the last 7 days of this git repo. Skip any AskUserQuestion calls — this is non-interactive. +This is a local-only repo so use the local branch (main) instead of origin/main for all git log commands. + +Write your retrospective to ${dir}/retro-output.md`, + workingDirectory: dir, + maxTurns: 25, + timeout: 240_000, + testName: 'retro-base-branch', + runId, + }); + + logCost('/retro base-branch', result); + recordE2E('/retro default branch detection', 'Base branch detection', result, { + passed: ['success', 'error_max_turns'].includes(result.exitReason), + }); + expect(['success', 'error_max_turns']).toContain(result.exitReason); + + // Verify retro output was produced + const retroPath = path.join(dir, 'retro-output.md'); + if (fs.existsSync(retroPath)) { + const content = fs.readFileSync(retroPath, 'utf-8'); + expect(content.length).toBeGreaterThan(100); + } + }, 300_000); +}); + // --- Deferred skill E2E tests (destructive or require interactive UI) --- describeE2E('Deferred skill E2E', () => { diff --git a/test/skill-validation.test.ts b/test/skill-validation.test.ts index 88e98935..2a947b15 100644 --- a/test/skill-validation.test.ts +++ b/test/skill-validation.test.ts @@ -388,6 +388,64 @@ describe('Greptile history format consistency', () => { }); }); +// --- Hardcoded branch name detection in templates --- + +describe('No hardcoded branch names in SKILL templates', () => { + const tmplFiles = [ + 'ship/SKILL.md.tmpl', + 'review/SKILL.md.tmpl', + 'qa/SKILL.md.tmpl', + 'plan-ceo-review/SKILL.md.tmpl', + 'retro/SKILL.md.tmpl', + ]; + + // Patterns that indicate hardcoded 'main' in git commands + const gitMainPatterns = [ + /\bgit\s+diff\s+(?:origin\/)?main\b/, + /\bgit\s+log\s+(?:origin\/)?main\b/, + /\bgit\s+fetch\s+origin\s+main\b/, + /\bgit\s+merge\s+origin\/main\b/, + /\borigin\/main\b/, + ]; + + // Lines that are allowed to mention 'main' (fallback logic, prose) + const allowlist = [ + /fall\s*back\s+to\s+`main`/i, + /fall\s*back\s+to\s+`?main`?/i, + /typically\s+`?main`?/i, + /If\s+on\s+`main`/i, // old pattern — should not exist + ]; + + for (const tmplFile of tmplFiles) { + test(`${tmplFile} has no hardcoded 'main' in git commands`, () => { + const filePath = path.join(ROOT, tmplFile); + if (!fs.existsSync(filePath)) return; + const lines = fs.readFileSync(filePath, 'utf-8').split('\n'); + const violations: string[] = []; + + for (let i = 0; i < lines.length; i++) { + const line = lines[i]; + const isAllowlisted = allowlist.some(p => p.test(line)); + if (isAllowlisted) continue; + + for (const pattern of gitMainPatterns) { + if (pattern.test(line)) { + violations.push(`Line ${i + 1}: ${line.trim()}`); + break; + } + } + } + + if (violations.length > 0) { + throw new Error( + `${tmplFile} has hardcoded 'main' in git commands:\n` + + violations.map(v => ` ${v}`).join('\n') + ); + } + }); + } +}); + // --- Part 7b: TODOS-format.md reference consistency --- describe('TODOS-format.md reference consistency', () => { From 78e519e3b763680ba483aa606d7e2cfbadb1952f Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Mon, 16 Mar 2026 11:28:58 -0500 Subject: [PATCH 02/22] feat: await support in browse js/eval + contributor mode v2 (#104) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat: support await in $B js and eval commands Auto-wrap await expressions in async IIFE context so $B js "await fetch(...)" works without SyntaxError. - hasAwait() strips comments before detection - js: expression wrapping (async()=>(expr))() - eval: smart wrapping — single-line=expression, multi-line=block - 6 new unit tests covering async, false-positive, and return semantics * feat: redesign contributor mode — periodic reflection with 0-10 rating Replace passive "report when things break" with active reflection: - Rate gstack experience 0-10 at workflow step boundaries - Historical calibration example (await bug) anchors the reporting bar - "What would make this a 10" field focuses on actionable improvements - Removed category lists in favor of judgment-based assessment * test: add deterministic contributor mode preamble validation 40 new skill-validation tests (4 checks × 10 skills) verify: - 0-10 rating scale present - Calibration example present - "What would make this a 10" field present - Periodic reflection (not per-command) Update existing E2E contributor eval for new report format. * chore: bump version and changelog (v0.4.2) Co-Authored-By: Claude Opus 4.6 * fix: improve contributor mode + qa-quick E2E reliability Contributor mode: - Add "do not truncate" directive to template — agent was stopping after "My rating" without completing Steps/Raw output/What would make this a 10 sections - Restore assertions for Steps to reproduce and Date footer QA quick: - Make test server URL prominent: top of prompt, explicit "already running" and "do NOT discover ports" instructions - Bump session timeout 180s→240s and test timeout 240s→300s - Set B= at top of prompt (was buried in prose) Co-Authored-By: Claude Opus 4.6 * fix: use flexible assertions for contributor mode E2E Agent writes thorough reports with creative section names ("Repro Steps" vs "Steps to reproduce"). Match intent not formatting: - /repro|steps to reproduce/ for reproduction steps - /date.*2026/ for date footer presence Co-Authored-By: Claude Opus 4.6 * docs: add E2E eval failure blame protocol "Not related to our changes" is an extraordinary claim that requires extraordinary proof. When evals fail during /ship: 1. Run the same eval on main — prove it fails there too 2. If it passes on main, it IS your change — trace the blame 3. If you can't verify, say "unverified" not "pre-existing" Added to CLAUDE.md and as a comment in skill-e2e.test.ts. Co-Authored-By: Claude Opus 4.6 * docs: update CONTRIBUTING.md and BROWSER.md for v0.4.2 CONTRIBUTING.md: update contributor mode description — now describes periodic 0-10 reflection loop instead of passive friction detection. BROWSER.md: add js/eval async documentation — await expressions are auto-wrapped in async context, single-line eval returns values directly. Co-Authored-By: Claude Opus 4.6 * fix: restore v0.4.2 changelog entries lost during cherry-pick conflict The base branch detection entries from main were dropped when resolving the CHANGELOG conflict — should have merged both sets, not replaced. Co-Authored-By: Claude Opus 4.6 --------- Co-authored-by: Claude Opus 4.6 --- BROWSER.md | 12 ++++++++ CHANGELOG.md | 6 ++++ CLAUDE.md | 15 ++++++++++ CONTRIBUTING.md | 10 ++++--- SKILL.md | 24 +++++++++------ browse/SKILL.md | 24 +++++++++------ browse/src/read-commands.ts | 16 +++++++++- browse/test/commands.test.ts | 54 ++++++++++++++++++++++++++++++++++ plan-ceo-review/SKILL.md | 24 +++++++++------ plan-eng-review/SKILL.md | 24 +++++++++------ qa-only/SKILL.md | 24 +++++++++------ qa/SKILL.md | 24 +++++++++------ retro/SKILL.md | 24 +++++++++------ review/SKILL.md | 24 +++++++++------ scripts/gen-skill-docs.ts | 24 +++++++++------ setup-browser-cookies/SKILL.md | 24 +++++++++------ ship/SKILL.md | 24 +++++++++------ test/skill-e2e.test.ts | 21 +++++++++++-- test/skill-validation.test.ts | 38 ++++++++++++++++++++++++ 19 files changed, 329 insertions(+), 107 deletions(-) diff --git a/BROWSER.md b/BROWSER.md index 2d828ebe..df4a6d1d 100644 --- a/BROWSER.md +++ b/BROWSER.md @@ -127,6 +127,18 @@ The `console`, `network`, and `dialog` commands read from the in-memory buffers, Dialogs (alert, confirm, prompt) are auto-accepted by default to prevent browser lockup. The `dialog-accept` and `dialog-dismiss` commands control this behavior. For prompts, `dialog-accept ` provides the response text. All dialogs are logged to the dialog buffer with type, message, and action taken. +### JavaScript execution (`js` and `eval`) + +`js` runs a single expression, `eval` runs a JS file. Both support `await` — expressions containing `await` are automatically wrapped in an async context: + +```bash +$B js "await fetch('/api/data').then(r => r.json())" # works +$B js "document.title" # also works (no wrapping needed) +$B eval my-script.js # file with await works too +``` + +For `eval` files, single-line files return the expression value directly. Multi-line files need explicit `return` when using `await`. Comments containing "await" don't trigger wrapping. + ### Multi-workspace support Each workspace gets its own isolated browser instance with its own Chromium process, tabs, cookies, and logs. State is stored in `.gstack/` inside the project root (detected via `git rev-parse --show-toplevel`). diff --git a/CHANGELOG.md b/CHANGELOG.md index 3d67a917..9b4e93f9 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,8 @@ ## 0.4.2 — 2026-03-16 +- **`$B js "await fetch(...)"` now just works.** Any `await` expression in `$B js` or `$B eval` is automatically wrapped in an async context. No more `SyntaxError: await is only valid in async functions`. Single-line eval files return values directly; multi-line files use explicit `return`. +- **Contributor mode now reflects, not just reacts.** Instead of only filing reports when something breaks, contributor mode now prompts periodic reflection: "Rate your gstack experience 0-10. Not a 10? Think about why." Catches quality-of-life issues and friction that passive detection misses. Reports now include a 0-10 rating and "What would make this a 10" to focus on actionable improvements. - **Skills now respect your branch target.** `/ship`, `/review`, `/qa`, and `/plan-ceo-review` detect which branch your PR actually targets instead of assuming `main`. Stacked branches, Conductor workspaces targeting feature branches, and repos using `master` all just work now. - **`/retro` works on any default branch.** Repos using `master`, `develop`, or other default branch names are detected automatically — no more empty retros because the branch name was wrong. - **New `{{BASE_BRANCH_DETECT}}` placeholder** for skill authors — drop it into any template and get 3-step branch detection (PR base → repo default → fallback) for free. @@ -9,6 +11,10 @@ ### For contributors +- Added `hasAwait()` helper with comment-stripping to avoid false positives on `// await` in eval files. +- Smart eval wrapping: single-line → expression `(...)`, multi-line → block `{...}` with explicit `return`. +- 6 new async wrapping unit tests, 40 new contributor mode preamble validation tests. +- Calibration example framed as historical ("used to fail") to avoid implying a live bug post-fix. - Added "Writing SKILL templates" section to CLAUDE.md — rules for natural language over bash-isms, dynamic branch detection, self-contained code blocks. - Hardcoded-main regression test scans all `.tmpl` files for git commands with hardcoded `main`. - QA template cleaned up: removed `REPORT_DIR` shell variable, simplified port detection to prose. diff --git a/CLAUDE.md b/CLAUDE.md index bc21f606..6f12deae 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -118,6 +118,21 @@ CHANGELOG.md is **for users**, not contributors. Write it like product release n - No jargon: say "every question now tells you which project and branch you're in" not "AskUserQuestion format standardized across skill templates via preamble resolver." +## E2E eval failure blame protocol + +When an E2E eval fails during `/ship` or any other workflow, **never claim "not +related to our changes" without proving it.** These systems have invisible couplings — +a preamble text change affects agent behavior, a new helper changes timing, a +regenerated SKILL.md shifts prompt context. + +**Required before attributing a failure to "pre-existing":** +1. Run the same eval on main (or base branch) and show it fails there too +2. If it passes on main but fails on the branch — it IS your change. Trace the blame. +3. If you can't run on main, say "unverified — may or may not be related" and flag it + as a risk in the PR body + +"Pre-existing" without receipts is a lazy claim. Prove it or don't say it. + ## Deploying to the active skill The active skill lives at `~/.claude/skills/gstack/`. After making changes: diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index b06b837e..4af2e889 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -22,9 +22,11 @@ bin/dev-teardown # deactivate — back to your global install ## Contributor mode -Contributor mode is for people who want to fix gstack when it annoys them. Enable it -and Claude Code will automatically log issues to `~/.gstack/contributor-logs/` as you -work — what you were doing, what went wrong, repro steps, raw output. +Contributor mode turns gstack into a self-improving tool. Enable it and Claude Code +will periodically reflect on its gstack experience — rating it 0-10 at the end of +each major workflow step. When something isn't a 10, it thinks about why and files +a report to `~/.gstack/contributor-logs/` with what happened, repro steps, and what +would make it better. ```bash ~/.claude/skills/gstack/bin/gstack-config set gstack_contributor true @@ -36,7 +38,7 @@ the issue, fix it, and open a PR. ### The contributor workflow -1. **Hit friction while using gstack** — contributor mode logs it automatically +1. **Use gstack normally** — contributor mode reflects and logs issues automatically 2. **Check your logs:** `ls ~/.gstack/contributor-logs/` 3. **Fork and clone gstack** (if you haven't already) 4. **Symlink your fork into the project where you hit the bug:** diff --git a/SKILL.md b/SKILL.md index b362e824..2239a91b 100644 --- a/SKILL.md +++ b/SKILL.md @@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" # gstack browse: QA Testing & Dogfooding diff --git a/browse/SKILL.md b/browse/SKILL.md index 28e976df..c0d7a4eb 100644 --- a/browse/SKILL.md +++ b/browse/SKILL.md @@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" # browse: QA Testing & Dogfooding diff --git a/browse/src/read-commands.ts b/browse/src/read-commands.ts index 53efec8a..a7d76352 100644 --- a/browse/src/read-commands.ts +++ b/browse/src/read-commands.ts @@ -11,6 +11,12 @@ import type { Page } from 'playwright'; import * as fs from 'fs'; import * as path from 'path'; +/** Detect await keyword, ignoring comments. Accepted risk: await in string literals triggers wrapping (harmless). */ +function hasAwait(code: string): boolean { + const stripped = code.replace(/\/\/.*$/gm, '').replace(/\/\*[\s\S]*?\*\//g, ''); + return /\bawait\b/.test(stripped); +} + // Security: Path validation to prevent path traversal attacks const SAFE_DIRECTORIES = ['/tmp', process.cwd()]; @@ -118,7 +124,8 @@ export async function handleReadCommand( case 'js': { const expr = args[0]; if (!expr) throw new Error('Usage: browse js '); - const result = await page.evaluate(expr); + const wrapped = hasAwait(expr) ? `(async()=>(${expr}))()` : expr; + const result = await page.evaluate(wrapped); return typeof result === 'object' ? JSON.stringify(result, null, 2) : String(result ?? ''); } @@ -128,6 +135,13 @@ export async function handleReadCommand( validateReadPath(filePath); if (!fs.existsSync(filePath)) throw new Error(`File not found: ${filePath}`); const code = fs.readFileSync(filePath, 'utf-8'); + if (hasAwait(code)) { + const trimmed = code.trim(); + const isSingleExpr = trimmed.split('\n').length === 1; + const wrapped = isSingleExpr ? `(async()=>(${trimmed}))()` : `(async()=>{\n${code}\n})()`; + const result = await page.evaluate(wrapped); + return typeof result === 'object' ? JSON.stringify(result, null, 2) : String(result ?? ''); + } const result = await page.evaluate(code); return typeof result === 'object' ? JSON.stringify(result, null, 2) : String(result ?? ''); } diff --git a/browse/test/commands.test.ts b/browse/test/commands.test.ts index a3e201d9..d8aaeab6 100644 --- a/browse/test/commands.test.ts +++ b/browse/test/commands.test.ts @@ -144,6 +144,60 @@ describe('Inspection', () => { expect(obj.b).toBe(2); }); + test('js supports await expressions', async () => { + const result = await handleReadCommand('js', ['await Promise.resolve(42)'], bm); + expect(result).toBe('42'); + }); + + test('js does not false-positive on await substring', async () => { + const result = await handleReadCommand('js', ['(() => { const awaitable = 5; return awaitable })()'], bm); + expect(result).toBe('5'); + }); + + test('eval supports await in single-line file', async () => { + const tmp = '/tmp/eval-await-test.js'; + fs.writeFileSync(tmp, 'await Promise.resolve("hello from eval")'); + try { + const result = await handleReadCommand('eval', [tmp], bm); + expect(result).toBe('hello from eval'); + } finally { + fs.unlinkSync(tmp); + } + }); + + test('eval does not wrap when await is only in a comment', async () => { + const tmp = '/tmp/eval-comment-test.js'; + fs.writeFileSync(tmp, '// no need to await this\ndocument.title'); + try { + const result = await handleReadCommand('eval', [tmp], bm); + expect(result).toBe('Test Page - Basic'); + } finally { + fs.unlinkSync(tmp); + } + }); + + test('eval multi-line with await and explicit return', async () => { + const tmp = '/tmp/eval-multiline-await.js'; + fs.writeFileSync(tmp, 'const data = await Promise.resolve("multi");\nreturn data;'); + try { + const result = await handleReadCommand('eval', [tmp], bm); + expect(result).toBe('multi'); + } finally { + fs.unlinkSync(tmp); + } + }); + + test('eval multi-line with await but no return gives empty string', async () => { + const tmp = '/tmp/eval-multiline-no-return.js'; + fs.writeFileSync(tmp, 'const data = await Promise.resolve("lost");\ndata;'); + try { + const result = await handleReadCommand('eval', [tmp], bm); + expect(result).toBe(''); + } finally { + fs.unlinkSync(tmp); + } + }); + test('css returns computed property', async () => { const result = await handleReadCommand('css', ['h1', 'color'], bm); // Navy color diff --git a/plan-ceo-review/SKILL.md b/plan-ceo-review/SKILL.md index 77ca6438..07830998 100644 --- a/plan-ceo-review/SKILL.md +++ b/plan-ceo-review/SKILL.md @@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Step 0: Detect base branch diff --git a/plan-eng-review/SKILL.md b/plan-eng-review/SKILL.md index 819ef072..ad2baca6 100644 --- a/plan-eng-review/SKILL.md +++ b/plan-eng-review/SKILL.md @@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" # Plan Review Mode diff --git a/qa-only/SKILL.md b/qa-only/SKILL.md index 438b7826..27d939be 100644 --- a/qa-only/SKILL.md +++ b/qa-only/SKILL.md @@ -43,12 +43,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -57,20 +60,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" # /qa-only: Report-Only QA Testing diff --git a/qa/SKILL.md b/qa/SKILL.md index 5ea4643a..938bf10b 100644 --- a/qa/SKILL.md +++ b/qa/SKILL.md @@ -48,12 +48,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -62,20 +65,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Step 0: Detect base branch diff --git a/retro/SKILL.md b/retro/SKILL.md index ca44f49e..39b7ee13 100644 --- a/retro/SKILL.md +++ b/retro/SKILL.md @@ -43,12 +43,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -57,20 +60,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Detect default branch diff --git a/review/SKILL.md b/review/SKILL.md index 949a0c65..b94f8a30 100644 --- a/review/SKILL.md +++ b/review/SKILL.md @@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Step 0: Detect base branch diff --git a/scripts/gen-skill-docs.ts b/scripts/gen-skill-docs.ts index 9c81e968..f3d93dbb 100644 --- a/scripts/gen-skill-docs.ts +++ b/scripts/gen-skill-docs.ts @@ -123,12 +123,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If \`_CONTRIB\` is \`true\`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If \`_CONTRIB\` is \`true\`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write \`~/.gstack/contributor-logs/{slug}.md\` with this structure: +**Calibration — this is the bar:** For example, \`$B js "await fetch(...)"\` used to fail with \`SyntaxError: await is only valid in async functions\` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write \`~/.gstack/contributor-logs/{slug}.md\` with **all sections below** (do not truncate — include every section through the Date/Version footer): \`\`\` # {Title} @@ -137,20 +140,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +\`\`\` +{paste the actual error or unexpected output here} +\`\`\` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} \`\`\` -Then run: \`mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md\` - -Slug: lowercase, hyphens, max 60 chars (e.g. \`browse-snapshot-ref-gap\`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"`; +Slug: lowercase, hyphens, max 60 chars (e.g. \`browse-js-no-await\`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"`; } function generateBrowseSetup(): string { diff --git a/setup-browser-cookies/SKILL.md b/setup-browser-cookies/SKILL.md index 06230249..d522b270 100644 --- a/setup-browser-cookies/SKILL.md +++ b/setup-browser-cookies/SKILL.md @@ -41,12 +41,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -55,20 +58,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" # Setup Browser Cookies diff --git a/ship/SKILL.md b/ship/SKILL.md index 47a9da11..7791f4bc 100644 --- a/ship/SKILL.md +++ b/ship/SKILL.md @@ -43,12 +43,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli ## Contributor Mode -If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. -**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. -**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site. +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! -**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): ``` # {Title} @@ -57,20 +60,23 @@ Hey gstack team — ran into this while using /{skill-name}: **What I was trying to do:** {what the user/agent was attempting} **What happened instead:** {what actually happened} -**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} ## Steps to reproduce 1. {step} ## Raw output -(wrap any error messages or unexpected output in a markdown code block) +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} ``` -Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` - -Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" ## Step 0: Detect base branch diff --git a/test/skill-e2e.test.ts b/test/skill-e2e.test.ts index 4978ce53..aa50a976 100644 --- a/test/skill-e2e.test.ts +++ b/test/skill-e2e.test.ts @@ -13,6 +13,11 @@ import * as os from 'os'; const ROOT = path.resolve(import.meta.dir, '..'); // Skip unless EVALS=1. Session runner strips CLAUDE* env vars to avoid nested session issues. +// +// BLAME PROTOCOL: When an eval fails, do NOT claim "pre-existing" or "not related +// to our changes" without proof. Run the same eval on main to verify. These tests +// have invisible couplings — preamble text, SKILL.md content, and timing all affect +// agent behavior. See CLAUDE.md "E2E eval failure blame protocol" for details. const evalsEnabled = !!process.env.EVALS; const describeE2E = evalsEnabled ? describe : describe.skip; @@ -322,10 +327,16 @@ File a contributor report about this issue. Then tell me what you filed.`, const logFiles = fs.readdirSync(logsDir).filter(f => f.endsWith('.md')); expect(logFiles.length).toBeGreaterThan(0); + // Verify new reflection-based format const logContent = fs.readFileSync(path.join(logsDir, logFiles[0]), 'utf-8'); expect(logContent).toContain('Hey gstack team'); expect(logContent).toContain('What I was trying to do'); expect(logContent).toContain('What happened instead'); + expect(logContent).toMatch(/rating/i); + // Verify report has repro steps (agent may use "Steps to reproduce", "Repro Steps", etc.) + expect(logContent).toMatch(/repro|steps to reproduce|how to reproduce/i); + // Verify report has date/version footer (agent may format differently) + expect(logContent).toMatch(/date.*2026|2026.*date/i); // Clean up try { fs.rmSync(contribDir, { recursive: true, force: true }); } catch {} @@ -424,16 +435,20 @@ describeE2E('QA skill E2E', () => { test('/qa quick completes without browse errors', async () => { const result = await runSkillTest({ - prompt: `You have a browse binary at ${browseBin}. Assign it to B variable like: B="${browseBin}" + prompt: `B="${browseBin}" + +The test server is already running at: ${testServer.url} +Target page: ${testServer.url}/basic.html Read the file qa/SKILL.md for the QA workflow instructions. Run a Quick-depth QA test on ${testServer.url}/basic.html Do NOT use AskUserQuestion — run Quick tier directly. +Do NOT try to start a server or discover ports — the URL above is ready. Write your report to ${qaDir}/qa-reports/qa-report.md`, workingDirectory: qaDir, maxTurns: 35, - timeout: 180_000, + timeout: 240_000, testName: 'qa-quick', runId, }); @@ -448,7 +463,7 @@ Write your report to ${qaDir}/qa-reports/qa-report.md`, } // Accept error_max_turns — the agent doing thorough QA work is not a failure expect(['success', 'error_max_turns']).toContain(result.exitReason); - }, 240_000); + }, 300_000); }); // --- B5: Review skill E2E --- diff --git a/test/skill-validation.test.ts b/test/skill-validation.test.ts index 2a947b15..3ff5c356 100644 --- a/test/skill-validation.test.ts +++ b/test/skill-validation.test.ts @@ -496,6 +496,44 @@ describe('v0.4.1 preamble features', () => { } }); +// --- Contributor mode preamble structure validation --- + +describe('Contributor mode preamble structure', () => { + const skillsWithPreamble = [ + 'SKILL.md', 'browse/SKILL.md', 'qa/SKILL.md', + 'qa-only/SKILL.md', + 'setup-browser-cookies/SKILL.md', + 'ship/SKILL.md', 'review/SKILL.md', + 'plan-ceo-review/SKILL.md', 'plan-eng-review/SKILL.md', + 'retro/SKILL.md', + ]; + + for (const skill of skillsWithPreamble) { + test(`${skill} has 0-10 rating in contributor mode`, () => { + const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8'); + expect(content).toContain('0 to 10'); + expect(content).toContain('My rating'); + }); + + test(`${skill} has calibration example`, () => { + const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8'); + expect(content).toContain('Calibration'); + expect(content).toContain('the bar'); + }); + + test(`${skill} has "what would make this a 10" field`, () => { + const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8'); + expect(content).toContain('What would make this a 10'); + }); + + test(`${skill} uses periodic reflection (not per-command)`, () => { + const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8'); + expect(content).toContain('workflow step'); + expect(content).not.toContain('After you use gstack-provided CLIs'); + }); + } +}); + describe('Enum & Value Completeness in review checklist', () => { const checklist = fs.readFileSync(path.join(ROOT, 'review', 'checklist.md'), 'utf-8'); From 276d0cc6cb944f295377675f7dff57b916d2c29f Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Mon, 16 Mar 2026 12:27:47 -0500 Subject: [PATCH 03/22] feat: always-on ELI16 + branch detection (v0.4.3) (#108) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat: always-on ELI16 + branch detection in preamble - Add _BRANCH detection to preamble bash block (git branch --show-current) - Merge ELI16 rules into default AskUserQuestion format (always-on) - Remove _SESSIONS >= 3 conditional — better questions always - Add simplification rules: plain English, no jargon, no raw function names - Update tests for branch detection and simplification regression guard * chore: bump version and changelog (v0.4.3) Co-Authored-By: Claude Opus 4.6 --------- Co-authored-by: Claude Opus 4.6 --- CHANGELOG.md | 11 +++++++++++ SKILL.md | 12 +++++++----- VERSION | 2 +- browse/SKILL.md | 12 +++++++----- plan-ceo-review/SKILL.md | 12 +++++++----- plan-eng-review/SKILL.md | 12 +++++++----- qa-only/SKILL.md | 12 +++++++----- qa/SKILL.md | 12 +++++++----- retro/SKILL.md | 12 +++++++----- review/SKILL.md | 12 +++++++----- scripts/gen-skill-docs.ts | 12 +++++++----- setup-browser-cookies/SKILL.md | 12 +++++++----- ship/SKILL.md | 12 +++++++----- test/gen-skill-docs.test.ts | 13 ++++++++++++- test/skill-validation.test.ts | 2 +- 15 files changed, 102 insertions(+), 58 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 9b4e93f9..99b7086c 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,16 @@ # Changelog +## 0.4.3 — 2026-03-16 + +- **Every question is now crystal clear, every time.** You used to need 3+ sessions running before gstack would give you full context and plain English explanations. Now every question — even in a single session — tells you the project, branch, and what's happening, explained simply enough to understand mid-context-switch. No more "sorry, explain it to me more simply." +- **Branch name is always correct.** gstack now detects your current branch at runtime instead of relying on the snapshot from when the conversation started. Switch branches mid-session? gstack keeps up. + +### For contributors + +- Merged ELI16 rules into base AskUserQuestion format — one format instead of two, no `_SESSIONS >= 3` conditional. +- Added `_BRANCH` detection to preamble bash block (`git branch --show-current` with fallback). +- Added regression guard tests for branch detection and simplification rules. + ## 0.4.2 — 2026-03-16 - **`$B js "await fetch(...)"` now just works.** Any `await` expression in `$B js` or `$B eval` is automatically wrapped in an async context. No more `SyntaxError: await is only valid in async functions`. Single-line eval files return values directly; multi-line files use explicit `return`. diff --git a/SKILL.md b/SKILL.md index 2239a91b..76007eee 100644 --- a/SKILL.md +++ b/SKILL.md @@ -26,6 +26,8 @@ touch ~/.gstack/sessions/"$PPID" _SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ') find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true _CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true) +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +echo "BRANCH: $_BRANCH" ``` If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). If `JUST_UPGRADED `: tell user "Running gstack v{to} (just updated!)" and continue. @@ -33,12 +35,12 @@ If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/g ## AskUserQuestion Format **ALWAYS follow this structure for every AskUserQuestion call:** -1. Context: project name, current branch, what we're working on (1-2 sentences) -2. The specific question or decision point -3. `RECOMMENDATION: Choose [X] because [one-line reason]` -4. Lettered options: `A) ... B) ... C) ...` +1. **Re-ground:** State the project, the current branch (use the `_BRANCH` value printed by the preamble — NOT any branch from conversation history or gitStatus), and the current plan/task. (1-2 sentences) +2. **Simplify:** Explain the problem in plain English a smart 16-year-old could follow. No raw function names, no internal jargon, no implementation details. Use concrete examples and analogies. Say what it DOES, not what it's called. +3. **Recommend:** `RECOMMENDATION: Choose [X] because [one-line reason]` +4. **Options:** Lettered options: `A) ... B) ... C) ...` -If `_SESSIONS` is 3 or more: the user is juggling multiple gstack sessions and context-switching heavily. **ELI16 mode** — they may not remember what this conversation is about. Every AskUserQuestion MUST re-ground them: state the project, the branch, the current plan/task, then the specific problem, THEN the recommendation and options. Be extra clear and self-contained — assume they haven't looked at this window in 20 minutes. +Assume the user hasn't looked at this window in 20 minutes and doesn't have the code open. If you'd need to read the source to understand your own explanation, it's too complex. Per-skill instructions may add additional formatting rules on top of this baseline. diff --git a/VERSION b/VERSION index 2b7c5ae0..17b2ccd9 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -0.4.2 +0.4.3 diff --git a/browse/SKILL.md b/browse/SKILL.md index c0d7a4eb..dd473dc0 100644 --- a/browse/SKILL.md +++ b/browse/SKILL.md @@ -26,6 +26,8 @@ touch ~/.gstack/sessions/"$PPID" _SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ') find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true _CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true) +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +echo "BRANCH: $_BRANCH" ``` If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). If `JUST_UPGRADED `: tell user "Running gstack v{to} (just updated!)" and continue. @@ -33,12 +35,12 @@ If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/g ## AskUserQuestion Format **ALWAYS follow this structure for every AskUserQuestion call:** -1. Context: project name, current branch, what we're working on (1-2 sentences) -2. The specific question or decision point -3. `RECOMMENDATION: Choose [X] because [one-line reason]` -4. Lettered options: `A) ... B) ... C) ...` +1. **Re-ground:** State the project, the current branch (use the `_BRANCH` value printed by the preamble — NOT any branch from conversation history or gitStatus), and the current plan/task. (1-2 sentences) +2. **Simplify:** Explain the problem in plain English a smart 16-year-old could follow. No raw function names, no internal jargon, no implementation details. Use concrete examples and analogies. Say what it DOES, not what it's called. +3. **Recommend:** `RECOMMENDATION: Choose [X] because [one-line reason]` +4. **Options:** Lettered options: `A) ... B) ... C) ...` -If `_SESSIONS` is 3 or more: the user is juggling multiple gstack sessions and context-switching heavily. **ELI16 mode** — they may not remember what this conversation is about. Every AskUserQuestion MUST re-ground them: state the project, the branch, the current plan/task, then the specific problem, THEN the recommendation and options. Be extra clear and self-contained — assume they haven't looked at this window in 20 minutes. +Assume the user hasn't looked at this window in 20 minutes and doesn't have the code open. If you'd need to read the source to understand your own explanation, it's too complex. Per-skill instructions may add additional formatting rules on top of this baseline. diff --git a/plan-ceo-review/SKILL.md b/plan-ceo-review/SKILL.md index 07830998..b05db5e9 100644 --- a/plan-ceo-review/SKILL.md +++ b/plan-ceo-review/SKILL.md @@ -26,6 +26,8 @@ touch ~/.gstack/sessions/"$PPID" _SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ') find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true _CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true) +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +echo "BRANCH: $_BRANCH" ``` If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). If `JUST_UPGRADED `: tell user "Running gstack v{to} (just updated!)" and continue. @@ -33,12 +35,12 @@ If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/g ## AskUserQuestion Format **ALWAYS follow this structure for every AskUserQuestion call:** -1. Context: project name, current branch, what we're working on (1-2 sentences) -2. The specific question or decision point -3. `RECOMMENDATION: Choose [X] because [one-line reason]` -4. Lettered options: `A) ... B) ... C) ...` +1. **Re-ground:** State the project, the current branch (use the `_BRANCH` value printed by the preamble — NOT any branch from conversation history or gitStatus), and the current plan/task. (1-2 sentences) +2. **Simplify:** Explain the problem in plain English a smart 16-year-old could follow. No raw function names, no internal jargon, no implementation details. Use concrete examples and analogies. Say what it DOES, not what it's called. +3. **Recommend:** `RECOMMENDATION: Choose [X] because [one-line reason]` +4. **Options:** Lettered options: `A) ... B) ... C) ...` -If `_SESSIONS` is 3 or more: the user is juggling multiple gstack sessions and context-switching heavily. **ELI16 mode** — they may not remember what this conversation is about. Every AskUserQuestion MUST re-ground them: state the project, the branch, the current plan/task, then the specific problem, THEN the recommendation and options. Be extra clear and self-contained — assume they haven't looked at this window in 20 minutes. +Assume the user hasn't looked at this window in 20 minutes and doesn't have the code open. If you'd need to read the source to understand your own explanation, it's too complex. Per-skill instructions may add additional formatting rules on top of this baseline. diff --git a/plan-eng-review/SKILL.md b/plan-eng-review/SKILL.md index ad2baca6..4cd8f481 100644 --- a/plan-eng-review/SKILL.md +++ b/plan-eng-review/SKILL.md @@ -26,6 +26,8 @@ touch ~/.gstack/sessions/"$PPID" _SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ') find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true _CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true) +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +echo "BRANCH: $_BRANCH" ``` If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). If `JUST_UPGRADED `: tell user "Running gstack v{to} (just updated!)" and continue. @@ -33,12 +35,12 @@ If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/g ## AskUserQuestion Format **ALWAYS follow this structure for every AskUserQuestion call:** -1. Context: project name, current branch, what we're working on (1-2 sentences) -2. The specific question or decision point -3. `RECOMMENDATION: Choose [X] because [one-line reason]` -4. Lettered options: `A) ... B) ... C) ...` +1. **Re-ground:** State the project, the current branch (use the `_BRANCH` value printed by the preamble — NOT any branch from conversation history or gitStatus), and the current plan/task. (1-2 sentences) +2. **Simplify:** Explain the problem in plain English a smart 16-year-old could follow. No raw function names, no internal jargon, no implementation details. Use concrete examples and analogies. Say what it DOES, not what it's called. +3. **Recommend:** `RECOMMENDATION: Choose [X] because [one-line reason]` +4. **Options:** Lettered options: `A) ... B) ... C) ...` -If `_SESSIONS` is 3 or more: the user is juggling multiple gstack sessions and context-switching heavily. **ELI16 mode** — they may not remember what this conversation is about. Every AskUserQuestion MUST re-ground them: state the project, the branch, the current plan/task, then the specific problem, THEN the recommendation and options. Be extra clear and self-contained — assume they haven't looked at this window in 20 minutes. +Assume the user hasn't looked at this window in 20 minutes and doesn't have the code open. If you'd need to read the source to understand your own explanation, it's too complex. Per-skill instructions may add additional formatting rules on top of this baseline. diff --git a/qa-only/SKILL.md b/qa-only/SKILL.md index 27d939be..5e74f03e 100644 --- a/qa-only/SKILL.md +++ b/qa-only/SKILL.md @@ -25,6 +25,8 @@ touch ~/.gstack/sessions/"$PPID" _SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ') find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true _CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true) +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +echo "BRANCH: $_BRANCH" ``` If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). If `JUST_UPGRADED `: tell user "Running gstack v{to} (just updated!)" and continue. @@ -32,12 +34,12 @@ If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/g ## AskUserQuestion Format **ALWAYS follow this structure for every AskUserQuestion call:** -1. Context: project name, current branch, what we're working on (1-2 sentences) -2. The specific question or decision point -3. `RECOMMENDATION: Choose [X] because [one-line reason]` -4. Lettered options: `A) ... B) ... C) ...` +1. **Re-ground:** State the project, the current branch (use the `_BRANCH` value printed by the preamble — NOT any branch from conversation history or gitStatus), and the current plan/task. (1-2 sentences) +2. **Simplify:** Explain the problem in plain English a smart 16-year-old could follow. No raw function names, no internal jargon, no implementation details. Use concrete examples and analogies. Say what it DOES, not what it's called. +3. **Recommend:** `RECOMMENDATION: Choose [X] because [one-line reason]` +4. **Options:** Lettered options: `A) ... B) ... C) ...` -If `_SESSIONS` is 3 or more: the user is juggling multiple gstack sessions and context-switching heavily. **ELI16 mode** — they may not remember what this conversation is about. Every AskUserQuestion MUST re-ground them: state the project, the branch, the current plan/task, then the specific problem, THEN the recommendation and options. Be extra clear and self-contained — assume they haven't looked at this window in 20 minutes. +Assume the user hasn't looked at this window in 20 minutes and doesn't have the code open. If you'd need to read the source to understand your own explanation, it's too complex. Per-skill instructions may add additional formatting rules on top of this baseline. diff --git a/qa/SKILL.md b/qa/SKILL.md index 938bf10b..6b078a9d 100644 --- a/qa/SKILL.md +++ b/qa/SKILL.md @@ -30,6 +30,8 @@ touch ~/.gstack/sessions/"$PPID" _SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ') find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true _CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true) +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +echo "BRANCH: $_BRANCH" ``` If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). If `JUST_UPGRADED `: tell user "Running gstack v{to} (just updated!)" and continue. @@ -37,12 +39,12 @@ If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/g ## AskUserQuestion Format **ALWAYS follow this structure for every AskUserQuestion call:** -1. Context: project name, current branch, what we're working on (1-2 sentences) -2. The specific question or decision point -3. `RECOMMENDATION: Choose [X] because [one-line reason]` -4. Lettered options: `A) ... B) ... C) ...` +1. **Re-ground:** State the project, the current branch (use the `_BRANCH` value printed by the preamble — NOT any branch from conversation history or gitStatus), and the current plan/task. (1-2 sentences) +2. **Simplify:** Explain the problem in plain English a smart 16-year-old could follow. No raw function names, no internal jargon, no implementation details. Use concrete examples and analogies. Say what it DOES, not what it's called. +3. **Recommend:** `RECOMMENDATION: Choose [X] because [one-line reason]` +4. **Options:** Lettered options: `A) ... B) ... C) ...` -If `_SESSIONS` is 3 or more: the user is juggling multiple gstack sessions and context-switching heavily. **ELI16 mode** — they may not remember what this conversation is about. Every AskUserQuestion MUST re-ground them: state the project, the branch, the current plan/task, then the specific problem, THEN the recommendation and options. Be extra clear and self-contained — assume they haven't looked at this window in 20 minutes. +Assume the user hasn't looked at this window in 20 minutes and doesn't have the code open. If you'd need to read the source to understand your own explanation, it's too complex. Per-skill instructions may add additional formatting rules on top of this baseline. diff --git a/retro/SKILL.md b/retro/SKILL.md index 39b7ee13..c7781525 100644 --- a/retro/SKILL.md +++ b/retro/SKILL.md @@ -25,6 +25,8 @@ touch ~/.gstack/sessions/"$PPID" _SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ') find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true _CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true) +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +echo "BRANCH: $_BRANCH" ``` If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). If `JUST_UPGRADED `: tell user "Running gstack v{to} (just updated!)" and continue. @@ -32,12 +34,12 @@ If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/g ## AskUserQuestion Format **ALWAYS follow this structure for every AskUserQuestion call:** -1. Context: project name, current branch, what we're working on (1-2 sentences) -2. The specific question or decision point -3. `RECOMMENDATION: Choose [X] because [one-line reason]` -4. Lettered options: `A) ... B) ... C) ...` +1. **Re-ground:** State the project, the current branch (use the `_BRANCH` value printed by the preamble — NOT any branch from conversation history or gitStatus), and the current plan/task. (1-2 sentences) +2. **Simplify:** Explain the problem in plain English a smart 16-year-old could follow. No raw function names, no internal jargon, no implementation details. Use concrete examples and analogies. Say what it DOES, not what it's called. +3. **Recommend:** `RECOMMENDATION: Choose [X] because [one-line reason]` +4. **Options:** Lettered options: `A) ... B) ... C) ...` -If `_SESSIONS` is 3 or more: the user is juggling multiple gstack sessions and context-switching heavily. **ELI16 mode** — they may not remember what this conversation is about. Every AskUserQuestion MUST re-ground them: state the project, the branch, the current plan/task, then the specific problem, THEN the recommendation and options. Be extra clear and self-contained — assume they haven't looked at this window in 20 minutes. +Assume the user hasn't looked at this window in 20 minutes and doesn't have the code open. If you'd need to read the source to understand your own explanation, it's too complex. Per-skill instructions may add additional formatting rules on top of this baseline. diff --git a/review/SKILL.md b/review/SKILL.md index b94f8a30..285a4922 100644 --- a/review/SKILL.md +++ b/review/SKILL.md @@ -26,6 +26,8 @@ touch ~/.gstack/sessions/"$PPID" _SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ') find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true _CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true) +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +echo "BRANCH: $_BRANCH" ``` If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). If `JUST_UPGRADED `: tell user "Running gstack v{to} (just updated!)" and continue. @@ -33,12 +35,12 @@ If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/g ## AskUserQuestion Format **ALWAYS follow this structure for every AskUserQuestion call:** -1. Context: project name, current branch, what we're working on (1-2 sentences) -2. The specific question or decision point -3. `RECOMMENDATION: Choose [X] because [one-line reason]` -4. Lettered options: `A) ... B) ... C) ...` +1. **Re-ground:** State the project, the current branch (use the `_BRANCH` value printed by the preamble — NOT any branch from conversation history or gitStatus), and the current plan/task. (1-2 sentences) +2. **Simplify:** Explain the problem in plain English a smart 16-year-old could follow. No raw function names, no internal jargon, no implementation details. Use concrete examples and analogies. Say what it DOES, not what it's called. +3. **Recommend:** `RECOMMENDATION: Choose [X] because [one-line reason]` +4. **Options:** Lettered options: `A) ... B) ... C) ...` -If `_SESSIONS` is 3 or more: the user is juggling multiple gstack sessions and context-switching heavily. **ELI16 mode** — they may not remember what this conversation is about. Every AskUserQuestion MUST re-ground them: state the project, the branch, the current plan/task, then the specific problem, THEN the recommendation and options. Be extra clear and self-contained — assume they haven't looked at this window in 20 minutes. +Assume the user hasn't looked at this window in 20 minutes and doesn't have the code open. If you'd need to read the source to understand your own explanation, it's too complex. Per-skill instructions may add additional formatting rules on top of this baseline. diff --git a/scripts/gen-skill-docs.ts b/scripts/gen-skill-docs.ts index f3d93dbb..9d79189a 100644 --- a/scripts/gen-skill-docs.ts +++ b/scripts/gen-skill-docs.ts @@ -105,6 +105,8 @@ touch ~/.gstack/sessions/"$PPID" _SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ') find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true _CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true) +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +echo "BRANCH: $_BRANCH" \`\`\` If output shows \`UPGRADE_AVAILABLE \`: read \`~/.claude/skills/gstack/gstack-upgrade/SKILL.md\` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). If \`JUST_UPGRADED \`: tell user "Running gstack v{to} (just updated!)" and continue. @@ -112,12 +114,12 @@ If output shows \`UPGRADE_AVAILABLE \`: read \`~/.claude/skills/gstac ## AskUserQuestion Format **ALWAYS follow this structure for every AskUserQuestion call:** -1. Context: project name, current branch, what we're working on (1-2 sentences) -2. The specific question or decision point -3. \`RECOMMENDATION: Choose [X] because [one-line reason]\` -4. Lettered options: \`A) ... B) ... C) ...\` +1. **Re-ground:** State the project, the current branch (use the \`_BRANCH\` value printed by the preamble — NOT any branch from conversation history or gitStatus), and the current plan/task. (1-2 sentences) +2. **Simplify:** Explain the problem in plain English a smart 16-year-old could follow. No raw function names, no internal jargon, no implementation details. Use concrete examples and analogies. Say what it DOES, not what it's called. +3. **Recommend:** \`RECOMMENDATION: Choose [X] because [one-line reason]\` +4. **Options:** Lettered options: \`A) ... B) ... C) ...\` -If \`_SESSIONS\` is 3 or more: the user is juggling multiple gstack sessions and context-switching heavily. **ELI16 mode** — they may not remember what this conversation is about. Every AskUserQuestion MUST re-ground them: state the project, the branch, the current plan/task, then the specific problem, THEN the recommendation and options. Be extra clear and self-contained — assume they haven't looked at this window in 20 minutes. +Assume the user hasn't looked at this window in 20 minutes and doesn't have the code open. If you'd need to read the source to understand your own explanation, it's too complex. Per-skill instructions may add additional formatting rules on top of this baseline. diff --git a/setup-browser-cookies/SKILL.md b/setup-browser-cookies/SKILL.md index d522b270..faebe520 100644 --- a/setup-browser-cookies/SKILL.md +++ b/setup-browser-cookies/SKILL.md @@ -23,6 +23,8 @@ touch ~/.gstack/sessions/"$PPID" _SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ') find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true _CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true) +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +echo "BRANCH: $_BRANCH" ``` If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). If `JUST_UPGRADED `: tell user "Running gstack v{to} (just updated!)" and continue. @@ -30,12 +32,12 @@ If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/g ## AskUserQuestion Format **ALWAYS follow this structure for every AskUserQuestion call:** -1. Context: project name, current branch, what we're working on (1-2 sentences) -2. The specific question or decision point -3. `RECOMMENDATION: Choose [X] because [one-line reason]` -4. Lettered options: `A) ... B) ... C) ...` +1. **Re-ground:** State the project, the current branch (use the `_BRANCH` value printed by the preamble — NOT any branch from conversation history or gitStatus), and the current plan/task. (1-2 sentences) +2. **Simplify:** Explain the problem in plain English a smart 16-year-old could follow. No raw function names, no internal jargon, no implementation details. Use concrete examples and analogies. Say what it DOES, not what it's called. +3. **Recommend:** `RECOMMENDATION: Choose [X] because [one-line reason]` +4. **Options:** Lettered options: `A) ... B) ... C) ...` -If `_SESSIONS` is 3 or more: the user is juggling multiple gstack sessions and context-switching heavily. **ELI16 mode** — they may not remember what this conversation is about. Every AskUserQuestion MUST re-ground them: state the project, the branch, the current plan/task, then the specific problem, THEN the recommendation and options. Be extra clear and self-contained — assume they haven't looked at this window in 20 minutes. +Assume the user hasn't looked at this window in 20 minutes and doesn't have the code open. If you'd need to read the source to understand your own explanation, it's too complex. Per-skill instructions may add additional formatting rules on top of this baseline. diff --git a/ship/SKILL.md b/ship/SKILL.md index 7791f4bc..ee98ecaf 100644 --- a/ship/SKILL.md +++ b/ship/SKILL.md @@ -25,6 +25,8 @@ touch ~/.gstack/sessions/"$PPID" _SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ') find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true _CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true) +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +echo "BRANCH: $_BRANCH" ``` If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). If `JUST_UPGRADED `: tell user "Running gstack v{to} (just updated!)" and continue. @@ -32,12 +34,12 @@ If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/g ## AskUserQuestion Format **ALWAYS follow this structure for every AskUserQuestion call:** -1. Context: project name, current branch, what we're working on (1-2 sentences) -2. The specific question or decision point -3. `RECOMMENDATION: Choose [X] because [one-line reason]` -4. Lettered options: `A) ... B) ... C) ...` +1. **Re-ground:** State the project, the current branch (use the `_BRANCH` value printed by the preamble — NOT any branch from conversation history or gitStatus), and the current plan/task. (1-2 sentences) +2. **Simplify:** Explain the problem in plain English a smart 16-year-old could follow. No raw function names, no internal jargon, no implementation details. Use concrete examples and analogies. Say what it DOES, not what it's called. +3. **Recommend:** `RECOMMENDATION: Choose [X] because [one-line reason]` +4. **Options:** Lettered options: `A) ... B) ... C) ...` -If `_SESSIONS` is 3 or more: the user is juggling multiple gstack sessions and context-switching heavily. **ELI16 mode** — they may not remember what this conversation is about. Every AskUserQuestion MUST re-ground them: state the project, the branch, the current plan/task, then the specific problem, THEN the recommendation and options. Be extra clear and self-contained — assume they haven't looked at this window in 20 minutes. +Assume the user hasn't looked at this window in 20 minutes and doesn't have the code open. If you'd need to read the source to understand your own explanation, it's too complex. Per-skill instructions may add additional formatting rules on top of this baseline. diff --git a/test/gen-skill-docs.test.ts b/test/gen-skill-docs.test.ts index e77989f0..7ec8b280 100644 --- a/test/gen-skill-docs.test.ts +++ b/test/gen-skill-docs.test.ts @@ -144,7 +144,18 @@ describe('gen-skill-docs', () => { const content = fs.readFileSync(path.join(ROOT, 'SKILL.md'), 'utf-8'); expect(content).toContain('_SESSIONS'); expect(content).toContain('RECOMMENDATION'); - expect(content).toContain('ELI16'); + }); + + test('generated SKILL.md contains branch detection', () => { + const content = fs.readFileSync(path.join(ROOT, 'SKILL.md'), 'utf-8'); + expect(content).toContain('_BRANCH'); + expect(content).toContain('git branch --show-current'); + }); + + test('generated SKILL.md contains ELI16 simplification rules', () => { + const content = fs.readFileSync(path.join(ROOT, 'SKILL.md'), 'utf-8'); + expect(content).toContain('No raw function names'); + expect(content).toContain('plain English'); }); test('qa and qa-only templates use QA_METHODOLOGY placeholder', () => { diff --git a/test/skill-validation.test.ts b/test/skill-validation.test.ts index 3ff5c356..77f09ee0 100644 --- a/test/skill-validation.test.ts +++ b/test/skill-validation.test.ts @@ -491,7 +491,7 @@ describe('v0.4.1 preamble features', () => { test(`${skill} contains session awareness`, () => { const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8'); expect(content).toContain('_SESSIONS'); - expect(content).toContain('ELI16'); + expect(content).toContain('RECOMMENDATION'); }); } }); From a68244ab57aafd43e83764f603d43291e53aca74 Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Mon, 16 Mar 2026 12:30:22 -0500 Subject: [PATCH 04/22] =?UTF-8?q?feat:=20/document-release=20skill=20?= =?UTF-8?q?=E2=80=94=20post-ship=20doc=20updates=20(v0.4.3)=20(#109)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * docs: update project documentation for v0.4.2 - README: skill count 9→10, added /document-release to skills table, install/uninstall sections, and dedicated section with example - CHANGELOG: added /document-release bullet to v0.4.2 entry Co-Authored-By: Claude Opus 4.6 (1M context) * feat: add /document-release skill with smart VERSION handling New skill runs after /ship but before PR merge. Reads every doc file, cross-references the diff, auto-updates factual changes, asks about risky edits. CHANGELOG clobber protection: never uses Write tool on CHANGELOG.md, only Edit with exact old_string matches. Smart VERSION logic: instead of silently skipping already-bumped versions, compares CHANGELOG entry scope against full diff and asks if significant uncovered changes exist. Also fixes gstack-upgrade/SKILL.md missing from skill-check.ts SKILL_FILES array (existing inconsistency with gen-skill-docs.ts). Co-Authored-By: Claude Opus 4.6 (1M context) * feat: /review Step 5.6 — documentation staleness check Review skill now cross-references code changes against doc files. If a doc describes a feature that changed but the doc wasn't updated, flags it as INFORMATIONAL with a pointer to /document-release. Co-Authored-By: Claude Opus 4.6 (1M context) * test: /document-release E2E with CHANGELOG clobber guard E2E test creates a repo with existing CHANGELOG entries, runs /document-release, and asserts original entries survive. Critical guardrail against the incident where an agent replaced CHANGELOG entries during conflict resolution. Co-Authored-By: Claude Opus 4.6 (1M context) * chore: bump to v0.4.3 — /document-release skill Co-Authored-By: Claude Opus 4.6 (1M context) * chore: regenerate SKILL.md files after merge * chore: regenerate SKILL.md files after merge --------- Co-authored-by: Claude Opus 4.6 (1M context) --- CHANGELOG.md | 1 + CLAUDE.md | 1 + README.md | 32 ++- TODOS.md | 26 ++ document-release/SKILL.md | 437 +++++++++++++++++++++++++++++++++ document-release/SKILL.md.tmpl | 355 ++++++++++++++++++++++++++ review/SKILL.md | 14 ++ review/SKILL.md.tmpl | 14 ++ scripts/gen-skill-docs.ts | 1 + scripts/skill-check.ts | 2 + test/skill-e2e.test.ts | 104 ++++++++ test/skill-validation.test.ts | 3 + 12 files changed, 986 insertions(+), 4 deletions(-) create mode 100644 document-release/SKILL.md create mode 100644 document-release/SKILL.md.tmpl diff --git a/CHANGELOG.md b/CHANGELOG.md index 99b7086c..3c3fcceb 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,7 @@ ## 0.4.3 — 2026-03-16 +- **New `/document-release` skill.** Run it after `/ship` but before merging — it reads every doc file in your project, cross-references the diff, and updates README, ARCHITECTURE, CONTRIBUTING, CHANGELOG, and TODOS to match what you actually shipped. Risky changes get surfaced as questions; everything else is automatic. - **Every question is now crystal clear, every time.** You used to need 3+ sessions running before gstack would give you full context and plain English explanations. Now every question — even in a single session — tells you the project, branch, and what's happening, explained simply enough to understand mid-context-switch. No more "sorry, explain it to me more simply." - **Branch name is always correct.** gstack now detects your current branch at runtime instead of relying on the snapshot from when the conversation started. Switch branches mid-session? gstack keeps up. diff --git a/CLAUDE.md b/CLAUDE.md index 6f12deae..85a7cda3 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -48,6 +48,7 @@ gstack/ ├── plan-ceo-review/ # /plan-ceo-review skill ├── plan-eng-review/ # /plan-eng-review skill ├── retro/ # Retrospective skill +├── document-release/ # /document-release skill (post-ship doc updates) ├── setup # One-time setup: build binary + symlink skills ├── SKILL.md # Generated from SKILL.md.tmpl (don't edit directly) ├── SKILL.md.tmpl # Template: edit this, run gen:skill-docs diff --git a/README.md b/README.md index ce994a45..2b87d178 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ **gstack turns Claude Code from one generic assistant into a team of specialists you can summon on demand.** -Nine opinionated workflow skills for [Claude Code](https://docs.anthropic.com/en/docs/claude-code). Plan review, code review, one-command shipping, browser automation, QA testing, and engineering retrospectives — all as slash commands. +Ten opinionated workflow skills for [Claude Code](https://docs.anthropic.com/en/docs/claude-code). Plan review, code review, one-command shipping, browser automation, QA testing, engineering retrospectives, and post-ship documentation — all as slash commands. ### Without gstack @@ -26,6 +26,7 @@ Nine opinionated workflow skills for [Claude Code](https://docs.anthropic.com/en | `/qa-only` | QA reporter | Report-only QA testing. Same methodology as /qa but never fixes anything. Use when you want a pure bug report without code changes. | | `/setup-browser-cookies` | Session manager | Import cookies from your real browser (Comet, Chrome, Arc, Brave, Edge) into the headless session. Test authenticated pages without logging in manually. | | `/retro` | Engineering manager | Team-aware retro: your deep-dive + per-person praise and growth opportunities for every contributor. | +| `/document-release` | Technical writer | Update README, ARCHITECTURE, CONTRIBUTING, and project docs to match what you just shipped. | ## Demo: one feature, five modes @@ -104,11 +105,11 @@ This is the setup I use. One person, ten parallel agents, each with the right co Open Claude Code and paste this. Claude will do the rest. -> Install gstack: run `git clone https://github.com/garrytan/gstack.git ~/.claude/skills/gstack && cd ~/.claude/skills/gstack && ./setup` then add a "gstack" section to CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, and lists the available skills: /plan-ceo-review, /plan-eng-review, /review, /ship, /browse, /qa, /qa-only, /setup-browser-cookies, /retro. Then ask the user if they also want to add gstack to the current project so teammates get it. +> Install gstack: run `git clone https://github.com/garrytan/gstack.git ~/.claude/skills/gstack && cd ~/.claude/skills/gstack && ./setup` then add a "gstack" section to CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, and lists the available skills: /plan-ceo-review, /plan-eng-review, /review, /ship, /browse, /qa, /qa-only, /setup-browser-cookies, /retro, /document-release. Then ask the user if they also want to add gstack to the current project so teammates get it. ### Step 2: Add to your repo so teammates get it (optional) -> Add gstack to this project: run `cp -Rf ~/.claude/skills/gstack .claude/skills/gstack && rm -rf .claude/skills/gstack/.git && cd .claude/skills/gstack && ./setup` then add a "gstack" section to this project's CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, lists the available skills: /plan-ceo-review, /plan-eng-review, /review, /ship, /browse, /qa, /setup-browser-cookies, /retro, and tells Claude that if gstack skills aren't working, run `cd .claude/skills/gstack && ./setup` to build the binary and register skills. +> Add gstack to this project: run `cp -Rf ~/.claude/skills/gstack .claude/skills/gstack && rm -rf .claude/skills/gstack/.git && cd .claude/skills/gstack && ./setup` then add a "gstack" section to this project's CLAUDE.md that says to use the /browse skill from gstack for all web browsing, never use mcp\_\_claude-in-chrome\_\_\* tools, lists the available skills: /plan-ceo-review, /plan-eng-review, /review, /ship, /browse, /qa, /setup-browser-cookies, /retro, /document-release, and tells Claude that if gstack skills aren't working, run `cd .claude/skills/gstack && ./setup` to build the binary and register skills. Real files get committed to your repo (not a submodule), so `git clone` just works. The binary and node\_modules are gitignored — teammates just need to run `cd .claude/skills/gstack && ./setup` once to build (or `/browse` handles it automatically on first use). @@ -590,6 +591,29 @@ It saves a JSON snapshot to `.context/retros/` so the next run can show trends. --- +## `/document-release` + +This is my **technical writer mode**. + +After `/ship` creates the PR but before it merges, `/document-release` reads every documentation file in the project and cross-references it against the diff. It updates file paths, command lists, project structure trees, and anything else that drifted. Risky or subjective changes get surfaced as questions — everything else is handled automatically. + +``` +You: /document-release + +Claude: Analyzing 21 files changed across 3 commits. Found 8 documentation files. + + README.md: updated skill count from 9 to 10, added new skill to table + CLAUDE.md: added new directory to project structure + CONTRIBUTING.md: current — no changes needed + TODOS.md: marked 2 items complete, added 1 new item + + All docs updated and committed. PR body updated with doc diff. +``` + +It also polishes CHANGELOG voice (without ever overwriting entries), cleans up completed TODOS, checks cross-doc consistency, and asks about VERSION bumps only when appropriate. + +--- + ## Troubleshooting **Skill not showing up in Claude Code?** @@ -614,7 +638,7 @@ Or set `auto_upgrade: true` in `~/.gstack/config.yaml` to upgrade automatically Paste this into Claude Code: -> Uninstall gstack: remove the skill symlinks by running `for s in browse plan-ceo-review plan-eng-review review ship retro qa qa-only setup-browser-cookies; do rm -f ~/.claude/skills/$s; done` then run `rm -rf ~/.claude/skills/gstack` and remove the gstack section from CLAUDE.md. If this project also has gstack at .claude/skills/gstack, remove it by running `for s in browse plan-ceo-review plan-eng-review review ship retro qa qa-only setup-browser-cookies; do rm -f .claude/skills/$s; done && rm -rf .claude/skills/gstack` and remove the gstack section from the project CLAUDE.md too. +> Uninstall gstack: remove the skill symlinks by running `for s in browse plan-ceo-review plan-eng-review review ship retro qa qa-only setup-browser-cookies document-release; do rm -f ~/.claude/skills/$s; done` then run `rm -rf ~/.claude/skills/gstack` and remove the gstack section from CLAUDE.md. If this project also has gstack at .claude/skills/gstack, remove it by running `for s in browse plan-ceo-review plan-eng-review review ship retro qa qa-only setup-browser-cookies document-release; do rm -f .claude/skills/$s; done && rm -rf .claude/skills/gstack` and remove the gstack section from the project CLAUDE.md too. ## Development diff --git a/TODOS.md b/TODOS.md index 7bd1176a..d2f9282e 100644 --- a/TODOS.md +++ b/TODOS.md @@ -374,6 +374,32 @@ **Priority:** P3 **Depends on:** Ref staleness Parts 1+2 (shipped) +## Document-Release + +### Auto-invoke /document-release from /ship + +**What:** Add Step 8.5 to /ship that reads document-release/SKILL.md and executes the doc update workflow after creating the PR. + +**Why:** Zero-friction doc updates — user runs /ship and docs are automatically current. No extra command to remember. + +**Context:** /ship currently ends at Step 8 (PR URL output). Step 8.5 would continue into the document-release workflow. Same pattern as /ship calling /review's checklist in Step 3.5. + +**Effort:** S +**Priority:** P1 +**Depends on:** /document-release shipped + +### `{{DOC_VOICE}}` shared resolver + +**What:** Create a placeholder resolver in gen-skill-docs.ts encoding the gstack voice guide (friendly, user-forward, lead with benefits). Inject into /ship Step 5, /document-release Step 5, and reference from CLAUDE.md. + +**Why:** DRY — voice rules currently live inline in 3 places (CLAUDE.md CHANGELOG style section, /ship Step 5, /document-release Step 5). When the voice evolves, all three drift. + +**Context:** Same pattern as `{{QA_METHODOLOGY}}` — shared block injected into multiple templates to prevent drift. ~20 lines in gen-skill-docs.ts. + +**Effort:** S +**Priority:** P2 +**Depends on:** None + ## Completed ### Phase 1: Foundations (v0.2.0) diff --git a/document-release/SKILL.md b/document-release/SKILL.md new file mode 100644 index 00000000..694d64da --- /dev/null +++ b/document-release/SKILL.md @@ -0,0 +1,437 @@ +--- +name: document-release +version: 1.0.0 +description: | + Post-ship documentation update. Reads all project docs, cross-references the + diff, updates README/ARCHITECTURE/CONTRIBUTING/CLAUDE.md to match what shipped, + polishes CHANGELOG voice, cleans up TODOS, and optionally bumps VERSION. +allowed-tools: + - Bash + - Read + - Write + - Edit + - Grep + - Glob + - AskUserQuestion +--- + + + +## Preamble (run first) + +```bash +_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true) +[ -n "$_UPD" ] && echo "$_UPD" || true +mkdir -p ~/.gstack/sessions +touch ~/.gstack/sessions/"$PPID" +_SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ') +find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true +_CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true) +_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown") +echo "BRANCH: $_BRANCH" +``` + +If output shows `UPGRADE_AVAILABLE `: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). If `JUST_UPGRADED `: tell user "Running gstack v{to} (just updated!)" and continue. + +## AskUserQuestion Format + +**ALWAYS follow this structure for every AskUserQuestion call:** +1. **Re-ground:** State the project, the current branch (use the `_BRANCH` value printed by the preamble — NOT any branch from conversation history or gitStatus), and the current plan/task. (1-2 sentences) +2. **Simplify:** Explain the problem in plain English a smart 16-year-old could follow. No raw function names, no internal jargon, no implementation details. Use concrete examples and analogies. Say what it DOES, not what it's called. +3. **Recommend:** `RECOMMENDATION: Choose [X] because [one-line reason]` +4. **Options:** Lettered options: `A) ... B) ... C) ...` + +Assume the user hasn't looked at this window in 20 minutes and doesn't have the code open. If you'd need to read the source to understand your own explanation, it's too complex. + +Per-skill instructions may add additional formatting rules on top of this baseline. + +## Contributor Mode + +If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better. + +**At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better! + +**Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore. + +**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs. + +**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer): + +``` +# {Title} + +Hey gstack team — ran into this while using /{skill-name}: + +**What I was trying to do:** {what the user/agent was attempting} +**What happened instead:** {what actually happened} +**My rating:** {0-10} — {one sentence on why it wasn't a 10} + +## Steps to reproduce +1. {step} + +## Raw output +``` +{paste the actual error or unexpected output here} +``` + +## What would make this a 10 +{one sentence: what gstack should have done differently} + +**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} +``` + +Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}" + +## Step 0: Detect base branch + +Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps. + +1. Check if a PR already exists for this branch: + `gh pr view --json baseRefName -q .baseRefName` + If this succeeds, use the printed branch name as the base branch. + +2. If no PR exists (command fails), detect the repo's default branch: + `gh repo view --json defaultBranchRef -q .defaultBranchRef.name` + +3. If both commands fail, fall back to `main`. + +Print the detected base branch name. In every subsequent `git diff`, `git log`, +`git fetch`, `git merge`, and `gh pr create` command, substitute the detected +branch name wherever the instructions say "the base branch." + +--- + +# Document Release: Post-Ship Documentation Update + +You are running the `/document-release` workflow. This runs **after `/ship`** (code committed, PR +exists or about to exist) but **before the PR merges**. Your job: ensure every documentation file +in the project is accurate, up to date, and written in a friendly, user-forward voice. + +You are mostly automated. Make obvious factual updates directly. Stop and ask only for risky or +subjective decisions. + +**Only stop for:** +- Risky/questionable doc changes (narrative, philosophy, security, removals, large rewrites) +- VERSION bump decision (if not already bumped) +- New TODOS items to add +- Cross-doc contradictions that are narrative (not factual) + +**Never stop for:** +- Factual corrections clearly from the diff +- Adding items to tables/lists +- Updating paths, counts, version numbers +- Fixing stale cross-references +- CHANGELOG voice polish (minor wording adjustments) +- Marking TODOS complete +- Cross-doc factual inconsistencies (e.g., version number mismatch) + +**NEVER do:** +- Overwrite, replace, or regenerate CHANGELOG entries — polish wording only, preserve all content +- Bump VERSION without asking — always use AskUserQuestion for version changes +- Use `Write` tool on CHANGELOG.md — always use `Edit` with exact `old_string` matches + +--- + +## Step 1: Pre-flight & Diff Analysis + +1. Check the current branch. If on the base branch, **abort**: "You're on the base branch. Run from a feature branch." + +2. Gather context about what changed: + +```bash +git diff ...HEAD --stat +``` + +```bash +git log ..HEAD --oneline +``` + +```bash +git diff ...HEAD --name-only +``` + +3. Discover all documentation files in the repo: + +```bash +find . -maxdepth 2 -name "*.md" -not -path "./.git/*" -not -path "./node_modules/*" -not -path "./.gstack/*" -not -path "./.context/*" | sort +``` + +4. Classify the changes into categories relevant to documentation: + - **New features** — new files, new commands, new skills, new capabilities + - **Changed behavior** — modified services, updated APIs, config changes + - **Removed functionality** — deleted files, removed commands + - **Infrastructure** — build system, test infrastructure, CI + +5. Output a brief summary: "Analyzing N files changed across M commits. Found K documentation files to review." + +--- + +## Step 2: Per-File Documentation Audit + +Read each documentation file and cross-reference it against the diff. Use these generic heuristics +(adapt to whatever project you're in — these are not gstack-specific): + +**README.md:** +- Does it describe all features and capabilities visible in the diff? +- Are install/setup instructions consistent with the changes? +- Are examples, demos, and usage descriptions still valid? +- Are troubleshooting steps still accurate? + +**ARCHITECTURE.md:** +- Do ASCII diagrams and component descriptions match the current code? +- Are design decisions and "why" explanations still accurate? +- Be conservative — only update things clearly contradicted by the diff. Architecture docs + describe things unlikely to change frequently. + +**CONTRIBUTING.md — New contributor smoke test:** +- Walk through the setup instructions as if you are a brand new contributor. +- Are the listed commands accurate? Would each step succeed? +- Do test tier descriptions match the current test infrastructure? +- Are workflow descriptions (dev setup, contributor mode, etc.) current? +- Flag anything that would fail or confuse a first-time contributor. + +**CLAUDE.md / project instructions:** +- Does the project structure section match the actual file tree? +- Are listed commands and scripts accurate? +- Do build/test instructions match what's in package.json (or equivalent)? + +**Any other .md files:** +- Read the file, determine its purpose and audience. +- Cross-reference against the diff to check if it contradicts anything the file says. + +For each file, classify needed updates as: + +- **Auto-update** — Factual corrections clearly warranted by the diff: adding an item to a + table, updating a file path, fixing a count, updating a project structure tree. +- **Ask user** — Narrative changes, section removal, security model changes, large rewrites + (more than ~10 lines in one section), ambiguous relevance, adding entirely new sections. + +--- + +## Step 3: Apply Auto-Updates + +Make all clear, factual updates directly using the Edit tool. + +For each file modified, output a one-line summary describing **what specifically changed** — not +just "Updated README.md" but "README.md: added /new-skill to skills table, updated skill count +from 9 to 10." + +**Never auto-update:** +- README introduction or project positioning +- ARCHITECTURE philosophy or design rationale +- Security model descriptions +- Do not remove entire sections from any document + +--- + +## Step 4: Ask About Risky/Questionable Changes + +For each risky or questionable update identified in Step 2, use AskUserQuestion with: +- Context: project name, branch, which doc file, what we're reviewing +- The specific documentation decision +- `RECOMMENDATION: Choose [X] because [one-line reason]` +- Options including C) Skip — leave as-is + +Apply approved changes immediately after each answer. + +--- + +## Step 5: CHANGELOG Voice Polish + +**CRITICAL — NEVER CLOBBER CHANGELOG ENTRIES.** + +This step polishes voice. It does NOT rewrite, replace, or regenerate CHANGELOG content. + +A real incident occurred where an agent replaced existing CHANGELOG entries when it should have +preserved them. This skill must NEVER do that. + +**Rules:** +1. Read the entire CHANGELOG.md first. Understand what is already there. +2. Only modify wording within existing entries. Never delete, reorder, or replace entries. +3. Never regenerate a CHANGELOG entry from scratch. The entry was written by `/ship` from the + actual diff and commit history. It is the source of truth. You are polishing prose, not + rewriting history. +4. If an entry looks wrong or incomplete, use AskUserQuestion — do NOT silently fix it. +5. Use Edit tool with exact `old_string` matches — never use Write to overwrite CHANGELOG.md. + +**If CHANGELOG was not modified in this branch:** skip this step. + +**If CHANGELOG was modified in this branch**, review the entry for voice: + +- **Sell test:** Would a user reading each bullet think "oh nice, I want to try that"? If not, + rewrite the wording (not the content). +- Lead with what the user can now **do** — not implementation details. +- "You can now..." not "Refactored the..." +- Flag and rewrite any entry that reads like a commit message. +- Internal/contributor changes belong in a separate "### For contributors" subsection. +- Auto-fix minor voice adjustments. Use AskUserQuestion if a rewrite would alter meaning. + +--- + +## Step 6: Cross-Doc Consistency & Discoverability Check + +After auditing each file individually, do a cross-doc consistency pass: + +1. Does the README's feature/capability list match what CLAUDE.md (or project instructions) describes? +2. Does ARCHITECTURE's component list match CONTRIBUTING's project structure description? +3. Does CHANGELOG's latest version match the VERSION file? +4. **Discoverability:** Is every documentation file reachable from README.md or CLAUDE.md? If + ARCHITECTURE.md exists but neither README nor CLAUDE.md links to it, flag it. Every doc + should be discoverable from one of the two entry-point files. +5. Flag any contradictions between documents. Auto-fix clear factual inconsistencies (e.g., a + version mismatch). Use AskUserQuestion for narrative contradictions. + +--- + +## Step 7: TODOS.md Cleanup + +This is a second pass that complements `/ship`'s Step 5.5. Read `review/TODOS-format.md` (if +available) for the canonical TODO item format. + +If TODOS.md does not exist, skip this step. + +1. **Completed items not yet marked:** Cross-reference the diff against open TODO items. If a + TODO is clearly completed by the changes in this branch, move it to the Completed section + with `**Completed:** vX.Y.Z.W (YYYY-MM-DD)`. Be conservative — only mark items with clear + evidence in the diff. + +2. **Items needing description updates:** If a TODO references files or components that were + significantly changed, its description may be stale. Use AskUserQuestion to confirm whether + the TODO should be updated, completed, or left as-is. + +3. **New deferred work:** Check the diff for `TODO`, `FIXME`, `HACK`, and `XXX` comments. For + each one that represents meaningful deferred work (not a trivial inline note), use + AskUserQuestion to ask whether it should be captured in TODOS.md. + +--- + +## Step 8: VERSION Bump Question + +**CRITICAL — NEVER BUMP VERSION WITHOUT ASKING.** + +1. **If VERSION does not exist:** Skip silently. + +2. Check if VERSION was already modified on this branch: + +```bash +git diff ...HEAD -- VERSION +``` + +3. **If VERSION was NOT bumped:** Use AskUserQuestion: + - RECOMMENDATION: Choose C (Skip) because docs-only changes rarely warrant a version bump + - A) Bump PATCH (X.Y.Z+1) — if doc changes ship alongside code changes + - B) Bump MINOR (X.Y+1.0) — if this is a significant standalone release + - C) Skip — no version bump needed + +4. **If VERSION was already bumped:** Do NOT skip silently. Instead, check whether the bump + still covers the full scope of changes on this branch: + + a. Read the CHANGELOG entry for the current VERSION. What features does it describe? + b. Read the full diff (`git diff ...HEAD --stat` and `git diff ...HEAD --name-only`). + Are there significant changes (new features, new skills, new commands, major refactors) + that are NOT mentioned in the CHANGELOG entry for the current version? + c. **If the CHANGELOG entry covers everything:** Skip — output "VERSION: Already bumped to + vX.Y.Z, covers all changes." + d. **If there are significant uncovered changes:** Use AskUserQuestion explaining what the + current version covers vs what's new, and ask: + - RECOMMENDATION: Choose A because the new changes warrant their own version + - A) Bump to next patch (X.Y.Z+1) — give the new changes their own version + - B) Keep current version — add new changes to the existing CHANGELOG entry + - C) Skip — leave version as-is, handle later + + The key insight: a VERSION bump set for "feature A" should not silently absorb "feature B" + if feature B is substantial enough to deserve its own version entry. + +--- + +## Step 9: Commit & Output + +**Empty check first:** Run `git status` (never use `-uall`). If no documentation files were +modified by any previous step, output "All documentation is up to date." and exit without +committing. + +**Commit:** + +1. Stage modified documentation files by name (never `git add -A` or `git add .`). +2. Create a single commit: + +```bash +git commit -m "$(cat <<'EOF' +docs: update project documentation for vX.Y.Z.W + +Co-Authored-By: Claude Opus 4.6 +EOF +)" +``` + +3. Push to the current branch: + +```bash +git push +``` + +**PR body update (idempotent, race-safe):** + +1. Read the existing PR body into a PID-unique tempfile: + +```bash +gh pr view --json body -q .body > /tmp/gstack-pr-body-$$.md +``` + +2. If the tempfile already contains a `## Documentation` section, replace that section with the + updated content. If it does not contain one, append a `## Documentation` section at the end. + +3. The Documentation section should include a **doc diff preview** — for each file modified, + describe what specifically changed (e.g., "README.md: added /document-release to skills + table, updated skill count from 9 to 10"). + +4. Write the updated body back: + +```bash +gh pr edit --body-file /tmp/gstack-pr-body-$$.md +``` + +5. Clean up the tempfile: + +```bash +rm -f /tmp/gstack-pr-body-$$.md +``` + +6. If `gh pr view` fails (no PR exists): skip with message "No PR found — skipping body update." +7. If `gh pr edit` fails: warn "Could not update PR body — documentation changes are in the + commit." and continue. + +**Structured doc health summary (final output):** + +Output a scannable summary showing every documentation file's status: + +``` +Documentation health: + README.md [status] ([details]) + ARCHITECTURE.md [status] ([details]) + CONTRIBUTING.md [status] ([details]) + CHANGELOG.md [status] ([details]) + TODOS.md [status] ([details]) + VERSION [status] ([details]) +``` + +Where status is one of: +- Updated — with description of what changed +- Current — no changes needed +- Voice polished — wording adjusted +- Not bumped — user chose to skip +- Already bumped — version was set by /ship +- Skipped — file does not exist + +--- + +## Important Rules + +- **Read before editing.** Always read the full content of a file before modifying it. +- **Never clobber CHANGELOG.** Polish wording only. Never delete, replace, or regenerate entries. +- **Never bump VERSION silently.** Always ask. Even if already bumped, check whether it covers the full scope of changes. +- **Be explicit about what changed.** Every edit gets a one-line summary. +- **Generic heuristics, not project-specific.** The audit checks work on any repo. +- **Discoverability matters.** Every doc file should be reachable from README or CLAUDE.md. +- **Voice: friendly, user-forward, not obscure.** Write like you're explaining to a smart person + who hasn't seen the code. diff --git a/document-release/SKILL.md.tmpl b/document-release/SKILL.md.tmpl new file mode 100644 index 00000000..e38c243c --- /dev/null +++ b/document-release/SKILL.md.tmpl @@ -0,0 +1,355 @@ +--- +name: document-release +version: 1.0.0 +description: | + Post-ship documentation update. Reads all project docs, cross-references the + diff, updates README/ARCHITECTURE/CONTRIBUTING/CLAUDE.md to match what shipped, + polishes CHANGELOG voice, cleans up TODOS, and optionally bumps VERSION. +allowed-tools: + - Bash + - Read + - Write + - Edit + - Grep + - Glob + - AskUserQuestion +--- + +{{PREAMBLE}} + +{{BASE_BRANCH_DETECT}} + +# Document Release: Post-Ship Documentation Update + +You are running the `/document-release` workflow. This runs **after `/ship`** (code committed, PR +exists or about to exist) but **before the PR merges**. Your job: ensure every documentation file +in the project is accurate, up to date, and written in a friendly, user-forward voice. + +You are mostly automated. Make obvious factual updates directly. Stop and ask only for risky or +subjective decisions. + +**Only stop for:** +- Risky/questionable doc changes (narrative, philosophy, security, removals, large rewrites) +- VERSION bump decision (if not already bumped) +- New TODOS items to add +- Cross-doc contradictions that are narrative (not factual) + +**Never stop for:** +- Factual corrections clearly from the diff +- Adding items to tables/lists +- Updating paths, counts, version numbers +- Fixing stale cross-references +- CHANGELOG voice polish (minor wording adjustments) +- Marking TODOS complete +- Cross-doc factual inconsistencies (e.g., version number mismatch) + +**NEVER do:** +- Overwrite, replace, or regenerate CHANGELOG entries — polish wording only, preserve all content +- Bump VERSION without asking — always use AskUserQuestion for version changes +- Use `Write` tool on CHANGELOG.md — always use `Edit` with exact `old_string` matches + +--- + +## Step 1: Pre-flight & Diff Analysis + +1. Check the current branch. If on the base branch, **abort**: "You're on the base branch. Run from a feature branch." + +2. Gather context about what changed: + +```bash +git diff ...HEAD --stat +``` + +```bash +git log ..HEAD --oneline +``` + +```bash +git diff ...HEAD --name-only +``` + +3. Discover all documentation files in the repo: + +```bash +find . -maxdepth 2 -name "*.md" -not -path "./.git/*" -not -path "./node_modules/*" -not -path "./.gstack/*" -not -path "./.context/*" | sort +``` + +4. Classify the changes into categories relevant to documentation: + - **New features** — new files, new commands, new skills, new capabilities + - **Changed behavior** — modified services, updated APIs, config changes + - **Removed functionality** — deleted files, removed commands + - **Infrastructure** — build system, test infrastructure, CI + +5. Output a brief summary: "Analyzing N files changed across M commits. Found K documentation files to review." + +--- + +## Step 2: Per-File Documentation Audit + +Read each documentation file and cross-reference it against the diff. Use these generic heuristics +(adapt to whatever project you're in — these are not gstack-specific): + +**README.md:** +- Does it describe all features and capabilities visible in the diff? +- Are install/setup instructions consistent with the changes? +- Are examples, demos, and usage descriptions still valid? +- Are troubleshooting steps still accurate? + +**ARCHITECTURE.md:** +- Do ASCII diagrams and component descriptions match the current code? +- Are design decisions and "why" explanations still accurate? +- Be conservative — only update things clearly contradicted by the diff. Architecture docs + describe things unlikely to change frequently. + +**CONTRIBUTING.md — New contributor smoke test:** +- Walk through the setup instructions as if you are a brand new contributor. +- Are the listed commands accurate? Would each step succeed? +- Do test tier descriptions match the current test infrastructure? +- Are workflow descriptions (dev setup, contributor mode, etc.) current? +- Flag anything that would fail or confuse a first-time contributor. + +**CLAUDE.md / project instructions:** +- Does the project structure section match the actual file tree? +- Are listed commands and scripts accurate? +- Do build/test instructions match what's in package.json (or equivalent)? + +**Any other .md files:** +- Read the file, determine its purpose and audience. +- Cross-reference against the diff to check if it contradicts anything the file says. + +For each file, classify needed updates as: + +- **Auto-update** — Factual corrections clearly warranted by the diff: adding an item to a + table, updating a file path, fixing a count, updating a project structure tree. +- **Ask user** — Narrative changes, section removal, security model changes, large rewrites + (more than ~10 lines in one section), ambiguous relevance, adding entirely new sections. + +--- + +## Step 3: Apply Auto-Updates + +Make all clear, factual updates directly using the Edit tool. + +For each file modified, output a one-line summary describing **what specifically changed** — not +just "Updated README.md" but "README.md: added /new-skill to skills table, updated skill count +from 9 to 10." + +**Never auto-update:** +- README introduction or project positioning +- ARCHITECTURE philosophy or design rationale +- Security model descriptions +- Do not remove entire sections from any document + +--- + +## Step 4: Ask About Risky/Questionable Changes + +For each risky or questionable update identified in Step 2, use AskUserQuestion with: +- Context: project name, branch, which doc file, what we're reviewing +- The specific documentation decision +- `RECOMMENDATION: Choose [X] because [one-line reason]` +- Options including C) Skip — leave as-is + +Apply approved changes immediately after each answer. + +--- + +## Step 5: CHANGELOG Voice Polish + +**CRITICAL — NEVER CLOBBER CHANGELOG ENTRIES.** + +This step polishes voice. It does NOT rewrite, replace, or regenerate CHANGELOG content. + +A real incident occurred where an agent replaced existing CHANGELOG entries when it should have +preserved them. This skill must NEVER do that. + +**Rules:** +1. Read the entire CHANGELOG.md first. Understand what is already there. +2. Only modify wording within existing entries. Never delete, reorder, or replace entries. +3. Never regenerate a CHANGELOG entry from scratch. The entry was written by `/ship` from the + actual diff and commit history. It is the source of truth. You are polishing prose, not + rewriting history. +4. If an entry looks wrong or incomplete, use AskUserQuestion — do NOT silently fix it. +5. Use Edit tool with exact `old_string` matches — never use Write to overwrite CHANGELOG.md. + +**If CHANGELOG was not modified in this branch:** skip this step. + +**If CHANGELOG was modified in this branch**, review the entry for voice: + +- **Sell test:** Would a user reading each bullet think "oh nice, I want to try that"? If not, + rewrite the wording (not the content). +- Lead with what the user can now **do** — not implementation details. +- "You can now..." not "Refactored the..." +- Flag and rewrite any entry that reads like a commit message. +- Internal/contributor changes belong in a separate "### For contributors" subsection. +- Auto-fix minor voice adjustments. Use AskUserQuestion if a rewrite would alter meaning. + +--- + +## Step 6: Cross-Doc Consistency & Discoverability Check + +After auditing each file individually, do a cross-doc consistency pass: + +1. Does the README's feature/capability list match what CLAUDE.md (or project instructions) describes? +2. Does ARCHITECTURE's component list match CONTRIBUTING's project structure description? +3. Does CHANGELOG's latest version match the VERSION file? +4. **Discoverability:** Is every documentation file reachable from README.md or CLAUDE.md? If + ARCHITECTURE.md exists but neither README nor CLAUDE.md links to it, flag it. Every doc + should be discoverable from one of the two entry-point files. +5. Flag any contradictions between documents. Auto-fix clear factual inconsistencies (e.g., a + version mismatch). Use AskUserQuestion for narrative contradictions. + +--- + +## Step 7: TODOS.md Cleanup + +This is a second pass that complements `/ship`'s Step 5.5. Read `review/TODOS-format.md` (if +available) for the canonical TODO item format. + +If TODOS.md does not exist, skip this step. + +1. **Completed items not yet marked:** Cross-reference the diff against open TODO items. If a + TODO is clearly completed by the changes in this branch, move it to the Completed section + with `**Completed:** vX.Y.Z.W (YYYY-MM-DD)`. Be conservative — only mark items with clear + evidence in the diff. + +2. **Items needing description updates:** If a TODO references files or components that were + significantly changed, its description may be stale. Use AskUserQuestion to confirm whether + the TODO should be updated, completed, or left as-is. + +3. **New deferred work:** Check the diff for `TODO`, `FIXME`, `HACK`, and `XXX` comments. For + each one that represents meaningful deferred work (not a trivial inline note), use + AskUserQuestion to ask whether it should be captured in TODOS.md. + +--- + +## Step 8: VERSION Bump Question + +**CRITICAL — NEVER BUMP VERSION WITHOUT ASKING.** + +1. **If VERSION does not exist:** Skip silently. + +2. Check if VERSION was already modified on this branch: + +```bash +git diff ...HEAD -- VERSION +``` + +3. **If VERSION was NOT bumped:** Use AskUserQuestion: + - RECOMMENDATION: Choose C (Skip) because docs-only changes rarely warrant a version bump + - A) Bump PATCH (X.Y.Z+1) — if doc changes ship alongside code changes + - B) Bump MINOR (X.Y+1.0) — if this is a significant standalone release + - C) Skip — no version bump needed + +4. **If VERSION was already bumped:** Do NOT skip silently. Instead, check whether the bump + still covers the full scope of changes on this branch: + + a. Read the CHANGELOG entry for the current VERSION. What features does it describe? + b. Read the full diff (`git diff ...HEAD --stat` and `git diff ...HEAD --name-only`). + Are there significant changes (new features, new skills, new commands, major refactors) + that are NOT mentioned in the CHANGELOG entry for the current version? + c. **If the CHANGELOG entry covers everything:** Skip — output "VERSION: Already bumped to + vX.Y.Z, covers all changes." + d. **If there are significant uncovered changes:** Use AskUserQuestion explaining what the + current version covers vs what's new, and ask: + - RECOMMENDATION: Choose A because the new changes warrant their own version + - A) Bump to next patch (X.Y.Z+1) — give the new changes their own version + - B) Keep current version — add new changes to the existing CHANGELOG entry + - C) Skip — leave version as-is, handle later + + The key insight: a VERSION bump set for "feature A" should not silently absorb "feature B" + if feature B is substantial enough to deserve its own version entry. + +--- + +## Step 9: Commit & Output + +**Empty check first:** Run `git status` (never use `-uall`). If no documentation files were +modified by any previous step, output "All documentation is up to date." and exit without +committing. + +**Commit:** + +1. Stage modified documentation files by name (never `git add -A` or `git add .`). +2. Create a single commit: + +```bash +git commit -m "$(cat <<'EOF' +docs: update project documentation for vX.Y.Z.W + +Co-Authored-By: Claude Opus 4.6 +EOF +)" +``` + +3. Push to the current branch: + +```bash +git push +``` + +**PR body update (idempotent, race-safe):** + +1. Read the existing PR body into a PID-unique tempfile: + +```bash +gh pr view --json body -q .body > /tmp/gstack-pr-body-$$.md +``` + +2. If the tempfile already contains a `## Documentation` section, replace that section with the + updated content. If it does not contain one, append a `## Documentation` section at the end. + +3. The Documentation section should include a **doc diff preview** — for each file modified, + describe what specifically changed (e.g., "README.md: added /document-release to skills + table, updated skill count from 9 to 10"). + +4. Write the updated body back: + +```bash +gh pr edit --body-file /tmp/gstack-pr-body-$$.md +``` + +5. Clean up the tempfile: + +```bash +rm -f /tmp/gstack-pr-body-$$.md +``` + +6. If `gh pr view` fails (no PR exists): skip with message "No PR found — skipping body update." +7. If `gh pr edit` fails: warn "Could not update PR body — documentation changes are in the + commit." and continue. + +**Structured doc health summary (final output):** + +Output a scannable summary showing every documentation file's status: + +``` +Documentation health: + README.md [status] ([details]) + ARCHITECTURE.md [status] ([details]) + CONTRIBUTING.md [status] ([details]) + CHANGELOG.md [status] ([details]) + TODOS.md [status] ([details]) + VERSION [status] ([details]) +``` + +Where status is one of: +- Updated — with description of what changed +- Current — no changes needed +- Voice polished — wording adjusted +- Not bumped — user chose to skip +- Already bumped — version was set by /ship +- Skipped — file does not exist + +--- + +## Important Rules + +- **Read before editing.** Always read the full content of a file before modifying it. +- **Never clobber CHANGELOG.** Polish wording only. Never delete, replace, or regenerate entries. +- **Never bump VERSION silently.** Always ask. Even if already bumped, check whether it covers the full scope of changes. +- **Be explicit about what changed.** Every edit gets a one-line summary. +- **Generic heuristics, not project-specific.** The audit checks work on any repo. +- **Discoverability matters.** Every doc file should be reachable from README or CLAUDE.md. +- **Voice: friendly, user-forward, not obscure.** Write like you're explaining to a smart person + who hasn't seen the code. diff --git a/review/SKILL.md b/review/SKILL.md index 285a4922..c111e57a 100644 --- a/review/SKILL.md +++ b/review/SKILL.md @@ -206,6 +206,20 @@ If TODOS.md doesn't exist, skip this step silently. --- +## Step 5.6: Documentation staleness check + +Cross-reference the diff against documentation files. For each `.md` file in the repo root (README.md, ARCHITECTURE.md, CONTRIBUTING.md, CLAUDE.md, etc.): + +1. Check if code changes in the diff affect features, components, or workflows described in that doc file. +2. If the doc file was NOT updated in this branch but the code it describes WAS changed, flag it as an INFORMATIONAL finding: + "Documentation may be stale: [file] describes [feature/component] but code changed in this branch. Consider running `/document-release`." + +This is informational only — never critical. The fix action is `/document-release`. + +If no documentation files exist, skip this step silently. + +--- + ## Important Rules - **Read the FULL diff before commenting.** Do not flag issues already addressed in the diff. diff --git a/review/SKILL.md.tmpl b/review/SKILL.md.tmpl index dadd211a..f13dae4f 100644 --- a/review/SKILL.md.tmpl +++ b/review/SKILL.md.tmpl @@ -124,6 +124,20 @@ If TODOS.md doesn't exist, skip this step silently. --- +## Step 5.6: Documentation staleness check + +Cross-reference the diff against documentation files. For each `.md` file in the repo root (README.md, ARCHITECTURE.md, CONTRIBUTING.md, CLAUDE.md, etc.): + +1. Check if code changes in the diff affect features, components, or workflows described in that doc file. +2. If the doc file was NOT updated in this branch but the code it describes WAS changed, flag it as an INFORMATIONAL finding: + "Documentation may be stale: [file] describes [feature/component] but code changed in this branch. Consider running `/document-release`." + +This is informational only — never critical. The fix action is `/document-release`. + +If no documentation files exist, skip this step silently. + +--- + ## Important Rules - **Read the FULL diff before commenting.** Do not flag issues already addressed in the diff. diff --git a/scripts/gen-skill-docs.ts b/scripts/gen-skill-docs.ts index 9d79189a..ebe4f54f 100644 --- a/scripts/gen-skill-docs.ts +++ b/scripts/gen-skill-docs.ts @@ -539,6 +539,7 @@ function findTemplates(): string[] { path.join(ROOT, 'plan-eng-review', 'SKILL.md.tmpl'), path.join(ROOT, 'retro', 'SKILL.md.tmpl'), path.join(ROOT, 'gstack-upgrade', 'SKILL.md.tmpl'), + path.join(ROOT, 'document-release', 'SKILL.md.tmpl'), ]; for (const p of candidates) { if (fs.existsSync(p)) templates.push(p); diff --git a/scripts/skill-check.ts b/scripts/skill-check.ts index 591a0c81..51a583a2 100644 --- a/scripts/skill-check.ts +++ b/scripts/skill-check.ts @@ -27,6 +27,8 @@ const SKILL_FILES = [ 'plan-ceo-review/SKILL.md', 'plan-eng-review/SKILL.md', 'setup-browser-cookies/SKILL.md', + 'gstack-upgrade/SKILL.md', + 'document-release/SKILL.md', ].filter(f => fs.existsSync(path.join(ROOT, f))); let hasErrors = false; diff --git a/test/skill-e2e.test.ts b/test/skill-e2e.test.ts index aa50a976..37b42de8 100644 --- a/test/skill-e2e.test.ts +++ b/test/skill-e2e.test.ts @@ -1546,6 +1546,110 @@ Write your retrospective to ${dir}/retro-output.md`, }, 300_000); }); +// --- Document-Release skill E2E --- + +describeE2E('Document-Release skill E2E', () => { + let docReleaseDir: string; + + beforeAll(() => { + docReleaseDir = fs.mkdtempSync(path.join(os.tmpdir(), 'skill-e2e-doc-release-')); + + // Copy document-release skill files + copyDirSync(path.join(ROOT, 'document-release'), path.join(docReleaseDir, 'document-release')); + + // Init git repo with initial docs + const run = (cmd: string, args: string[]) => + spawnSync(cmd, args, { cwd: docReleaseDir, stdio: 'pipe', timeout: 5000 }); + + run('git', ['init']); + run('git', ['config', 'user.email', 'test@test.com']); + run('git', ['config', 'user.name', 'Test']); + + // Create initial README with a features list + fs.writeFileSync(path.join(docReleaseDir, 'README.md'), + '# Test Project\n\n## Features\n\n- Feature A\n- Feature B\n\n## Install\n\n```bash\nnpm install\n```\n'); + + // Create initial CHANGELOG that must NOT be clobbered + fs.writeFileSync(path.join(docReleaseDir, 'CHANGELOG.md'), + '# Changelog\n\n## 1.0.0 — 2026-03-01\n\n- Initial release with Feature A and Feature B\n- Setup CI pipeline\n'); + + // Create VERSION file (already bumped) + fs.writeFileSync(path.join(docReleaseDir, 'VERSION'), '1.1.0\n'); + + run('git', ['add', '.']); + run('git', ['commit', '-m', 'initial']); + + // Create feature branch with a code change + run('git', ['checkout', '-b', 'feat/add-feature-c']); + fs.writeFileSync(path.join(docReleaseDir, 'feature-c.ts'), 'export function featureC() { return "C"; }\n'); + fs.writeFileSync(path.join(docReleaseDir, 'VERSION'), '1.1.1\n'); + fs.writeFileSync(path.join(docReleaseDir, 'CHANGELOG.md'), + '# Changelog\n\n## 1.1.1 — 2026-03-16\n\n- Added Feature C\n\n## 1.0.0 — 2026-03-01\n\n- Initial release with Feature A and Feature B\n- Setup CI pipeline\n'); + run('git', ['add', '.']); + run('git', ['commit', '-m', 'feat: add feature C']); + }); + + afterAll(() => { + try { fs.rmSync(docReleaseDir, { recursive: true, force: true }); } catch {} + }); + + test('/document-release updates docs without clobbering CHANGELOG', async () => { + const result = await runSkillTest({ + prompt: `Read the file document-release/SKILL.md for the document-release workflow instructions. + +Run the /document-release workflow on this repo. The base branch is "main". + +IMPORTANT: +- Do NOT use AskUserQuestion — auto-approve everything or skip if unsure. +- Do NOT push or create PRs (there is no remote). +- Do NOT run gh commands (no remote). +- Focus on updating README.md to reflect the new Feature C. +- Do NOT overwrite or regenerate CHANGELOG entries. +- Skip VERSION bump (it's already bumped). +- After editing, just commit the changes locally.`, + workingDirectory: docReleaseDir, + maxTurns: 30, + allowedTools: ['Bash', 'Read', 'Write', 'Edit', 'Grep', 'Glob'], + timeout: 180_000, + testName: 'document-release', + runId, + }); + + logCost('/document-release', result); + + // Read CHANGELOG to verify it was NOT clobbered + const changelog = fs.readFileSync(path.join(docReleaseDir, 'CHANGELOG.md'), 'utf-8'); + const hasOriginalEntries = changelog.includes('Initial release with Feature A and Feature B') + && changelog.includes('Setup CI pipeline') + && changelog.includes('1.0.0'); + if (!hasOriginalEntries) { + console.warn('CHANGELOG CLOBBERED — original entries missing!'); + } + + // Check if README was updated + const readme = fs.readFileSync(path.join(docReleaseDir, 'README.md'), 'utf-8'); + const readmeUpdated = readme.includes('Feature C') || readme.includes('feature-c') || readme.includes('feature C'); + + const exitOk = ['success', 'error_max_turns'].includes(result.exitReason); + recordE2E('/document-release', 'Document-Release skill E2E', result, { + passed: exitOk && hasOriginalEntries, + }); + + // Critical guardrail: CHANGELOG must not be clobbered + expect(hasOriginalEntries).toBe(true); + + // Accept error_max_turns — thorough doc review is not a failure + expect(['success', 'error_max_turns']).toContain(result.exitReason); + + // Informational: did it update README? + if (readmeUpdated) { + console.log('README updated to include Feature C'); + } else { + console.warn('README was NOT updated — agent may not have found the feature'); + } + }, 240_000); +}); + // --- Deferred skill E2E tests (destructive or require interactive UI) --- describeE2E('Deferred skill E2E', () => { diff --git a/test/skill-validation.test.ts b/test/skill-validation.test.ts index 77f09ee0..a1817ede 100644 --- a/test/skill-validation.test.ts +++ b/test/skill-validation.test.ts @@ -176,6 +176,7 @@ describe('Update check preamble', () => { 'ship/SKILL.md', 'review/SKILL.md', 'plan-ceo-review/SKILL.md', 'plan-eng-review/SKILL.md', 'retro/SKILL.md', + 'document-release/SKILL.md', ]; for (const skill of skillsWithUpdateCheck) { @@ -397,6 +398,7 @@ describe('No hardcoded branch names in SKILL templates', () => { 'qa/SKILL.md.tmpl', 'plan-ceo-review/SKILL.md.tmpl', 'retro/SKILL.md.tmpl', + 'document-release/SKILL.md.tmpl', ]; // Patterns that indicate hardcoded 'main' in git commands @@ -479,6 +481,7 @@ describe('v0.4.1 preamble features', () => { 'ship/SKILL.md', 'review/SKILL.md', 'plan-ceo-review/SKILL.md', 'plan-eng-review/SKILL.md', 'retro/SKILL.md', + 'document-release/SKILL.md', ]; for (const skill of skillsWithPreamble) { From c86faa7968688993406fd4c6cd9550d58306408f Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Mon, 16 Mar 2026 14:14:15 -0500 Subject: [PATCH 05/22] =?UTF-8?q?fix:=20update=20check=20cache=20=E2=80=94?= =?UTF-8?q?=2060min=20UP=5FTO=5FDATE=20TTL=20+=20--force=20flag=20(v0.4.4)?= =?UTF-8?q?=20(#110)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * fix: split update check cache TTL + add --force flag UP_TO_DATE cache now expires after 60 min (was 720 min / 12 hours). UPGRADE_AVAILABLE keeps 720 min TTL to keep nagging. --force flag deletes cache before checking, used by /gstack-upgrade standalone invocation to always get a fresh result from GitHub. Co-Authored-By: Claude Opus 4.6 (1M context) * feat: /gstack-upgrade standalone uses --force for fresh check Co-Authored-By: Claude Opus 4.6 (1M context) * chore: bump version and changelog (v0.4.4) Co-Authored-By: Claude Opus 4.6 (1M context) --------- Co-authored-by: Claude Opus 4.6 (1M context) --- CHANGELOG.md | 11 +++++ VERSION | 2 +- bin/gstack-update-check | 27 +++++++----- browse/test/gstack-update-check.test.ts | 58 +++++++++++++++++++++++-- gstack-upgrade/SKILL.md | 11 ++++- gstack-upgrade/SKILL.md.tmpl | 11 ++++- 6 files changed, 104 insertions(+), 16 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 3c3fcceb..d61ba4f2 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,16 @@ # Changelog +## 0.4.4 — 2026-03-16 + +- **New releases detected in under an hour, not half a day.** The update check cache was set to 12 hours, which meant you could be stuck on an old version all day while new releases dropped. Now "you're up to date" expires after 60 minutes, so you'll see upgrades within the hour. "Upgrade available" still nags for 12 hours (that's the point). +- **`/gstack-upgrade` always checks for real.** Running `/gstack-upgrade` directly now bypasses the cache and does a fresh check against GitHub. No more "you're already on the latest" when you're not. + +### For contributors + +- Split `last-update-check` cache TTL: 60 min for `UP_TO_DATE`, 720 min for `UPGRADE_AVAILABLE`. +- Added `--force` flag to `bin/gstack-update-check` (deletes cache file before checking). +- 3 new tests: `--force` busts UP_TO_DATE cache, `--force` busts UPGRADE_AVAILABLE cache, 60-min TTL boundary test with `utimesSync`. + ## 0.4.3 — 2026-03-16 - **New `/document-release` skill.** Run it after `/ship` but before merging — it reads every doc file in your project, cross-references the diff, and updates README, ARCHITECTURE, CONTRIBUTING, CHANGELOG, and TODOS to match what you actually shipped. Risky changes get surfaced as questions; everything else is automatic. diff --git a/VERSION b/VERSION index 17b2ccd9..6f2743d6 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -0.4.3 +0.4.4 diff --git a/bin/gstack-update-check b/bin/gstack-update-check index 7c5e5ca0..d44c7e0f 100755 --- a/bin/gstack-update-check +++ b/bin/gstack-update-check @@ -20,6 +20,11 @@ SNOOZE_FILE="$STATE_DIR/update-snoozed" VERSION_FILE="$GSTACK_DIR/VERSION" REMOTE_URL="${GSTACK_REMOTE_URL:-https://raw.githubusercontent.com/garrytan/gstack/main/VERSION}" +# ─── Force flag (busts cache for standalone /gstack-upgrade) ── +if [ "${1:-}" = "--force" ]; then + rm -f "$CACHE_FILE" +fi + # ─── Step 0: Check if updates are disabled ──────────────────── _UC=$("$GSTACK_DIR/bin/gstack-config" get update_check 2>/dev/null || true) if [ "$_UC" = "false" ]; then @@ -97,24 +102,27 @@ if [ -f "$MARKER_FILE" ]; then exit 0 fi -# ─── Step 3: Check cache freshness (12h = 720 min) ────────── +# ─── Step 3: Check cache freshness ────────────────────────── +# UP_TO_DATE: 60 min TTL (detect new releases quickly) +# UPGRADE_AVAILABLE: 720 min TTL (keep nagging) if [ -f "$CACHE_FILE" ]; then - # Cache is fresh if modified within 720 minutes - STALE=$(find "$CACHE_FILE" -mmin +720 2>/dev/null || true) - if [ -z "$STALE" ]; then - # Cache is fresh — read it - CACHED="$(cat "$CACHE_FILE" 2>/dev/null || true)" + CACHED="$(cat "$CACHE_FILE" 2>/dev/null || true)" + case "$CACHED" in + UP_TO_DATE*) CACHE_TTL=60 ;; + UPGRADE_AVAILABLE*) CACHE_TTL=720 ;; + *) CACHE_TTL=0 ;; # corrupt → force re-fetch + esac + + STALE=$(find "$CACHE_FILE" -mmin +$CACHE_TTL 2>/dev/null || true) + if [ -z "$STALE" ] && [ "$CACHE_TTL" -gt 0 ]; then case "$CACHED" in UP_TO_DATE*) - # Verify local version still matches cached version CACHED_VER="$(echo "$CACHED" | awk '{print $2}')" if [ "$CACHED_VER" = "$LOCAL" ]; then exit 0 fi - # Local version changed — fall through to re-check ;; UPGRADE_AVAILABLE*) - # Verify local version still matches cached old version CACHED_OLD="$(echo "$CACHED" | awk '{print $2}')" if [ "$CACHED_OLD" = "$LOCAL" ]; then CACHED_NEW="$(echo "$CACHED" | awk '{print $3}')" @@ -124,7 +132,6 @@ if [ -f "$CACHE_FILE" ]; then echo "$CACHED" exit 0 fi - # Local version changed (manual upgrade?) — fall through to re-check ;; esac fi diff --git a/browse/test/gstack-update-check.test.ts b/browse/test/gstack-update-check.test.ts index 2ec70e2d..66239931 100644 --- a/browse/test/gstack-update-check.test.ts +++ b/browse/test/gstack-update-check.test.ts @@ -7,7 +7,7 @@ */ import { describe, test, expect, beforeEach, afterEach } from 'bun:test'; -import { mkdtempSync, writeFileSync, rmSync, existsSync, readFileSync, mkdirSync, symlinkSync } from 'fs'; +import { mkdtempSync, writeFileSync, rmSync, existsSync, readFileSync, mkdirSync, symlinkSync, utimesSync } from 'fs'; import { join } from 'path'; import { tmpdir } from 'os'; @@ -16,8 +16,8 @@ const SCRIPT = join(import.meta.dir, '..', '..', 'bin', 'gstack-update-check'); let gstackDir: string; let stateDir: string; -function run(extraEnv: Record = {}) { - const result = Bun.spawnSync(['bash', SCRIPT], { +function run(extraEnv: Record = {}, args: string[] = []) { + const result = Bun.spawnSync(['bash', SCRIPT, ...args], { env: { ...process.env, GSTACK_DIR: gstackDir, @@ -412,4 +412,56 @@ describe('gstack-update-check', () => { expect(exitCode).toBe(0); expect(stdout).toBe('UPGRADE_AVAILABLE 0.3.3 0.4.0'); }); + + // ─── --force flag tests ────────────────────────────────────── + + test('--force busts fresh UP_TO_DATE cache', () => { + writeFileSync(join(gstackDir, 'VERSION'), '0.3.3\n'); + writeFileSync(join(gstackDir, 'REMOTE_VERSION'), '0.4.0\n'); + writeFileSync(join(stateDir, 'last-update-check'), 'UP_TO_DATE 0.3.3'); + + // Without --force: cache hit, silent + const cached = run(); + expect(cached.stdout).toBe(''); + + // With --force: cache busted, re-fetches, finds upgrade + const forced = run({}, ['--force']); + expect(forced.exitCode).toBe(0); + expect(forced.stdout).toBe('UPGRADE_AVAILABLE 0.3.3 0.4.0'); + }); + + test('--force busts fresh UPGRADE_AVAILABLE cache', () => { + writeFileSync(join(gstackDir, 'VERSION'), '0.3.3\n'); + writeFileSync(join(gstackDir, 'REMOTE_VERSION'), '0.3.3\n'); + writeFileSync(join(stateDir, 'last-update-check'), 'UPGRADE_AVAILABLE 0.3.3 0.4.0'); + + // Without --force: cache hit, outputs stale upgrade + const cached = run(); + expect(cached.stdout).toBe('UPGRADE_AVAILABLE 0.3.3 0.4.0'); + + // With --force: cache busted, re-fetches, now up to date + const forced = run({}, ['--force']); + expect(forced.exitCode).toBe(0); + expect(forced.stdout).toBe(''); + const cache = readFileSync(join(stateDir, 'last-update-check'), 'utf-8'); + expect(cache).toContain('UP_TO_DATE'); + }); + + // ─── Split TTL tests ───────────────────────────────────────── + + test('UP_TO_DATE cache expires after 60 min (not 720)', () => { + writeFileSync(join(gstackDir, 'VERSION'), '0.3.3\n'); + writeFileSync(join(gstackDir, 'REMOTE_VERSION'), '0.4.0\n'); + writeFileSync(join(stateDir, 'last-update-check'), 'UP_TO_DATE 0.3.3'); + + // Set cache mtime to 90 minutes ago (past 60-min TTL) + const ninetyMinAgo = new Date(Date.now() - 90 * 60 * 1000); + const cachePath = join(stateDir, 'last-update-check'); + utimesSync(cachePath, ninetyMinAgo, ninetyMinAgo); + + // Cache should be stale at 60-min TTL, re-fetches and finds upgrade + const { exitCode, stdout } = run(); + expect(exitCode).toBe(0); + expect(stdout).toBe('UPGRADE_AVAILABLE 0.3.3 0.4.0'); + }); }); diff --git a/gstack-upgrade/SKILL.md b/gstack-upgrade/SKILL.md index 42f13f69..9d635651 100644 --- a/gstack-upgrade/SKILL.md +++ b/gstack-upgrade/SKILL.md @@ -189,4 +189,13 @@ After showing What's New, continue with whatever skill the user originally invok ## Standalone usage -When invoked directly as `/gstack-upgrade` (not from a preamble), follow Steps 2-6 above. If already on the latest version, tell the user: "You're already on the latest version (v{version})." +When invoked directly as `/gstack-upgrade` (not from a preamble): + +1. Force a fresh update check (bypass cache): +```bash +~/.claude/skills/gstack/bin/gstack-update-check --force +``` +Use the output to determine if an upgrade is available. + +2. If `UPGRADE_AVAILABLE `: follow Steps 2-6 above. +3. If no output (up to date): tell the user "You're already on the latest version (v{version})." diff --git a/gstack-upgrade/SKILL.md.tmpl b/gstack-upgrade/SKILL.md.tmpl index a199db6c..a441b8d6 100644 --- a/gstack-upgrade/SKILL.md.tmpl +++ b/gstack-upgrade/SKILL.md.tmpl @@ -187,4 +187,13 @@ After showing What's New, continue with whatever skill the user originally invok ## Standalone usage -When invoked directly as `/gstack-upgrade` (not from a preamble), follow Steps 2-6 above. If already on the latest version, tell the user: "You're already on the latest version (v{version})." +When invoked directly as `/gstack-upgrade` (not from a preamble): + +1. Force a fresh update check (bypass cache): +```bash +~/.claude/skills/gstack/bin/gstack-update-check --force +``` +Use the output to determine if an upgrade is available. + +2. If `UPGRADE_AVAILABLE `: follow Steps 2-6 above. +3. If no output (up to date): tell the user "You're already on the latest version (v{version})." From 318ffdbdf01e9c6ad78ab41852803d955e9dc092 Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Mon, 16 Mar 2026 21:50:43 -0500 Subject: [PATCH 06/22] fix: js statement wrapping + click auto-routes option to selectOption (v0.4.5) (#117) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * fix: js statement wrapping + click auto-routes option to selectOption Bug 1: js command wrapped all code as expressions — const, semicolons, and multi-line code broke with SyntaxError. Added needsBlockWrapper() and wrapForEvaluate() helpers (shared with eval) to detect statements and use block wrapper {…} instead of expression wrapper (…). Bug 2: clicking