diff --git a/README.md b/README.md index 7ea0b90..9f9f1bf 100644 --- a/README.md +++ b/README.md @@ -124,6 +124,7 @@ Install the skills from https://github.com/exiao/skills | Skill | Description | |-------|-------------| | [agent-browser](agent-browser/) | Automate browsers via agent-browser CLI | +| [babysit-open-prs](babysit-open-prs/) | Scan open PRs across tracked repos, triage, and spawn babysit-pr sub-agents | | [babysit-pr](babysit-pr/) | Monitor a PR through CI, reviews, and fixes until it's ready to merge | | [bloom-cli](bloom-cli/) | Fetch stock data, fundamentals, earnings, SEC filings via Bloom CLI | | [claude-md-management](claude-md-management/) | Audit, improve, and maintain CLAUDE.md files across repos | diff --git a/babysit-open-prs/SKILL.md b/babysit-open-prs/SKILL.md new file mode 100644 index 0000000..afe99ff --- /dev/null +++ b/babysit-open-prs/SKILL.md @@ -0,0 +1,106 @@ +--- +name: babysit-open-prs +description: "Scan all open PRs across tracked repos, triage them, check for scope drift, and spawn babysit-pr sub-agents for fixable ones. Use when: babysit all PRs, check all open PRs, nightly PR review." +--- + +# Babysit Open PRs + +Scan open PRs across tracked repos, triage each one (scope check + CI + reviews), and spawn `babysit-pr` sub-agents for PRs that need fixing. Report results. + +## Step 1: Preflight + +Run the preflight script to discover PRs that need attention: + +```bash +bash ~/clawd/scripts/pr-preflight.sh +``` + +This scans bloom-invest/bloom, bloom-invest/investing-log, exiao/skills, plus other repos under bloom-invest, prompt-pm, and exiao orgs with open PRs by exiao. It handles skip state and deduplication. + +If no output: no PRs need attention. Reply NO_REPLY. + +## Step 2: Triage (do this yourself, do NOT spawn sub-agents yet) + +For each PR in the preflight output, gather context: + +```bash +# CI and merge status +gh pr view --repo --json title,body,statusCheckRollup,reviewDecision,headRefName,mergeable,commits + +# Check results +gh pr checks --repo + +# Commit count +gh api "repos//pulls//commits?per_page=100" --jq 'length' + +# Changed files (for scope check) +gh pr diff --repo --stat +``` + +### Scope Check (per PR) + +Compare the changed files and commit messages against the PR title and description: + +1. Do the files relate to the PR's stated purpose? +2. Are there commits that introduce unrelated work? +3. Is there bulk formatting noise beyond the PR's actual changes? +4. Are multiple distinct features bundled together? + +### Classify each PR: + +- **CLEAN**: CI green, no unaddressed comments, scope is tight. No sub-agent needed. +- **FIXABLE**: CI failure with identifiable root cause, or unaddressed review comments pointing to real bugs, or merge conflicts with clear resolution. Scope is acceptable. Spawn a sub-agent. +- **SCOPE_DRIFT**: PR includes changes that don't match its description. Commits touch unrelated files, bundle multiple features, or include unnecessary formatting noise. Do NOT spawn a sub-agent. Report what's wrong and recommend how to fix (split PRs, revert commits, etc.). +- **SKIP**: Merge conflicts needing design decisions, architectural issues, or draft/WIP PRs. Note for the report but do NOT spawn a sub-agent. + +## Step 3: Spawn sub-agents for FIXABLE PRs only + +For each fixable PR (max 5), spawn a sub-agent using the babysit-pr skill: + +``` +sessions_spawn({ + task: "Use the babysit-pr skill. PR #, repo . Max cycles: 5. Reasons flagged: . Running via nightly cron (use pr-mark-skip.sh if escalating). Read the repo's CLAUDE.md/AGENTS.md first.", + cwd: "", + run_timeout_seconds: 1800 +}) +``` + +Local paths: +- bloom-invest/bloom → ~/bloom +- bloom-invest/investing-log → ~/clawd/investing-log +- exiao/skills → ~/clawd/skills +- Fintary/ops-center → ~/fintary/ops-center +- Other repos → check ~/clawd/ or ~/, clone to /tmp/ if not found + +Wait for all sub-agents to complete. + +## Step 4: Report + +Do NOT send via the message tool. Output the summary as your reply. Cron delivery handles routing. + +**Format:** + +``` +🔧 Nightly PR Babysit — [date] + +[For each PR, one of:] + +✅ PR #N (repo): title — clean +🔧 PR #N (repo): title — fixed (what was done) +⚠️ PR #N (repo): title — scope drift + Description says: X + Actually includes: Y, Z + Recommendation: split/revert/remove +🚫 PR #N (repo): title — blocked (why) +⏭️ PR #N (repo): title — skipped (why) +``` + +If all PRs were already clean, keep it brief. + +## Gotchas + +- **Scope check is mandatory.** Every PR gets checked for drift, even if CI is green. A green CI on a bloated PR is still a problem. +- **Don't fix scope drift.** Splitting PRs, reverting commits, or removing files from a PR requires human judgment on what belongs where. Always escalate. +- **Max 5 sub-agents.** If more than 5 PRs are fixable, prioritize by: CI failures first, then review comments, then oldest. +- **Preflight script handles skip state.** If a PR was previously marked as skip (via pr-mark-skip.sh), it won't appear in the preflight output. +- **Sub-agents run babysit-pr which includes its own scope check.** The triage-level scope check here is a quick pass; babysit-pr does a deeper per-commit analysis. diff --git a/babysit-pr/SKILL.md b/babysit-pr/SKILL.md index 22f1754..360dbfc 100644 --- a/babysit-pr/SKILL.md +++ b/babysit-pr/SKILL.md @@ -1,17 +1,17 @@ --- name: babysit-pr -description: "Monitor a PR until it's ready to merge. Watches CI, reads reviews, fixes issues, and repeats. Use when: babysit this PR, watch this PR, monitor PR, fix and watch PR, keep this PR green." +description: "Monitor a PR until it's ready to merge. Watches CI, reads reviews, checks scope, fixes issues, and repeats. Use when: babysit this PR, watch this PR, monitor PR, fix and watch PR, keep this PR green." --- # Babysit PR -Monitor a single PR through its full lifecycle: wait for CI, read reviews, fix issues, push, repeat. Stop when the PR is clean (CI green, no unaddressed comments) or when you hit a wall that needs human input. +Monitor a single PR through its full lifecycle: check scope, wait for CI, read reviews, fix issues, push, repeat. Stop when the PR is clean (CI green, no unaddressed comments, scope is tight) or when you hit a wall that needs human input. ## Inputs - **PR number** (required) - **Repo** (optional, defaults to current repo via `gh repo view --json nameWithOwner -q '.nameWithOwner'`) -- **Parent session key** (optional, for sending progress updates to the parent agent via `sessions_send`) +- **Parent session key** (optional, for sending progress updates to the parent agent via `send_to_task`) - **Max cycles** (optional, default 10. Each cycle = one CI wait + fix attempt) ## Spawning @@ -44,7 +44,7 @@ if [ -z "$LOCAL_DIR" ]; then LOCAL_DIR="/tmp/$REPO_DIR" fi -# Read project rules — these define lint commands, test runners, Python version, etc. +# Read project rules for F in CLAUDE.md AGENTS.md; do [ -f "$LOCAL_DIR/$F" ] && cat "$LOCAL_DIR/$F" done @@ -59,6 +59,40 @@ cd "$WORKTREE" git checkout -B "$BRANCH" "origin/$BRANCH" || true ``` +## Scope Check (runs once, before the loop) + +Before fixing anything, verify the PR's changes match its stated purpose. This catches accidental commits, formatting noise, and scope creep. + +```bash +# Get PR metadata +gh pr view $PR --repo $REPO --json title,body,commits --jq '{title: .title, body: .body, commits: [.commits[].messageHeadline]}' + +# Get changed files with stats +gh pr diff $PR --repo $REPO --stat + +# Get individual commit messages and their file lists +for SHA in $(gh api "repos/$REPO/pulls/$PR/commits" --jq '.[].sha'); do + echo "--- Commit ${SHA:0:8} ---" + gh api "repos/$REPO/commits/$SHA" --jq '.commit.message' + gh api "repos/$REPO/commits/$SHA" --jq '[.files[].filename] | join(", ")' +done +``` + +**Evaluate:** + +1. **File relevance:** Do all changed files relate to the PR title/description? Flag files that seem unrelated (e.g., a "fix login" PR that also reformats unrelated templates). +2. **Commit coherence:** Does each commit message align with the PR's purpose? Flag commits that introduce unrelated work. +3. **Formatting noise:** Flag bulk formatting changes (ruff, prettier, eslint --fix) applied beyond the files the PR actually needs to touch. +4. **Scope creep:** Multiple distinct features or fixes bundled into one PR. Each PR should do one thing. + +**If scope issues found:** + +Report them with specifics (which files, which commits) and classify: +- **MINOR**: A stray formatting commit or one unrelated file. Note it but continue babysitting. +- **MAJOR**: The PR bundles multiple unrelated changes, has bulk formatting noise, or commits that contradict the description. **ESCALATE.** Do not auto-fix. Report what should be split out or reverted. + +Include scope findings in every status report so the parent/user sees them. + ## The Loop Repeat up to `max_cycles` times: @@ -89,12 +123,11 @@ cd "$WORKTREE" gh api --paginate "repos/$REPO/pulls/$PR/comments" | \ jq '.[] | {author: .user.login, path: .path, line: .line, body: .body, commit: .original_commit_id, created: .created_at}' -# 2. Issue comments — the main comment thread. Automated reviewers (claude-review, Codex) post here. -# Read FULL body, not truncated. This is where most actionable feedback lives. +# 2. Issue comments (automated reviewers post here) gh api --paginate "repos/$REPO/issues/$PR/comments" | \ jq '.[] | {author: .user.login, body: .body, created: .created_at}' -# 3. Review verdicts — formal approve/request-changes. claude-review puts analysis in the body. +# 3. Review verdicts gh api --paginate "repos/$REPO/pulls/$PR/reviews" | \ jq '.[] | {author: .user.login, state: .state, body: .body}' @@ -151,23 +184,25 @@ After pushing (or deciding not to): - There are still issues you plan to address next cycle **Stop and report if:** -- PR is clean (CI green, no unaddressed comments) +- PR is clean (CI green, no unaddressed comments, scope is tight) - You hit max_cycles - All remaining issues need human input (escalate) - You pushed a fix for the same issue twice and it still fails (circuit breaker) +- Scope check found MAJOR issues (escalate immediately, don't try to fix) ## Reporting -Send progress updates to the parent agent via `sessions_send`. Do NOT try to send messages to Signal/Slack/etc directly (sub-agents don't have channel access). The parent agent handles delivery to the user. +Send progress updates to the parent agent via `send_to_task`. Do NOT try to send messages to Signal/Slack/etc directly (sub-agents don't have channel access). The parent agent handles delivery to the user. If a parent session key was provided, use: ``` -sessions_send(sessionKey="", message="") +send_to_task(sessionKey="", message="") ``` If no parent session key was provided, include status in your final output text (the auto-announce will deliver it). **When to report:** +- After scope check (always, even if clean) - After each fix push (brief: what was fixed) - When escalating (what needs human input and why) - When the PR is ready (final status) @@ -177,21 +212,34 @@ If no parent session key was provided, include status in your final output text ``` 🔧 PR #{number} ({repo}) — Cycle {N}/{max} +Scope: ✅ clean | ⚠️ minor (details) | 🚫 major (details) Fixed: Waiting: Needs attention: -Status: +Status: ``` **When PR is ready:** ``` ✅ PR #{number} ({repo}) — Ready to merge +Scope: ✅ changes match description CI: all green Reviews: all addressed Commits: {count} ``` +**When scope drift detected:** +``` +⚠️ PR #{number} ({repo}) — Scope Drift + +Description says: +Actually includes: +- +- +Recommendation: +``` + ## Cleanup When done (success or giving up): @@ -203,14 +251,15 @@ git -C "$LOCAL_DIR" worktree remove "$WORKTREE" --force 2>/dev/null ## Gotchas -- **Read CLAUDE.md/AGENTS.md first.** Every repo has different lint, test, and build commands. Never assume. Read the project's config files during Setup and use those commands throughout. -- **Stale review comments:** `original_commit_id` on inline comments refers to the commit when the comment was made. If HEAD has moved past it, the issue may already be fixed. Always check the current code before acting. +- **Read CLAUDE.md/AGENTS.md first.** Every repo has different lint, test, and build commands. Never assume. +- **Stale review comments:** `original_commit_id` on inline comments refers to the commit when the comment was made. If HEAD has moved past it, the issue may already be fixed. - **claude-review sticky comments:** These appear as issue comments from the `claude` user. They re-run on every push. Don't try to "fix" informational observations. -- **GitHub Actions GITHUB_TOKEN suppression:** Pushes from inside a GitHub Actions job using the default `GITHUB_TOKEN` don't trigger other workflow runs. This does NOT apply to local `gh` CLI pushes (which use your OAuth token). If CI doesn't start after your push from inside Actions, close+reopen the PR to kick off checks. -- **Worktree branch conflicts:** `git worktree add` fails if the branch is already checked out somewhere. The Setup uses `origin/$BRANCH` to avoid this, then creates a local tracking branch with `checkout -B`. +- **GitHub Actions GITHUB_TOKEN suppression:** Pushes from inside a GitHub Actions job using the default `GITHUB_TOKEN` don't trigger other workflow runs. This does NOT apply to local `gh` CLI pushes. +- **Worktree branch conflicts:** `git worktree add` fails if the branch is already checked out somewhere. The Setup uses `origin/$BRANCH` to avoid this. - **Sub-agents can introduce unintended refactors.** Always diff `$BRANCH` against `origin/$BRANCH` before pushing to confirm only the intended fix is included. - **Frontend lint may auto-fix unrelated files.** Run lint only on the files you changed, not the whole project. -- **Check ALL three comment sources.** `gh pr view --json reviews` only shows formal review submissions. Automated reviewers like claude-review often post as issue comments (`/issues/$PR/comments`), not formal reviews. Always check all three: inline review comments, issue comments, and review verdicts. +- **Check ALL three comment sources.** `gh pr view --json reviews` only shows formal review submissions. Automated reviewers often post as issue comments. +- **Scope check is not optional.** Even if the caller says "just fix CI," run the scope check. Catching drift early prevents wasted cycles fixing code that shouldn't be in the PR. ## Do NOT @@ -220,3 +269,4 @@ git -C "$LOCAL_DIR" worktree remove "$WORKTREE" --force 2>/dev/null - Make changes unrelated to the PR's purpose - Fix more than one issue per commit - Retry the same fix approach twice +- Auto-fix scope drift (always escalate it)