exiao · Apr 10, 2026 · chatgpt-codex-connector · Apr 10, 2026
diff --git a/README.md b/README.md
@@ -124,6 +124,7 @@ Install the skills from https://github.com/exiao/skills
 | Skill | Description |
 |-------|-------------|
 | [agent-browser](agent-browser/) | Automate browsers via agent-browser CLI |
+| [babysit-open-prs](babysit-open-prs/) | Scan open PRs across tracked repos, triage, and spawn babysit-pr sub-agents |
 | [babysit-pr](babysit-pr/) | Monitor a PR through CI, reviews, and fixes until it's ready to merge |
 | [bloom-cli](bloom-cli/) | Fetch stock data, fundamentals, earnings, SEC filings via Bloom CLI |
 | [claude-md-management](claude-md-management/) | Audit, improve, and maintain CLAUDE.md files across repos |

diff --git a/babysit-open-prs/SKILL.md b/babysit-open-prs/SKILL.md
@@ -0,0 +1,106 @@
+---
+name: babysit-open-prs
+description: "Scan all open PRs across tracked repos, triage them, check for scope drift, and spawn babysit-pr sub-agents for fixable ones. Use when: babysit all PRs, check all open PRs, nightly PR review."
+---
+
+# Babysit Open PRs
+
+Scan open PRs across tracked repos, triage each one (scope check + CI + reviews), and spawn `babysit-pr` sub-agents for PRs that need fixing. Report results.
+
+## Step 1: Preflight
+
+Run the preflight script to discover PRs that need attention:
+
+```bash
+bash ~/clawd/scripts/pr-preflight.sh
+```
+
+This scans bloom-invest/bloom, bloom-invest/investing-log, and exiao/skills, plus any other repos in the bloom-invest, prompt-pm, and exiao orgs that have open PRs authored by exiao. The script also handles skip state and deduplication.
+
+If the script produces no output, no PRs need attention. Reply NO_REPLY.
+
+## Step 2: Triage (do this yourself, do NOT spawn sub-agents yet)
+
+For each PR in the preflight output, gather context:
+
+```bash
+# CI and merge status
+gh pr view <number> --repo <repo> --json title,body,statusCheckRollup,reviewDecision,headRefName,mergeable,commits
+
+# Check results
+gh pr checks <number> --repo <repo>
+
+# Commit count (single page, so the result caps at 100)
+gh api "repos/<repo>/pulls/<number>/commits?per_page=100" --jq 'length'
+
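+# Changed files with per-file line counts (for scope check); gh pr diff has no --stat flag
+gh pr view <number> --repo <repo> --json files --jq '.files[] | "\(.path) +\(.additions) -\(.deletions)"'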
+```
+
+### Scope Check (per PR)
+
+Compare the changed files and commit messages against the PR title and description:
+
+1. Do the files relate to the PR's stated purpose?
+2. Are there commits that introduce unrelated work?
+3. Is there bulk formatting noise beyond the PR's actual changes?
+4. Are multiple distinct features bundled together?
+
+### Classify each PR:
+
+- **CLEAN**: CI green, no unaddressed comments, scope is tight. No sub-agent needed.
+- **FIXABLE**: CI failure with identifiable root cause, or unaddressed review comments pointing to real bugs, or merge conflicts with clear resolution. Scope is acceptable. Spawn a sub-agent.
+- **SCOPE_DRIFT**: PR includes changes that don't match its description. Commits touch unrelated files, bundle multiple features, or include unnecessary formatting noise. Do NOT spawn a sub-agent. Report what's wrong and recommend how to fix (split PRs, revert commits, etc.).
+- **SKIP**: Merge conflicts needing design decisions, architectural issues, or draft/WIP PRs. Note for the report but do NOT spawn a sub-agent.
+
+## Step 3: Spawn sub-agents for FIXABLE PRs only
+
+For each fixable PR (max 5), spawn a sub-agent using the babysit-pr skill:
+
+```
+sessions_spawn({
+  task: "Use the babysit-pr skill. PR #<number>, repo <repo>. Max cycles: 5. Reasons flagged: <reasons>. Running via nightly cron (use pr-mark-skip.sh if escalating). Read the repo's CLAUDE.md/AGENTS.md first.",
+  cwd: "<local-path>",
+  run_timeout_seconds: 1800
+})
+```
+
+Local paths:
+- bloom-invest/bloom → ~/bloom
+- bloom-invest/investing-log → ~/clawd/investing-log
+- exiao/skills → ~/clawd/skills
+- Fintary/ops-center → ~/fintary/ops-center
+- Other repos → check ~/clawd/<repo> or ~/<repo>, clone to /tmp/<repo> if not found
+
+Wait for all sub-agents to complete.
+
+## Step 4: Report
+
+Do NOT send via the message tool. Output the summary as your reply. Cron delivery handles routing.
+
+**Format:**
+
+```
+🔧 Nightly PR Babysit — [date]
+
+[For each PR, one of:]
+
+✅ PR #N (repo): title — clean
+🔧 PR #N (repo): title — fixed (what was done)
+⚠️ PR #N (repo): title — scope drift
+   Description says: X
+   Actually includes: Y, Z
+   Recommendation: split/revert/remove
+🚫 PR #N (repo): title — blocked (why)
+⏭️ PR #N (repo): title — skipped (why)
+```
+
+If all PRs were already clean, keep it brief.
+
+## Gotchas
+
+- **Scope check is mandatory.** Every PR gets checked for drift, even if CI is green. A green CI on a bloated PR is still a problem.
+- **Don't fix scope drift.** Splitting PRs, reverting commits, or removing files from a PR requires human judgment on what belongs where. Always escalate.
+- **Max 5 sub-agents.** If more than 5 PRs are fixable, prioritize by: CI failures first, then review comments, then oldest.
+- **Preflight script handles skip state.** If a PR was previously marked as skip (via pr-mark-skip.sh), it won't appear in the preflight output.
+- **Sub-agents run babysit-pr, which includes its own scope check.** The triage-level scope check here is a quick pass; babysit-pr does a deeper per-commit analysis.
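
The "Max 5 sub-agents" gotcha above gives an ordering rule (CI failures first, then review comments, then oldest) without showing how to apply it. A minimal sketch of that rule follows. The function name `prioritize_fixable` and the three-column input format are assumptions made for illustration; they are not part of the skill or of `pr-preflight.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: choose which FIXABLE PRs get a sub-agent when there are
# more candidates than the cap.
#
# Input (stdin): one PR per line as "<repo>#<number> <reason> <created-date>"
#   <reason>       ci | review | other   (assumed labels, set during triage)
#   <created-date> ISO 8601 date (YYYY-MM-DD), so a plain text sort is chronological
# Output: the selected "<repo>#<number>" identifiers, highest priority first.
prioritize_fixable() {
  local max="${1:-5}"   # cap from the skill: at most 5 sub-agents

  # Prefix each line with a numeric rank so sort can order by reason first.
  awk '{
    rank = 3
    if ($2 == "ci") rank = 1
    else if ($2 == "review") rank = 2
    print rank, $3, $1
  }' |
    LC_ALL=C sort -k1,1n -k2,2 |   # rank ascending, then oldest date first
    head -n "$max" |
    awk '{ print $3 }'
}
```

Piping the triage results through `prioritize_fixable` (or `prioritize_fixable 3` for a smaller cap) yields the PRs to hand to `sessions_spawn`; anything cut off by the cap would still be listed in the report.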
diff --git a/babysit-pr/SKILL.md b/babysit-pr/SKILL.md
@@ -1,17 +1,17 @@
 ---
 name: babysit-pr
-description: "Monitor a PR until it's ready to merge. Watches CI, reads reviews, fixes issues, and repeats. Use when: babysit this PR, watch this PR, monitor PR, fix and watch PR, keep this PR green."
+description: "Monitor a PR until it's ready to merge. Watches CI, reads reviews, checks scope, fixes issues, and repeats. Use when: babysit this PR, watch this PR, monitor PR, fix and watch PR, keep this PR green."
 ---
 
 # Babysit PR
 
-Monitor a single PR through its full lifecycle: wait for CI, read reviews, fix issues, push, repeat. Stop when the PR is clean (CI green, no unaddressed comments) or when you hit a wall that needs human input.
+Monitor a single PR through its full lifecycle: check scope, wait for CI, read reviews, fix issues, push, repeat. Stop when the PR is clean (CI green, no unaddressed comments, scope is tight) or when you hit a wall that needs human input.
 
 ## Inputs
 
 - **PR number** (required)
 - **Repo** (optional, defaults to current repo via `gh repo view --json nameWithOwner -q '.nameWithOwner'`)
-- **Parent session key** (optional, for sending progress updates to the parent agent via `sessions_send`)
+- **Parent session key** (optional, for sending progress updates to the parent agent via `send_to_task`)
 - **Max cycles** (optional, default 10. Each cycle = one CI wait + fix attempt)
 
 ## Spawning
@@ -44,7 +44,7 @@ if [ -z "$LOCAL_DIR" ]; then
   LOCAL_DIR="/tmp/$REPO_DIR"
 fi
 
-# Read project rules — these define lint commands, test runners, Python version, etc.
+# Read project rules
 for F in CLAUDE.md AGENTS.md; do
   [ -f "$LOCAL_DIR/$F" ] && cat "$LOCAL_DIR/$F"
 done
@@ -59,6 +59,40 @@ cd "$WORKTREE"
 git checkout -B "$BRANCH" "origin/$BRANCH" || true
 ```
 
+## Scope Check (runs once, before the loop)
+
+Before fixing anything, verify the PR's changes match its stated purpose. This catches accidental commits, formatting noise, and scope creep.
+
+```bash
+# Get PR metadata
+gh pr view "$PR" --repo "$REPO" --json title,body,commits --jq '{title: .title, body: .body, commits: [.commits[].messageHeadline]}'
+
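+# Get changed files with per-file line counts (gh pr diff has no --stat flag)
+gh pr view "$PR" --repo "$REPO" --json files --jq '.files[] | "\(.path) +\(.additions) -\(.deletions)"'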
+
+# Get each commit's message and file list (--paginate covers PRs with more than 30 commits)
+for SHA in $(gh api --paginate "repos/$REPO/pulls/$PR/commits" --jq '.[].sha'); do
+  echo "--- Commit ${SHA:0:8} ---"
+  gh api "repos/$REPO/commits/$SHA" --jq '.commit.message'
+  gh api "repos/$REPO/commits/$SHA" --jq '[.files[].filename] | join(", ")'
+done
+```
+
+**Evaluate:**
+
+1. **File relevance:** Do all changed files relate to the PR title/description? Flag files that seem unrelated (e.g., a "fix login" PR that also reformats unrelated templates).
+2. **Commit coherence:** Does each commit message align with the PR's purpose? Flag commits that introduce unrelated work.
+3. **Formatting noise:** Flag bulk formatting changes (ruff, prettier, eslint --fix) applied beyond the files the PR actually needs to touch.
+4. **Scope creep:** Flag multiple distinct features or fixes bundled into one PR. Each PR should do one thing.
+
+**If scope issues are found:**
+
+Report them with specifics (which files, which commits) and classify:
+- **MINOR**: A stray formatting commit or one unrelated file. Note it but continue babysitting.
+- **MAJOR**: The PR bundles multiple unrelated changes, contains bulk formatting noise, or includes commits that contradict the description. **ESCALATE.** Do not auto-fix. Report what should be split out or reverted.
+
+Include scope findings in every status report so the parent/user sees them.
+
 ## The Loop
 
 Repeat up to `max_cycles` times:
@@ -89,12 +123,11 @@ cd "$WORKTREE"
 gh api --paginate "repos/$REPO/pulls/$PR/comments" | \
   jq '.[] | {author: .user.login, path: .path, line: .line, body: .body, commit: .original_commit_id, created: .created_at}'
 
-# 2. Issue comments — the main comment thread. Automated reviewers (claude-review, Codex) post here.
-#    Read FULL body, not truncated. This is where most actionable feedback lives.
+# 2. Issue comments (automated reviewers post here)
 gh api --paginate "repos/$REPO/issues/$PR/comments" | \
   jq '.[] | {author: .user.login, body: .body, created: .created_at}'
 
-# 3. Review verdicts — formal approve/request-changes. claude-review puts analysis in the body.
+# 3. Review verdicts
 gh api --paginate "repos/$REPO/pulls/$PR/reviews" | \
   jq '.[] | {author: .user.login, state: .state, body: .body}'
 
@@ -151,23 +184,25 @@ After pushing (or deciding not to):
 - There are still issues you plan to address next cycle
 
 **Stop and report if:**
-- PR is clean (CI green, no unaddressed comments)
+- PR is clean (CI green, no unaddressed comments, scope is tight)
 - You hit max_cycles
 - All remaining issues need human input (escalate)
 - You pushed a fix for the same issue twice and it still fails (circuit breaker)
+- Scope check found MAJOR issues (escalate immediately, don't try to fix)
 
 ## Reporting
 
-Send progress updates to the parent agent via `sessions_send`. Do NOT try to send messages to Signal/Slack/etc directly (sub-agents don't have channel access). The parent agent handles delivery to the user.
+Send progress updates to the parent agent via `send_to_task`. Do NOT try to send messages to Signal/Slack/etc directly (sub-agents don't have channel access). The parent agent handles delivery to the user.
 
 If a parent session key was provided, use:
 ```
-sessions_send(sessionKey="<parent_session_key>", message="<status update>")
+send_to_task(sessionKey="<parent_session_key>", message="<status update>")
 ```
 
 If no parent session key was provided, include status in your final output text (the auto-announce will deliver it).
 
 **When to report:**
+- After scope check (always, even if clean)
 - After each fix push (brief: what was fixed)
 - When escalating (what needs human input and why)
 - When the PR is ready (final status)
@@ -177,21 +212,34 @@ If no parent session key was provided, include status in your final output text
 ```
 🔧 PR #{number} ({repo}) — Cycle {N}/{max}
 
+Scope: ✅ clean | ⚠️ minor (details) | 🚫 major (details)
 Fixed: <what you fixed>
 Waiting: <what CI is running>
 Needs attention: <what you can't fix and why>
-Status: <monitoring | ready | blocked>
+Status: <monitoring | ready | blocked | scope-drift>
 ```
 
 **When PR is ready:**
 ```
 ✅ PR #{number} ({repo}) — Ready to merge
 
+Scope: ✅ changes match description
 CI: all green
 Reviews: all addressed
 Commits: {count}
 ```
 
+**When scope drift is detected:**
+```
+⚠️ PR #{number} ({repo}) — Scope Drift
+
+Description says: <what PR claims to do>
+Actually includes:
+- <unrelated file/commit 1>
+- <unrelated file/commit 2>
+Recommendation: <split into N PRs | revert commits X,Y | remove files A,B>
+```
+
 ## Cleanup
 
 When done (success or giving up):
@@ -203,14 +251,15 @@ git -C "$LOCAL_DIR" worktree remove "$WORKTREE" --force 2>/dev/null
 
 ## Gotchas
 
-- **Read CLAUDE.md/AGENTS.md first.** Every repo has different lint, test, and build commands. Never assume. Read the project's config files during Setup and use those commands throughout.
-- **Stale review comments:** `original_commit_id` on inline comments refers to the commit when the comment was made. If HEAD has moved past it, the issue may already be fixed. Always check the current code before acting.
+- **Read CLAUDE.md/AGENTS.md first.** Every repo has different lint, test, and build commands. Never assume.
+- **Stale review comments:** `original_commit_id` on inline comments refers to the commit when the comment was made. If HEAD has moved past it, the issue may already be fixed.
 - **claude-review sticky comments:** These appear as issue comments from the `claude` user. They re-run on every push. Don't try to "fix" informational observations.
-- **GitHub Actions GITHUB_TOKEN suppression:** Pushes from inside a GitHub Actions job using the default `GITHUB_TOKEN` don't trigger other workflow runs. This does NOT apply to local `gh` CLI pushes (which use your OAuth token). If CI doesn't start after your push from inside Actions, close+reopen the PR to kick off checks.
-- **Worktree branch conflicts:** `git worktree add` fails if the branch is already checked out somewhere. The Setup uses `origin/$BRANCH` to avoid this, then creates a local tracking branch with `checkout -B`.
+- **GitHub Actions GITHUB_TOKEN suppression:** Pushes from inside a GitHub Actions job using the default `GITHUB_TOKEN` don't trigger other workflow runs. This does NOT apply to local `gh` CLI pushes.
+- **Worktree branch conflicts:** `git worktree add` fails if the branch is already checked out somewhere. The Setup uses `origin/$BRANCH` to avoid this.
 - **Sub-agents can introduce unintended refactors.** Always diff `$BRANCH` against `origin/$BRANCH` before pushing to confirm only the intended fix is included.
 - **Frontend lint may auto-fix unrelated files.** Run lint only on the files you changed, not the whole project.
-- **Check ALL three comment sources.** `gh pr view --json reviews` only shows formal review submissions. Automated reviewers like claude-review often post as issue comments (`/issues/$PR/comments`), not formal reviews. Always check all three: inline review comments, issue comments, and review verdicts.
+- **Check ALL three comment sources.** `gh pr view --json reviews` only shows formal review submissions. Automated reviewers often post as issue comments.
+- **Scope check is not optional.** Even if the caller says "just fix CI," run the scope check. Catching drift early prevents wasted cycles fixing code that shouldn't be in the PR.
 
 ## Do NOT
 
@@ -220,3 +269,4 @@ git -C "$LOCAL_DIR" worktree remove "$WORKTREE" --force 2>/dev/null
 - Make changes unrelated to the PR's purpose
 - Fix more than one issue per commit
 - Retry the same fix approach twice
+- Auto-fix scope drift (always escalate it)
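
The "Stop and report if" conditions in babysit-pr now combine scope, CI, review state, the circuit breaker, and the cycle limit. One way to read them as a single decision is sketched below. The function `should_stop`, its argument order, and the input labels are hypothetical. The sketch also assumes that MINOR scope findings do not block a "ready" verdict, since the skill says to note them and continue; the "all remaining issues need human input" condition is left out because it depends on judgment rather than counters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map the loop state to one of the report statuses
# (monitoring | ready | blocked | scope-drift).
#
# Usage: should_stop <cycle> <max_cycles> <scope> <ci> <unaddressed> <repeat_pushes>
#   <scope>          clean | minor | major      (result of the scope check)
#   <ci>             green | failing | pending  (assumed labels)
#   <unaddressed>    number of review comments still unaddressed
#   <repeat_pushes>  how many fixes were already pushed for the current issue
# Prints the status; returns 0 when the loop should stop, 1 to keep going.
should_stop() {
  local cycle="$1" max_cycles="$2" scope="$3" ci="$4" unaddressed="$5" repeat_pushes="$6"

  # MAJOR scope issues escalate immediately, before any fix is attempted.
  if [ "$scope" = "major" ]; then
    echo "scope-drift"; return 0
  fi

  # Clean PR: CI green and nothing left to address.
  # Assumption: a MINOR scope note is reported but does not block readiness.
  if [ "$ci" = "green" ] && [ "$unaddressed" -eq 0 ]; then
    echo "ready"; return 0
  fi

  # Circuit breaker: the same issue was fixed twice and CI still fails.
  if [ "$repeat_pushes" -ge 2 ] && [ "$ci" = "failing" ]; then
    echo "blocked"; return 0
  fi

  # Cycle budget exhausted.
  if [ "$cycle" -ge "$max_cycles" ]; then
    echo "blocked"; return 0
  fi

  echo "monitoring"; return 1
}
```

A loop could call it as `if status="$(should_stop "$CYCLE" "$MAX" "$SCOPE" "$CI" "$OPEN" "$REPEATS")"; then ...report and exit...; fi`, where every variable name is likewise illustrative.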