1 change: 1 addition & 0 deletions README.md
@@ -124,6 +124,7 @@ Install the skills from https://github.com/exiao/skills
| Skill | Description |
|-------|-------------|
| [agent-browser](agent-browser/) | Automate browsers via agent-browser CLI |
| [babysit-open-prs](babysit-open-prs/) | Scan open PRs across tracked repos, triage, and spawn babysit-pr sub-agents |
| [babysit-pr](babysit-pr/) | Monitor a PR through CI, reviews, and fixes until it's ready to merge |
| [bloom-cli](bloom-cli/) | Fetch stock data, fundamentals, earnings, SEC filings via Bloom CLI |
| [claude-md-management](claude-md-management/) | Audit, improve, and maintain CLAUDE.md files across repos |
106 changes: 106 additions & 0 deletions babysit-open-prs/SKILL.md
@@ -0,0 +1,106 @@
---
name: babysit-open-prs
description: "Scan all open PRs across tracked repos, triage them, check for scope drift, and spawn babysit-pr sub-agents for fixable ones. Use when: babysit all PRs, check all open PRs, nightly PR review."
Comment on lines +1 to +3
P1: Add babysit-open-prs to README index

This commit introduces a new root skill directory (babysit-open-prs/) but does not add it to README.md, which violates the repo convention that every skill must be listed for discovery/routing. Leaving the index stale can break contributor workflows and CI checks that validate README coverage when skills are added.


---

# Babysit Open PRs

Scan open PRs across tracked repos, triage each one (scope check + CI + reviews), and spawn `babysit-pr` sub-agents for PRs that need fixing. Report results.

## Step 1: Preflight

Run the preflight script to discover PRs that need attention:

```bash
bash ~/clawd/scripts/pr-preflight.sh
```

This scans bloom-invest/bloom, bloom-invest/investing-log, and exiao/skills, plus any other repos in the bloom-invest, prompt-pm, and exiao orgs that have open PRs authored by exiao. It handles skip state and deduplication.

If the script prints nothing, no PRs need attention. Reply NO_REPLY.

## Step 2: Triage (do this yourself, do NOT spawn sub-agents yet)

For each PR in the preflight output, gather context:

```bash
# CI and merge status
gh pr view <number> --repo <repo> --json title,body,statusCheckRollup,reviewDecision,headRefName,mergeable,commits

# Check results
gh pr checks <number> --repo <repo>

# Commit count
gh api "repos/<repo>/pulls/<number>/commits?per_page=100" --jq 'length'

# Changed files with per-file additions/deletions (for scope check)
gh pr view <number> --repo <repo> --json files --jq '.files[] | "\(.path) +\(.additions) -\(.deletions)"'
```

### Scope Check (per PR)

Compare the changed files and commit messages against the PR title and description:

1. Do the files relate to the PR's stated purpose?
2. Are there commits that introduce unrelated work?
3. Is there bulk formatting noise beyond the PR's actual changes?
4. Are multiple distinct features bundled together?
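The first question in this checklist can be partly mechanized. The following is a minimal sketch, not part of the skill: `flag_outside_scope` is a hypothetical helper, and the path prefixes passed to it are an assumption the triaging agent would derive from the PR title and description. It only surfaces candidates; deciding whether a flagged file is really unrelated still takes judgment.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print changed files that fall outside the paths
# a PR is expected to touch.
# Usage: <changed file list on stdin> | flag_outside_scope <prefix>...
flag_outside_scope() {
  local file prefix matched
  while IFS= read -r file; do
    matched=no
    for prefix in "$@"; do
      case "$file" in
        "$prefix"*) matched=yes; break ;;   # file lives under an expected prefix
      esac
    done
    [ "$matched" = "yes" ] || echo "$file"  # anything unmatched is a drift candidate
  done
}
```

One way to feed it is `gh pr diff <number> --repo <repo> --name-only | flag_outside_scope <prefix>`, which prints only the files outside the expected prefix.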

### Classify each PR:

- **CLEAN**: CI green, no unaddressed comments, scope is tight. No sub-agent needed.
- **FIXABLE**: CI failure with identifiable root cause, or unaddressed review comments pointing to real bugs, or merge conflicts with clear resolution. Scope is acceptable. Spawn a sub-agent.
- **SCOPE_DRIFT**: PR includes changes that don't match its description. Commits touch unrelated files, bundle multiple features, or include unnecessary formatting noise. Do NOT spawn a sub-agent. Report what's wrong and recommend how to fix (split PRs, revert commits, etc.).
- **SKIP**: Merge conflicts needing design decisions, architectural issues, or draft/WIP PRs. Note for the report but do NOT spawn a sub-agent.
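The four classes above can be read as an ordered decision. Here is a minimal sketch under stated assumptions: `classify_pr` and its arguments are hypothetical, the triage step is assumed to have already reduced each PR to simple flags, and checking SKIP before SCOPE_DRIFT is a choice made for this sketch rather than something the skill specifies. It also treats every non-clean, in-scope PR as FIXABLE, whereas the real rule additionally requires an identifiable root cause.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map triage findings to one of the four classes.
# Args (all assumptions, not part of the skill):
#   $1 ci           green | failing
#   $2 comments     number of unaddressed review comments
#   $3 scope        ok | drift
#   $4 needs_human  yes | no   (draft/WIP, design-level conflicts, architecture)
classify_pr() {
  local ci="$1" comments="$2" scope="$3" needs_human="$4"

  if [ "$needs_human" = "yes" ]; then
    echo "SKIP"            # never spawn a sub-agent
  elif [ "$scope" = "drift" ]; then
    echo "SCOPE_DRIFT"     # report and recommend, never auto-fix
  elif [ "$ci" = "green" ] && [ "$comments" -eq 0 ]; then
    echo "CLEAN"
  else
    echo "FIXABLE"         # the only class that spawns a sub-agent
  fi
}
```

For example, `classify_pr failing 0 ok no` prints `FIXABLE`, while `classify_pr failing 0 drift no` prints `SCOPE_DRIFT` because scope is checked before CI state.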

## Step 3: Spawn sub-agents for FIXABLE PRs only

For each fixable PR (max 5), spawn a sub-agent using the babysit-pr skill:

```
sessions_spawn({
  task: "Use the babysit-pr skill. PR #<number>, repo <repo>. Max cycles: 5. Reasons flagged: <reasons>. Running via nightly cron (use pr-mark-skip.sh if escalating). Read the repo's CLAUDE.md/AGENTS.md first.",
  cwd: "<local-path>",
runTimeoutSeconds here vs run_timeout_seconds in babysit-pr/SKILL.md:25. Pick one convention across both skills.

  run_timeout_seconds: 1800
})
```

Local paths:
- bloom-invest/bloom → ~/bloom
- bloom-invest/investing-log → ~/clawd/investing-log
- exiao/skills → ~/clawd/skills
- Fintary/ops-center → ~/fintary/ops-center
- Other repos → check ~/clawd/<repo> or ~/<repo>, clone to /tmp/<repo> if not found
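The path mapping above could be captured in a small lookup function. This is a sketch only: `resolve_local_path` is a hypothetical name, and it assumes that `<repo>` in the fallback rule means the bare repository name without the owner. It prints `/tmp/<repo>` when no checkout is found and leaves the actual clone to the caller.

```bash
#!/usr/bin/env bash
# Hypothetical helper: resolve the local checkout for "<owner>/<name>".
resolve_local_path() {
  local repo="$1"
  local name="${repo##*/}"   # assumption: fallback paths use the bare repo name

  case "$repo" in
    bloom-invest/bloom)         echo "$HOME/bloom" ;;
    bloom-invest/investing-log) echo "$HOME/clawd/investing-log" ;;
    exiao/skills)               echo "$HOME/clawd/skills" ;;
    Fintary/ops-center)         echo "$HOME/fintary/ops-center" ;;
    *)
      local candidate
      for candidate in "$HOME/clawd/$name" "$HOME/$name"; do
        if [ -d "$candidate" ]; then
          echo "$candidate"
          return 0
        fi
      done
      echo "/tmp/$name"          # not found locally: caller clones here
      ;;
  esac
}
```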

Wait for all sub-agents to complete.

## Step 4: Report

Do NOT send via the message tool. Output the summary as your reply. Cron delivery handles routing.

**Format:**

```
🔧 Nightly PR Babysit — [date]

[For each PR, one of:]

✅ PR #N (repo): title — clean
🔧 PR #N (repo): title — fixed (what was done)
⚠️ PR #N (repo): title — scope drift
   Description says: X
   Actually includes: Y, Z
   Recommendation: split/revert/remove
🚫 PR #N (repo): title — blocked (why)
⏭️ PR #N (repo): title — skipped (why)
```

If all PRs were already clean, keep it brief.

## Gotchas

- **Scope check is mandatory.** Every PR gets checked for drift, even if CI is green. A green CI on a bloated PR is still a problem.
- **Don't fix scope drift.** Splitting PRs, reverting commits, or removing files from a PR requires human judgment on what belongs where. Always escalate.
- **Max 5 sub-agents.** If more than 5 PRs are fixable, prioritize by: CI failures first, then review comments, then oldest.
- **Preflight script handles skip state.** If a PR was previously marked as skip (via pr-mark-skip.sh), it won't appear in the preflight output.
- **Sub-agents run babysit-pr, which includes its own scope check.** The triage-level scope check here is a quick pass; babysit-pr does a deeper per-commit analysis.
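
The "max 5 sub-agents" ordering can be sketched as a sort. This is an illustration under assumptions: `pick_fixable` and its input format are hypothetical, and "then oldest" is interpreted here as the tie-breaker within each reason group rather than as a third group of its own.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pick at most 5 fixable PRs to hand to sub-agents.
# Input (stdin): one PR per line as "<reason> <created-date> <pr-ref>",
# where reason is "ci", "review", or anything else, and the date is
# ISO 8601 so that lexical order equals chronological order.
pick_fixable() {
  awk '{
    # CI failures rank first, review comments second, everything else last
    rank = ($1 == "ci") ? 0 : (($1 == "review") ? 1 : 2)
    print rank, $2, $3
  }' | LC_ALL=C sort -k1,1n -k2,2 | head -n 5 | awk '{ print $3 }'
}
```

Given six candidates, the function keeps the two CI failures, then the review-comment PRs, then the oldest of the rest, and drops whatever falls past the fifth slot.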
82 changes: 66 additions & 16 deletions babysit-pr/SKILL.md
@@ -1,17 +1,17 @@
---
name: babysit-pr
description: "Monitor a PR until it's ready to merge. Watches CI, reads reviews, fixes issues, and repeats. Use when: babysit this PR, watch this PR, monitor PR, fix and watch PR, keep this PR green."
description: "Monitor a PR until it's ready to merge. Watches CI, reads reviews, checks scope, fixes issues, and repeats. Use when: babysit this PR, watch this PR, monitor PR, fix and watch PR, keep this PR green."
---

# Babysit PR

Monitor a single PR through its full lifecycle: wait for CI, read reviews, fix issues, push, repeat. Stop when the PR is clean (CI green, no unaddressed comments) or when you hit a wall that needs human input.
Monitor a single PR through its full lifecycle: check scope, wait for CI, read reviews, fix issues, push, repeat. Stop when the PR is clean (CI green, no unaddressed comments, scope is tight) or when you hit a wall that needs human input.

## Inputs

- **PR number** (required)
- **Repo** (optional, defaults to current repo via `gh repo view --json nameWithOwner -q '.nameWithOwner'`)
- **Parent session key** (optional, for sending progress updates to the parent agent via `sessions_send`)
- **Parent session key** (optional, for sending progress updates to the parent agent via `send_to_task`)
- **Max cycles** (optional, default 10. Each cycle = one CI wait + fix attempt)

## Spawning
@@ -44,7 +44,7 @@ if [ -z "$LOCAL_DIR" ]; then
  LOCAL_DIR="/tmp/$REPO_DIR"
fi

# Read project rules — these define lint commands, test runners, Python version, etc.
# Read project rules
for F in CLAUDE.md AGENTS.md; do
  [ -f "$LOCAL_DIR/$F" ] && cat "$LOCAL_DIR/$F"
done
@@ -59,6 +59,40 @@ cd "$WORKTREE"
git checkout -B "$BRANCH" "origin/$BRANCH" || true
```

## Scope Check (runs once, before the loop)

Before fixing anything, verify the PR's changes match its stated purpose. This catches accidental commits, formatting noise, and scope creep.

```bash
# Get PR metadata
gh pr view $PR --repo $REPO --json title,body,commits --jq '{title: .title, body: .body, commits: [.commits[].messageHeadline]}'

# Get changed files with per-file additions/deletions
gh pr view "$PR" --repo "$REPO" --json files --jq '.files[] | "\(.path) +\(.additions) -\(.deletions)"'

# Get each commit's message and file list (one API call per commit).
# --paginate covers PRs with more commits than fit on the first page.
for SHA in $(gh api --paginate "repos/$REPO/pulls/$PR/commits" --jq '.[].sha'); do
  echo "--- Commit ${SHA:0:8} ---"
  gh api "repos/$REPO/commits/$SHA" --jq '.commit.message, ([.files[].filename] | join(", "))'
done
```

**Evaluate:**

1. **File relevance:** Do all changed files relate to the PR title/description? Flag files that seem unrelated (e.g., a "fix login" PR that also reformats unrelated templates).
2. **Commit coherence:** Does each commit message align with the PR's purpose? Flag commits that introduce unrelated work.
3. **Formatting noise:** Flag bulk formatting changes (ruff, prettier, eslint --fix) applied beyond the files the PR actually needs to touch.
4. **Scope creep:** Are multiple distinct features or fixes bundled into one PR? Each PR should do one thing.

**If scope issues found:**

Report them with specifics (which files, which commits) and classify:
- **MINOR**: A stray formatting commit or one unrelated file. Note it but continue babysitting.
- **MAJOR**: The PR bundles multiple unrelated changes, has bulk formatting noise, or contains commits that contradict the description. **ESCALATE.** Do not auto-fix. Report what should be split out or reverted.

Include scope findings in every status report so the parent/user sees them.
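
The MINOR/MAJOR split can be approximated from simple counts. This is a rough sketch, not the skill's rule: `scope_severity` is a hypothetical helper, and the threshold (exactly one stray file or commit is MINOR, anything more is MAJOR) is an assumption read off the wording "a stray formatting commit or one unrelated file". Commits that contradict the description still need a human-style read and are not modeled here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn scope-check counts into a severity label.
#   $1 number of unrelated files
#   $2 number of unrelated commits
#   $3 bulk formatting noise present (yes | no)
scope_severity() {
  local files="$1" commits="$2" bulk="$3"
  local strays=$(( files + commits ))

  if [ "$bulk" = "yes" ] || [ "$strays" -gt 1 ]; then
    echo "MAJOR"   # escalate, do not auto-fix
  elif [ "$strays" -eq 1 ]; then
    echo "MINOR"   # note it, keep babysitting
  else
    echo "CLEAN"
  fi
}
```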

## The Loop

Repeat up to `max_cycles` times:
@@ -89,12 +123,11 @@ cd "$WORKTREE"
gh api --paginate "repos/$REPO/pulls/$PR/comments" | \
jq '.[] | {author: .user.login, path: .path, line: .line, body: .body, commit: .original_commit_id, created: .created_at}'

# 2. Issue comments — the main comment thread. Automated reviewers (claude-review, Codex) post here.
# Read FULL body, not truncated. This is where most actionable feedback lives.
# 2. Issue comments (automated reviewers post here)
gh api --paginate "repos/$REPO/issues/$PR/comments" | \
jq '.[] | {author: .user.login, body: .body, created: .created_at}'

# 3. Review verdicts — formal approve/request-changes. claude-review puts analysis in the body.
# 3. Review verdicts
gh api --paginate "repos/$REPO/pulls/$PR/reviews" | \
jq '.[] | {author: .user.login, state: .state, body: .body}'

@@ -151,23 +184,25 @@ After pushing (or deciding not to):
- There are still issues you plan to address next cycle

**Stop and report if:**
- PR is clean (CI green, no unaddressed comments)
- PR is clean (CI green, no unaddressed comments, scope is tight)
- You hit max_cycles
- All remaining issues need human input (escalate)
- You pushed a fix for the same issue twice and it still fails (circuit breaker)
- Scope check found MAJOR issues (escalate immediately, don't try to fix)
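
The circuit-breaker condition in this list needs some memory of what has already been tried. Below is one possible sketch, assuming each issue can be given a stable key such as a failing check name; `record_attempt`, `should_stop`, and the `ATTEMPT_LOG` file are hypothetical and not part of the skill.

```bash
#!/usr/bin/env bash
# Hypothetical circuit breaker: record each pushed fix under an issue key
# and trip once the same issue has already been attempted twice.
ATTEMPT_LOG="$(mktemp)"   # assumption: one log file per babysit run

record_attempt() {        # $1 = stable key for the issue being fixed
  printf '%s\n' "$1" >> "$ATTEMPT_LOG"
}

should_stop() {           # exit 0 when the breaker should trip for $1
  local count
  # -x whole-line match, -F fixed string, -c count matching lines
  count="$(grep -cxF -- "$1" "$ATTEMPT_LOG")"
  [ "$count" -ge 2 ]
}
```

A loop would call `should_stop "<issue-key>"` before attempting a fix and `record_attempt "<issue-key>"` after each push, escalating instead of pushing a third time.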

## Reporting

Send progress updates to the parent agent via `sessions_send`. Do NOT try to send messages to Signal/Slack/etc directly (sub-agents don't have channel access). The parent agent handles delivery to the user.
Send progress updates to the parent agent via `send_to_task`. Do NOT try to send messages to Signal/Slack/etc directly (sub-agents don't have channel access). The parent agent handles delivery to the user.

If a parent session key was provided, use:
```
sessions_send(sessionKey="<parent_session_key>", message="<status update>")
send_to_task(sessionKey="<parent_session_key>", message="<status update>")
P2: Standardize parent update call naming

The reporting instructions now show send_to_task(...), but this skill still states progress is sent via sessions_send, creating conflicting guidance in one section. That ambiguity can cause sub-agents to use the wrong API and silently miss parent status updates; the doc should use one API name consistently.


```

If no parent session key was provided, include status in your final output text (the auto-announce will deliver it).

**When to report:**
- After scope check (always, even if clean)
- After each fix push (brief: what was fixed)
- When escalating (what needs human input and why)
- When the PR is ready (final status)
@@ -177,21 +212,34 @@ If no parent session key was provided, include status in your final output text
```
🔧 PR #{number} ({repo}) — Cycle {N}/{max}

Scope: ✅ clean | ⚠️ minor (details) | 🚫 major (details)
Fixed: <what you fixed>
Waiting: <what CI is running>
Needs attention: <what you can't fix and why>
Status: <monitoring | ready | blocked>
Status: <monitoring | ready | blocked | scope-drift>
```

**When PR is ready:**
```
✅ PR #{number} ({repo}) — Ready to merge

Scope: ✅ changes match description
CI: all green
Reviews: all addressed
Commits: {count}
```

**When scope drift detected:**
```
⚠️ PR #{number} ({repo}) — Scope Drift

Description says: <what PR claims to do>
Actually includes:
- <unrelated file/commit 1>
- <unrelated file/commit 2>
Recommendation: <split into N PRs | revert commits X,Y | remove files A,B>
```

## Cleanup

When done (success or giving up):
@@ -203,14 +251,15 @@ git -C "$LOCAL_DIR" worktree remove "$WORKTREE" --force 2>/dev/null

## Gotchas

- **Read CLAUDE.md/AGENTS.md first.** Every repo has different lint, test, and build commands. Never assume. Read the project's config files during Setup and use those commands throughout.
- **Stale review comments:** `original_commit_id` on inline comments refers to the commit when the comment was made. If HEAD has moved past it, the issue may already be fixed. Always check the current code before acting.
- **Read CLAUDE.md/AGENTS.md first.** Every repo has different lint, test, and build commands. Never assume.
- **Stale review comments:** `original_commit_id` on inline comments refers to the commit when the comment was made. If HEAD has moved past it, the issue may already be fixed.
- **claude-review sticky comments:** These appear as issue comments from the `claude` user. They re-run on every push. Don't try to "fix" informational observations.
- **GitHub Actions GITHUB_TOKEN suppression:** Pushes from inside a GitHub Actions job using the default `GITHUB_TOKEN` don't trigger other workflow runs. This does NOT apply to local `gh` CLI pushes (which use your OAuth token). If CI doesn't start after your push from inside Actions, close+reopen the PR to kick off checks.
- **Worktree branch conflicts:** `git worktree add` fails if the branch is already checked out somewhere. The Setup uses `origin/$BRANCH` to avoid this, then creates a local tracking branch with `checkout -B`.
- **GitHub Actions GITHUB_TOKEN suppression:** Pushes from inside a GitHub Actions job using the default `GITHUB_TOKEN` don't trigger other workflow runs. This does NOT apply to local `gh` CLI pushes.
- **Worktree branch conflicts:** `git worktree add` fails if the branch is already checked out somewhere. The Setup uses `origin/$BRANCH` to avoid this.
- **Sub-agents can introduce unintended refactors.** Always diff `$BRANCH` against `origin/$BRANCH` before pushing to confirm only the intended fix is included.
- **Frontend lint may auto-fix unrelated files.** Run lint only on the files you changed, not the whole project.
- **Check ALL three comment sources.** `gh pr view --json reviews` only shows formal review submissions. Automated reviewers like claude-review often post as issue comments (`/issues/$PR/comments`), not formal reviews. Always check all three: inline review comments, issue comments, and review verdicts.
- **Check ALL three comment sources.** `gh pr view --json reviews` only shows formal review submissions. Automated reviewers often post as issue comments.
- **Scope check is not optional.** Even if the caller says "just fix CI," run the scope check. Catching drift early prevents wasted cycles fixing code that shouldn't be in the PR.

## Do NOT

@@ -220,3 +269,4 @@ git -C "$LOCAL_DIR" worktree remove "$WORKTREE" --force 2>/dev/null
- Make changes unrelated to the PR's purpose
- Fix more than one issue per commit
- Retry the same fix approach twice
- Auto-fix scope drift (always escalate it)