BMB — Be My Butler

Multi-agent orchestration for Claude Code

Other AI coding tools optimize for speed. BMB optimizes for correctness.

Why BMB?

Solo AI coding assistants are fast — but they hallucinate, skip edge cases, and approve their own work. BMB fixes this by running multiple specialized agents that challenge, verify, and compress each other's output.

Problem	BMB's Solution
Self-review bias	Cross-model blind verification — a different model reviews without seeing the original reasoning
Design tunnel vision	Council debate with AI challengers arguing alternatives before a single line is written
Context explosion	3-layer compression protocol keeps token budgets tight across long pipelines
"Works for me" testing	Divergent framing — verifier receives a deliberately reworded spec to catch assumption leaks
Lost knowledge	FTS5 knowledge base + auto-learning promotes recurring lessons automatically

BMB doesn't replace your judgment — it gives you 5 specialized agents who challenge and verify each other's work before you decide.
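To make the "FTS5 knowledge base" row more concrete, here is a minimal sketch of full-text lesson search using Python's built-in `sqlite3` module. The table name `lessons`, its columns, and the sample rows are hypothetical; BMB's actual schema is not shown in this README and may differ. The sketch also assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 table and load (title, body) lessons into it."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one full-text table with two indexed columns.
    conn.execute("CREATE VIRTUAL TABLE lessons USING fts5(title, body)")
    conn.executemany("INSERT INTO lessons (title, body) VALUES (?, ?)", rows)
    return conn


def search(conn, query, limit=5):
    """Return titles of matching lessons, best match first."""
    cur = conn.execute(
        # In FTS5, ordering by the hidden 'rank' column sorts by relevance.
        "SELECT title FROM lessons WHERE lessons MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [title for (title,) in cur]


if __name__ == "__main__":
    conn = build_index([
        ("worktree cleanup", "Remove stale git worktrees before a session."),
        ("spec rewording", "Verifier prompts use a reworded spec."),
    ])
    print(search(conn, "reworded"))
```

A lesson that keeps turning up in searches like this is the kind of entry the auto-learning layer would be expected to promote.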

Quickstart

Prerequisites: Claude Code CLI, tmux, python3, sqlite3, git

# 1. Install BMB
curl -fsSL https://raw.githubusercontent.com/project820/be-my-butler/main/install.sh | bash

# 2. Verify installation
bmb doctor

# 3. Run your first pipeline
#    Open Claude Code in any project and type:
/BMB

That's it. BMB registers its agents, skills, and scripts into your Claude Code environment. Type /BMB in any project to start the full 12-step pipeline.

The 12-Step Pipeline

A full /BMB run walks through the 12 stages below, plus an Analyst pass (step 10.5) between Simplify and Retrospective. The selected recipe determines which steps actually run: lighter workflows skip or shorten some of them.

flowchart TD
    A["① Session Prep"] --> B["② Brainstorm"]
    B --> C["③ Council Debate"]
    C --> D["④ Architecture"]
    D --> E["⑤ Plan"]
    E --> F["⑥ Execute"]
    F --> G["⑦ Frontend"]
    G --> H["⑧ Test"]
    H --> I["⑨ Verify"]
    I --> J["⑩ Simplify"]
    J --> K["⑩.⑤ Analyst"]
    K --> L["⑪ Retrospective"]
    L --> M["⑫ Cleanup"]

    style A fill:#1a1a2e,stroke:#e94560,color:#fff
    style B fill:#1a1a2e,stroke:#e94560,color:#fff
    style C fill:#16213e,stroke:#0f3460,color:#fff
    style D fill:#16213e,stroke:#0f3460,color:#fff
    style E fill:#16213e,stroke:#0f3460,color:#fff
    style F fill:#0f3460,stroke:#53a8b6,color:#fff
    style G fill:#0f3460,stroke:#53a8b6,color:#fff
    style H fill:#0f3460,stroke:#53a8b6,color:#fff
    style I fill:#533483,stroke:#e94560,color:#fff
    style J fill:#533483,stroke:#e94560,color:#fff
    style K fill:#1a3a2e,stroke:#22c55e,color:#fff
    style L fill:#1a3a2e,stroke:#22c55e,color:#fff
    style M fill:#533483,stroke:#e94560,color:#fff

Step	Agent	What Happens
1	Lead	Session Prep — loads `session-prep.md`, restores context from prior sessions
2	Lead	Brainstorm — generates divergent ideas with blind framing
3	Lead	Council Debate — multi-round structured argument; Lead decides
4	Architect	Architecture — produces file tree, interface contracts, dependency map
5	Lead	Plan — converts architecture into ordered execution steps
6	Lead	Execute — implements changes in an isolated git worktree
7	Lead	Frontend — UI/UX work (skipped for backend-only recipes)
8	Tester	Test — writes and runs tests with coverage targets
9	Verifier	Verify — blind review with divergent spec framing
10	Lead	Simplify — removes dead code, flattens unnecessary abstractions
10.5	Analyst	Retrospective Analysis — queries `analytics.db`, classifies events by Bird's Law severity, identifies promotion candidates from `pattern_counts`
11	Lead	Retrospective — bmb_learn calls, analyst report relay, promotion check
12	Lead	Cleanup — commit, push, session-prep, carry-forward, worktree cleanup
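Step 6 implements changes in an isolated git worktree and step 12 cleans it up. The sketch below shows one way such commands could be assembled. The directory layout (a sibling directory per agent) and the `bmb/<agent>` branch naming are assumptions made for illustration, not BMB's documented behavior; only the `git worktree add` and `git worktree remove` subcommands themselves are standard git.

```python
from pathlib import Path


def worktree_commands(repo_root, agent, base="HEAD"):
    """Return (create, remove) git argv lists for one agent's worktree."""
    root = Path(repo_root)
    # Hypothetical layout: <repo>-<agent> next to the main checkout.
    path = root.parent / f"{root.name}-{agent}"
    # Hypothetical branch naming scheme.
    branch = f"bmb/{agent}"
    create = ["git", "-C", str(root), "worktree", "add",
              "-b", branch, str(path), base]
    remove = ["git", "-C", str(root), "worktree", "remove", str(path)]
    return create, remove


if __name__ == "__main__":
    create, remove = worktree_commands("/tmp/app", "tester")
    print(" ".join(create))
    print(" ".join(remove))
```

The function only builds argument lists, so it can be inspected or logged before anything runs; passing each list to `subprocess.run(..., check=True)` would execute it.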

Key Differentiators

Blind Verification

The Verifier agent reviews code using a deliberately reworded specification. If the verifier finds issues the implementer missed, you know the solution has assumption leaks, not just bugs.

Council Debate

Before any code is written, the Consultant and Lead engage in multi-round structured debate. The Consultant proposes alternatives, plays devil's advocate, and stress-tests assumptions. The Lead makes the final call, but only after hearing the opposition.

Worktree Isolation

Each agent that writes code operates in its own git worktree, which allows parallel execution without merge conflicts. Changes are reviewed and merged only after verification passes.

3-Tier Auto-Learning

Lessons flow upward: project-local learnings (per-repo) → global learnings (cross-project) → CLAUDE.md promotion (permanent rules). Recurring mistakes automatically become enforced rules.

3-Layer Context Compression

Long pipelines bleed context. BMB compresses at three layers: intra-step (within each agent), inter-step (handoff summaries), and session-level (`session-prep.md` for continuity across conversations).

Configurable Recipes

Not every task needs 12 steps. Pick a recipe to skip what you don't need: a bugfix skips brainstorm and council; a research task skips execution entirely.

Analytics Layer + Bird's Law Severity

Every pipeline run emits structured telemetry to `analytics.db`. The Analyst (Step 10.5) queries `pattern_counts` to find recurring failures and classifies events by Bird's Law severity (critical / warn / info). Promotion candidates surface automatically after 2+ occurrences.

Context7 for Implementation

The Architect agent queries live library documentation via Context7 MCP before writing code. No stale API assumptions: always write against the current SDK.
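The promotion rule described above ("2+ occurrences" in `pattern_counts`) can be illustrated with a short query. This is a sketch under stated assumptions: the column names `pattern`, `severity`, and `occurrences` and the sample rows are invented for the example, since the README names the table but not its schema.

```python
import sqlite3

PROMOTION_THRESHOLD = 2  # "2+ occurrences", as stated in the README


def promotion_candidates(conn, threshold=PROMOTION_THRESHOLD):
    """Return (pattern, severity, occurrences) rows at or above the threshold."""
    cur = conn.execute(
        "SELECT pattern, severity, occurrences FROM pattern_counts "
        "WHERE occurrences >= ? ORDER BY occurrences DESC, pattern",
        (threshold,),
    )
    return cur.fetchall()


def demo_db():
    """Build an in-memory database with a hypothetical pattern_counts schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pattern_counts ("
        "pattern TEXT PRIMARY KEY, severity TEXT, occurrences INTEGER)"
    )
    conn.executemany(
        "INSERT INTO pattern_counts VALUES (?, ?, ?)",
        [
            ("missing-null-check", "warn", 3),
            ("flaky-test", "info", 1),
            ("unpinned-dependency", "critical", 2),
        ],
    )
    return conn


if __name__ == "__main__":
    for row in promotion_candidates(demo_db()):
        print(row)
```

With the sample data, the one-off `flaky-test` entry stays below the threshold while the two recurring patterns are returned as candidates.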

Recipes

Recipe	Steps Used	Best For
`feature`	All 12	New features, large changes
`bugfix`	1 → 5 → 6 → 8 → 9 → 10 → 11 → 12	Bug investigation and fix
`refactor`	1 → 4 → 5 → 6 → 8 → 9 → 10 → 11 → 12	Code restructuring
`research`	1 → 2 → 3 → 11 → 12	Exploration, spikes, design decisions
`review`	1 → 9 → 11 → 12	Code review only
`infra`	1 → 4 → 5 → 6 → 8 → 9 → 11 → 12	CI/CD, tooling, config changes
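The recipe table maps naturally onto a lookup structure. The sketch below copies the step numbers and names from the tables in this README and shows how a recipe could be resolved into the steps it runs and the steps it skips. The function names are hypothetical, and step 10.5 is left out because the recipe table does not list it.

```python
STEP_NAMES = {
    1: "Session Prep", 2: "Brainstorm", 3: "Council Debate",
    4: "Architecture", 5: "Plan", 6: "Execute", 7: "Frontend",
    8: "Test", 9: "Verify", 10: "Simplify", 11: "Retrospective",
    12: "Cleanup",
}

RECIPES = {
    "feature": list(STEP_NAMES),  # "All 12"
    "bugfix": [1, 5, 6, 8, 9, 10, 11, 12],
    "refactor": [1, 4, 5, 6, 8, 9, 10, 11, 12],
    "research": [1, 2, 3, 11, 12],
    "review": [1, 9, 11, 12],
    "infra": [1, 4, 5, 6, 8, 9, 11, 12],
}


def steps_for(recipe):
    """Return the ordered step names a recipe runs."""
    if recipe not in RECIPES:
        raise ValueError(f"unknown recipe: {recipe!r}")
    return [STEP_NAMES[n] for n in RECIPES[recipe]]


def skipped_by(recipe):
    """Return the step names a recipe leaves out."""
    if recipe not in RECIPES:
        raise ValueError(f"unknown recipe: {recipe!r}")
    used = set(RECIPES[recipe])
    return [name for n, name in STEP_NAMES.items() if n not in used]


if __name__ == "__main__":
    print(steps_for("review"))
    print(skipped_by("bugfix"))
```

Keeping the recipes as plain data makes it easy to check claims such as "a bugfix skips brainstorm and council" directly against the table.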

Slash Commands

Command	Description
`/BMB`	Full 12-step pipeline — select a recipe interactively
`/BMB-brainstorm`	Brainstorm + Council only — explore ideas without executing
`/BMB-refactoring`	Refactor recipe shortcut — skip brainstorm, go straight to architecture
`/BMB-setup`	First-time project setup — generates `session-prep.md` and config
`/BMB-status`	Project/idea dashboard — stale idea nudges, lifecycle overview

The 5 Agents

Agent	Role	Model
Architect	System design, file tree, contracts. Queries Context7 for live library docs.	Claude
Tester	Test writing and execution	Claude
Verifier	Blind review with divergent spec framing	Claude
Writer	Documentation generation	Claude
Analyst	Retrospective analytics: Bird's Law severity classification, `pattern_counts` promotion candidates	Claude (bypassPermissions, read-only)

The Lead agent orchestrates all pipeline steps that don't have a dedicated specialist agent.
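The Verifier's "blind review with divergent spec framing" can be pictured as a handoff that drops the implementer's reasoning and swaps in a reworded spec. The types and function below are hypothetical and only illustrate the idea; they are not BMB's internal data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImplementerOutput:
    spec: str       # original wording of the task
    reasoning: str  # implementer's notes; must not reach the verifier
    diff: str       # the change under review


@dataclass(frozen=True)
class VerifierInput:
    spec: str  # deliberately reworded spec
    diff: str


def blind_handoff(output, reworded_spec):
    """Build the verifier's input: reworded spec plus diff, reasoning dropped."""
    if reworded_spec.strip() == output.spec.strip():
        raise ValueError("reworded spec must differ from the original wording")
    return VerifierInput(spec=reworded_spec, diff=output.diff)


if __name__ == "__main__":
    out = ImplementerOutput(
        spec="Return 404 for unknown ids",
        reasoning="I assumed ids are always integers",
        diff="+ raise NotFound(id)",
    )
    print(blind_handoff(out, "Unknown identifiers must yield a not-found response"))
```

Because `VerifierInput` has no `reasoning` field at all, the implementer's assumptions cannot leak into the review by accident.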

Requirements

Dependency	Required	Notes
Claude Code CLI	Yes	Core runtime
`tmux`	Yes	Agent session management
`python3`	Yes	Script tooling
`sqlite3`	Yes	FTS5 knowledge base
`git`	Yes	Worktree isolation

Run bmb doctor after installation to verify all dependencies.
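For a rough idea of what a dependency check involves, the sketch below looks up each required tool on `PATH`. It is not the implementation of `bmb doctor`, whose checks may differ, and the executable name `claude` for the Claude Code CLI is an assumption.

```python
import shutil

# Executable names are assumptions based on the requirements table.
REQUIRED_TOOLS = ["claude", "tmux", "python3", "sqlite3", "git"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH.

    'which' is injectable so the check can be exercised without touching
    the real environment.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required tools found")
```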

Interactive Architecture Guide

Explore the full pipeline visually:

View Interactive Docs →

Mobile-optimized summary pages (7-card vertical scroll, 4 locales):

Language	URL
English	m.html
한국어	m.ko.html
日本語	m.ja.html
繁體中文	m.zh-TW.html

Project Structure

~/Projects/bmb/              # Source of truth (GitHub repo)
├── skills/bmb*/             # 5 slash command skills
├── agents/bmb-*.md          # 5 agent definitions
├── bmb-system/
│   ├── config/              # defaults.json (v2)
│   ├── scripts/             # bmb-config.sh, bmb-ideas.sh, bmb-analytics.sh, bmb-learn.sh,
│   │                        # bmb-external-incidents.sh, knowledge-index.sh, knowledge-search.sh
│   └── plans/               # Version release plans
└── docs/                    # Architecture, configuration, troubleshooting

~/.claude/                   # Runtime (symlinks to repo)
├── skills/bmb* → repo       # Symlinked skills
├── agents/ → repo            # Symlinked agents
└── bmb-system/ → repo        # Symlinked runtime

.bmb/                        # Per-project runtime directory
├── config.json              # Project-local config (merged from 3 layers)
├── analytics/
│   └── analytics.db         # SQLite: sessions, events, pattern_counts
├── handoffs/
│   └── analyst-report.md    # Step 10.5 output
└── sessions/{id}/
    ├── carry-forward.md     # Atomic session continuity
    └── plan-review.md       # Cross-model plan critique
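The comment on `config.json` says the project config is merged from 3 layers. The README does not name the layers or the merge rules, so the sketch below assumes a common arrangement: later layers override earlier ones, and nested objects are merged key by key. The layer contents and key names are hypothetical.

```python
def deep_merge(base, override):
    """Return a new dict where 'override' wins, merging nested dicts key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_layers(*layers):
    """Merge config layers in order; the last layer has the highest priority."""
    result = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


if __name__ == "__main__":
    # Hypothetical layers: shipped defaults, user-level settings, project settings.
    defaults = {"recipe": "feature", "analytics": {"enabled": True, "threshold": 2}}
    user = {"analytics": {"threshold": 3}}
    project = {"recipe": "bugfix"}
    print(merge_layers(defaults, user, project))
```

Merging nested objects instead of replacing them lets a higher layer change one setting without restating its siblings.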

What's New in v0.4.0

6-Feature Upgrade — cross-model fix, agent discipline, visual brainstorming, session continuity, parallel sessions, and Monitor watchdog.

Capability	Description
OMX Cross-Model Fix	Replaced raw `codex exec` with an MCP-disabled invocation. This eliminates the 100% timeout rate that MCP server loading had caused.
Superpowers Discipline	Verification gates, debugging discipline, TDD checklists, and YAGNI principles embedded directly in agent prompts. All agents upgraded to Opus 4.6 (1M context).
Visual Brainstorming	Browser-based visual companion for Step 2 — mockups, architecture diagrams, trade-off matrices via Superpowers server.
Session-End Prep	Step 12 auto-generates `next-session-plan.md` with completed items, follow-ups, and a one-line start prompt.
Parallel Sessions	`SESSION_MODE` enum (standalone/sub/consolidation) for safe concurrent pipelines with track splitting and consolidation prompts.
Monitor Watchdog	Haiku Monitor enhanced with pane sweep for orphaned processes and nudge escalation for stalled agents.
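The three `SESSION_MODE` values listed above could be modeled as an enum so that typos are rejected early. This is an illustrative sketch: the class and function names are hypothetical, and defaulting to `standalone` when no value is given is an assumption, not documented behavior.

```python
from enum import Enum


class SessionMode(Enum):
    STANDALONE = "standalone"
    SUB = "sub"
    CONSOLIDATION = "consolidation"


def parse_session_mode(raw):
    """Parse a SESSION_MODE string; unknown values raise ValueError."""
    if raw is None or raw.strip() == "":
        # Assumption: an unset mode means a single, standalone pipeline.
        return SessionMode.STANDALONE
    return SessionMode(raw.strip().lower())


if __name__ == "__main__":
    print(parse_session_mode("sub"))
    print(parse_session_mode(None))
```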

Contributing

Contributions are welcome. Please read the Contributing Guide before submitting a PR.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Run the checks: `bmb doctor` in your shell, then `/BMB-setup` inside Claude Code
4. Commit your changes
5. Open a Pull Request

License

MIT — use it however you want.

Built with obstinate attention to correctness.

Report Bug · Request Feature · Discussions
