project820/be-my-butler

[EN] | 한국어 | 日本語 | 繁體中文

BMB — Be My Butler

Multi-agent orchestration for Claude Code

Other AI coding tools optimize for speed. BMB optimizes for correctness.


Why BMB?

Solo AI coding assistants are fast — but they hallucinate, skip edge cases, and approve their own work. BMB fixes this by running multiple specialized agents that challenge, verify, and compress each other's output.

| Problem | BMB's Solution |
|---|---|
| Self-review bias | Cross-model blind verification: a different model reviews without seeing the original reasoning |
| Design tunnel vision | Council debate with AI challengers arguing alternatives before a single line is written |
| Context explosion | 3-layer compression protocol keeps token budgets tight across long pipelines |
| "Works for me" testing | Divergent framing: the verifier receives a deliberately reworded spec to catch assumption leaks |
| Lost knowledge | FTS5 knowledge base + auto-learning promotes recurring lessons automatically |

BMB doesn't replace your judgment — it gives you 5 specialized agents who challenge and verify each other's work before you decide.


Quickstart

Prerequisites: Claude Code CLI, tmux, python3, sqlite3, git

```bash
# 1. Install BMB
curl -fsSL https://raw.githubusercontent.com/project820/be-my-butler/main/install.sh | bash

# 2. Verify installation
bmb doctor

# 3. Run your first pipeline
#    Open Claude Code in any project and type:
/BMB
```

That's it. BMB registers its agents, skills, and scripts into your Claude Code environment. Type /BMB in any project to start the full 12-step pipeline.


The 12-Step Pipeline

Every /BMB run walks through these stages. Steps adapt based on the selected recipe — some steps are skipped or shortened for lighter workflows.

```mermaid
flowchart TD
    A["① Session Prep"] --> B["② Brainstorm"]
    B --> C["③ Council Debate"]
    C --> D["④ Architecture"]
    D --> E["⑤ Plan"]
    E --> F["⑥ Execute"]
    F --> G["⑦ Frontend"]
    G --> H["⑧ Test"]
    H --> I["⑨ Verify"]
    I --> J["⑩ Simplify"]
    J --> K["⑩.⑤ Analyst"]
    K --> L["⑪ Retrospective"]
    L --> M["⑫ Cleanup"]

    style A fill:#1a1a2e,stroke:#e94560,color:#fff
    style B fill:#1a1a2e,stroke:#e94560,color:#fff
    style C fill:#16213e,stroke:#0f3460,color:#fff
    style D fill:#16213e,stroke:#0f3460,color:#fff
    style E fill:#16213e,stroke:#0f3460,color:#fff
    style F fill:#0f3460,stroke:#53a8b6,color:#fff
    style G fill:#0f3460,stroke:#53a8b6,color:#fff
    style H fill:#0f3460,stroke:#53a8b6,color:#fff
    style I fill:#533483,stroke:#e94560,color:#fff
    style J fill:#533483,stroke:#e94560,color:#fff
    style K fill:#1a3a2e,stroke:#22c55e,color:#fff
    style L fill:#1a3a2e,stroke:#22c55e,color:#fff
    style M fill:#533483,stroke:#e94560,color:#fff
```

| Step | Agent | What Happens |
|---|---|---|
| 1 | Lead | Session Prep: loads session-prep.md, restores context from prior sessions |
| 2 | Lead | Brainstorm: generates divergent ideas with blind framing |
| 3 | Lead | Council Debate: multi-round structured argument; Lead decides |
| 4 | Architect | Architecture: produces file tree, interface contracts, dependency map |
| 5 | Lead | Plan: converts architecture into ordered execution steps |
| 6 | Lead | Execute: implements changes in an isolated git worktree |
| 7 | Lead | Frontend: UI/UX work (skipped for backend-only recipes) |
| 8 | Tester | Test: writes and runs tests with coverage targets |
| 9 | Verifier | Verify: blind review with divergent spec framing |
| 10 | Lead | Simplify: removes dead code, flattens unnecessary abstractions |
| 10.5 | Analyst | Retrospective Analysis: queries analytics.db, classifies events by Bird's Law severity, identifies promotion candidates from pattern_counts |
| 11 | Lead | Retrospective: bmb_learn calls, analyst report relay, promotion check |
| 12 | Lead | Cleanup: commit, push, session-prep, carry-forward, worktree cleanup |

Key Differentiators

Blind Verification

The Verifier agent reviews code using a deliberately reworded specification. If the verifier finds issues the implementer missed, you know the solution has assumption leaks — not just bugs.
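The information boundary described here can be sketched in a few lines. This is only an illustration: the names `ImplementerHandoff`, `VerifierBrief`, and `build_verifier_brief` are hypothetical and not taken from BMB's code, and the reworded spec (which a model would normally produce) is passed in as a plain string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImplementerHandoff:
    """Everything the implementing agent produced (hypothetical shape)."""
    original_spec: str
    reasoning: str  # the implementer's notes; must not reach the verifier
    diff: str       # the code change under review


@dataclass(frozen=True)
class VerifierBrief:
    """What the verifier is allowed to see (hypothetical shape)."""
    reworded_spec: str
    diff: str


def build_verifier_brief(handoff: ImplementerHandoff, reworded_spec: str) -> VerifierBrief:
    """Pass on only the diff and an independently reworded spec."""
    if reworded_spec.strip() == handoff.original_spec.strip():
        raise ValueError("spec must be reworded before verification")
    return VerifierBrief(reworded_spec=reworded_spec, diff=handoff.diff)


handoff = ImplementerHandoff(
    original_spec="Return the 10 most recent orders.",
    reasoning="Sorted by created_at descending, then sliced.",
    diff="+ return sorted(orders, key=by_created, reverse=True)[:10]",
)
brief = build_verifier_brief(handoff, "Give back at most ten orders, newest first.")
print(brief)
```

Because `VerifierBrief` has no field for the implementer's reasoning or the original wording, the verifier cannot inherit the implementer's assumptions by accident.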

Council Debate

Before any code is written, the Consultant and Lead engage in multi-round structured debate. The Consultant proposes alternatives, plays devil's advocate, and stress-tests assumptions. The Lead makes the final call — but only after hearing the opposition.

Worktree Isolation

Each agent that writes code operates in its own git worktree, so agents can run in parallel without overwriting each other's working files. Changes are reviewed and merged only after verification passes.
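As a rough sketch of what per-agent isolation could look like with plain git, the helpers below build `git worktree add` and `git worktree remove` command lines. The sibling-directory layout and the `bmb/<agent>/<session>` branch naming are assumptions made for illustration; BMB's own scripts may organize worktrees differently.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def worktree_path(repo: Path, agent: str, session_id: str) -> Path:
    """Place the worktree next to the repo (assumed layout, not BMB's)."""
    return repo.parent / f"{repo.name}-{agent}-{session_id}"


def worktree_add_cmd(repo: Path, agent: str, session_id: str) -> list[str]:
    """Build `git worktree add -b <branch> <path>` for one agent."""
    branch = f"bmb/{agent}/{session_id}"  # hypothetical branch naming
    path = worktree_path(repo, agent, session_id)
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path)]


def worktree_remove_cmd(repo: Path, agent: str, session_id: str) -> list[str]:
    """Build `git worktree remove <path>` for cleanup after merging."""
    path = worktree_path(repo, agent, session_id)
    return ["git", "-C", str(repo), "worktree", "remove", str(path)]


def run(cmd: list[str]) -> None:
    """Execute a command built above; raises if git exits non-zero."""
    subprocess.run(cmd, check=True)


# Only print here; call run(...) inside a real repository to execute.
print(worktree_add_cmd(Path("/tmp/demo"), "tester", "s1"))
print(worktree_remove_cmd(Path("/tmp/demo"), "tester", "s1"))
```

Each worktree gets its own branch and directory, so two agents never share an index or working files even though they share one object store.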

3-Tier Auto-Learning

Lessons flow upward: project-local learnings (per-repo) → global learnings (cross-project) → CLAUDE.md promotion (permanent rules). Recurring mistakes automatically become enforced rules.

3-Layer Context Compression

Long pipelines bleed context. BMB compresses at three layers: intra-step (within each agent), inter-step (handoff summaries), and session-level (session-prep.md for continuity across conversations).

Configurable Recipes

Not every task needs 12 steps. Pick a recipe to skip what you don't need — a bugfix skips brainstorm and council; a research task skips execution entirely.

Analytics Layer + Bird's Law Severity

Every pipeline run emits structured telemetry to analytics.db. The Analyst (Step 10.5) queries pattern_counts to find recurring failures and classifies events by Bird's Law severity (critical / warn / info). Promotion candidates surface automatically after 2+ occurrences.
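The promotion rule can be illustrated with an in-memory SQLite database. The table name `pattern_counts`, the three severity levels, and the 2+ threshold come from the description above; the column names and sample rows are assumptions, since the real schema of analytics.db is not shown here.

```python
import sqlite3

# In-memory stand-in for .bmb/analytics/analytics.db
conn = sqlite3.connect(":memory:")

# Column names are hypothetical; only the table name comes from the README.
conn.execute(
    """
    CREATE TABLE pattern_counts (
        pattern     TEXT PRIMARY KEY,
        severity    TEXT NOT NULL CHECK (severity IN ('critical', 'warn', 'info')),
        occurrences INTEGER NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO pattern_counts VALUES (?, ?, ?)",
    [
        ("missing-null-check", "critical", 3),
        ("flaky-test-retry", "warn", 2),
        ("long-function", "info", 1),
    ],
)

PROMOTION_THRESHOLD = 2  # "2+ occurrences"


def promotion_candidates(db: sqlite3.Connection) -> list:
    """Return patterns seen often enough to promote, most severe first."""
    cursor = db.execute(
        """
        SELECT pattern, severity, occurrences
        FROM pattern_counts
        WHERE occurrences >= ?
        ORDER BY CASE severity
                     WHEN 'critical' THEN 0
                     WHEN 'warn' THEN 1
                     ELSE 2
                 END,
                 occurrences DESC
        """,
        (PROMOTION_THRESHOLD,),
    )
    return cursor.fetchall()


candidates = promotion_candidates(conn)
print(candidates)
```

With the sample rows, the single-occurrence `long-function` pattern is filtered out and the remaining two are ordered by severity.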

Context7 for Implementation

The Architect agent queries live library documentation via Context7 MCP before writing code. No stale API assumptions — always write against the current SDK.


Recipes

| Recipe | Steps Used | Best For |
|---|---|---|
| feature | All 12 | New features, large changes |
| bugfix | 1 → 5 → 6 → 8 → 9 → 10 → 11 → 12 | Bug investigation and fix |
| refactor | 1 → 4 → 5 → 6 → 8 → 9 → 10 → 11 → 12 | Code restructuring |
| research | 1 → 2 → 3 → 11 → 12 | Exploration, spikes, design decisions |
| review | 1 → 9 → 11 → 12 | Code review only |
| infra | 1 → 4 → 5 → 6 → 8 → 9 → 11 → 12 | CI/CD, tooling, config changes |
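The recipe table maps naturally onto a lookup structure. The sketch below encodes it as data and derives which steps each recipe skips. The names `PIPELINE`, `RECIPES`, `steps_for`, and `skipped` are illustrative only, and step 10.5 is left out because the table does not say which recipes include it.

```python
# Step numbers and names as listed in the pipeline table (10.5 omitted).
PIPELINE = {
    1: "Session Prep",
    2: "Brainstorm",
    3: "Council Debate",
    4: "Architecture",
    5: "Plan",
    6: "Execute",
    7: "Frontend",
    8: "Test",
    9: "Verify",
    10: "Simplify",
    11: "Retrospective",
    12: "Cleanup",
}

# Step sequences copied from the recipe table.
RECIPES = {
    "feature": list(PIPELINE),
    "bugfix": [1, 5, 6, 8, 9, 10, 11, 12],
    "refactor": [1, 4, 5, 6, 8, 9, 10, 11, 12],
    "research": [1, 2, 3, 11, 12],
    "review": [1, 9, 11, 12],
    "infra": [1, 4, 5, 6, 8, 9, 11, 12],
}


def steps_for(recipe: str) -> list:
    """Return (number, name) pairs in the order the recipe runs them."""
    return [(n, PIPELINE[n]) for n in RECIPES[recipe]]


def skipped(recipe: str) -> list:
    """Return the names of pipeline steps the recipe leaves out."""
    used = set(RECIPES[recipe])
    return [name for n, name in PIPELINE.items() if n not in used]


print(steps_for("review"))
print(skipped("bugfix"))
```

Reading the table this way makes the trade-offs explicit: for example, `skipped("bugfix")` lists Brainstorm, Council Debate, Architecture, and Frontend.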

Slash Commands

| Command | Description |
|---|---|
| /BMB | Full 12-step pipeline; select a recipe interactively |
| /BMB-brainstorm | Brainstorm + Council only; explore ideas without executing |
| /BMB-refactoring | Refactor recipe shortcut; skip brainstorm, go straight to architecture |
| /BMB-setup | First-time project setup; generates session-prep.md and config |
| /BMB-status | Project/idea dashboard; stale idea nudges, lifecycle overview |

The 5 Agents

| Agent | Role | Model |
|---|---|---|
| Architect | System design, file tree, contracts. Queries Context7 for live library docs. | Claude |
| Tester | Test writing and execution | Claude |
| Verifier | Blind review with divergent spec framing | Claude |
| Writer | Documentation generation | Claude |
| Analyst | Retrospective analytics: Bird's Law severity classification, pattern_counts promotion candidates | Claude (bypassPermissions, read-only) |

The Lead agent orchestrates all pipeline steps that don't have a dedicated specialist agent.


Requirements

| Dependency | Required | Notes |
|---|---|---|
| Claude Code CLI | Yes | Core runtime |
| tmux | Yes | Agent session management |
| python3 | Yes | Script tooling |
| sqlite3 | Yes | FTS5 knowledge base |
| git | Yes | Worktree isolation |

Run bmb doctor after installation to verify all dependencies.
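If `bmb doctor` is not yet available, a similar check can be approximated by hand. The sketch below is not how `bmb doctor` is implemented; it only looks up the listed tools on `PATH` and asks SQLite whether FTS5 was compiled in. It assumes the Claude Code CLI binary is named `claude`, and the FTS5 check inspects the SQLite library bundled with Python, which may differ from the `sqlite3` command-line tool.

```python
from __future__ import annotations

import shutil
import sqlite3

# Binary names for the dependencies in the table ("claude" is an assumption).
REQUIRED_TOOLS = ["claude", "tmux", "python3", "sqlite3", "git"]


def find_tools(tools: list[str] = REQUIRED_TOOLS) -> dict[str, str | None]:
    """Map each tool to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def sqlite_has_fts5() -> bool:
    """Report whether Python's bundled SQLite was built with FTS5."""
    conn = sqlite3.connect(":memory:")
    try:
        options = {row[0] for row in conn.execute("PRAGMA compile_options")}
    finally:
        conn.close()
    return "ENABLE_FTS5" in options


for tool, path in find_tools().items():
    print(f"{tool:10} {'ok: ' + path if path else 'MISSING'}")
print(f"{'fts5':10} {'ok' if sqlite_has_fts5() else 'MISSING'}")
```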


Interactive Architecture Guide

Explore the full pipeline visually:

View Interactive Docs →

Mobile-optimized summary pages (7-card vertical scroll, 4 locales):

| Language | URL |
|---|---|
| English | m.html |
| 한국어 | m.ko.html |
| 日本語 | m.ja.html |
| 繁體中文 | m.zh-TW.html |

Project Structure

```
~/Projects/bmb/              # Source of truth (GitHub repo)
├── skills/bmb*/             # 5 slash command skills
├── agents/bmb-*.md          # 5 agent definitions
├── bmb-system/
│   ├── config/              # defaults.json (v2)
│   ├── scripts/             # bmb-config.sh, bmb-ideas.sh, bmb-analytics.sh, bmb-learn.sh,
│   │                        # bmb-external-incidents.sh, knowledge-index.sh, knowledge-search.sh
│   └── plans/               # Version release plans
└── docs/                    # Architecture, configuration, troubleshooting

~/.claude/                   # Runtime (symlinks to repo)
├── skills/bmb* → repo       # Symlinked skills
├── agents/ → repo           # Symlinked agents
└── bmb-system/ → repo       # Symlinked runtime

.bmb/                        # Per-project runtime directory
├── config.json              # Project-local config (merged from 3 layers)
├── analytics/
│   └── analytics.db         # SQLite: sessions, events, pattern_counts
├── handoffs/
│   └── analyst-report.md    # Step 10.5 output
└── sessions/{id}/
    ├── carry-forward.md     # Atomic session continuity
    └── plan-review.md       # Cross-model plan critique
```

What's New in v0.4.0

6-Feature Upgrade — cross-model fix, agent discipline, visual brainstorming, session continuity, parallel sessions, and Monitor watchdog.

| Capability | Description |
|---|---|
| OMX Cross-Model Fix | Replaced raw codex exec with an MCP-disabled invocation. Eliminates the 100% timeout rate caused by MCP server loading. |
| Superpowers Discipline | Verification gates, debugging discipline, TDD checklists, and YAGNI principles embedded directly in agent prompts. All agents upgraded to Opus 4.6 (1M context). |
| Visual Brainstorming | Browser-based visual companion for Step 2: mockups, architecture diagrams, and trade-off matrices via the Superpowers server. |
| Session-End Prep | Step 12 auto-generates next-session-plan.md with completed items, follow-ups, and a one-line start prompt. |
| Parallel Sessions | SESSION_MODE enum (standalone/sub/consolidation) for safe concurrent pipelines with track splitting and consolidation prompts. |
| Monitor Watchdog | Haiku Monitor enhanced with a pane sweep for orphaned processes and nudge escalation for stalled agents. |

Contributing

Contributions are welcome. Please read the Contributing Guide before submitting a PR.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Run the test suite (bmb doctor in your shell, then /BMB-setup inside Claude Code)
  4. Commit your changes
  5. Open a Pull Request

License

MIT — use it however you want.


Built with obstinate attention to correctness.

Report Bug · Request Feature · Discussions