Crucible

A Claude Code plugin that pits multiple AI agents against each other in adversarial debate. Agents independently propose solutions, self-critique, then attack and defend across adaptive rounds until they converge on the strongest answer. An arbiter scores every position and synthesizes the final result.


How It Works

                    ┌─────────┐   ┌─────────┐   ┌─────────┐
                    │ Agent 1 │   │ Agent 2 │   │ Agent N │
                    └────┬────┘   └────┬────┘   └────┬────┘
                         │             │              │
  Round 1                ▼             ▼              ▼
  Propose           ┌─────────────────────────────────────┐
                    │  Each agent proposes independently  │
                    └──────────────────┬──────────────────┘
                                       │
  Round 1.5                            ▼
  Self-Critique     ┌─────────────────────────────────────┐
                    │ Each agent attacks its own proposal │
                    │ and produces a hardened position    │
                    └──────────────────┬──────────────────┘
                                       │
                    ┌──────────────────────────────────────────┐
                    │            ADAPTIVE  LOOP                │
                    │                                          │
  Round K           │  ┌────────────────────────────────────┐  │
  Cross-Attack      │  │  Every agent attacks every other   │  │
                    │  │  N×(N-1) critiques in parallel     │  │
                    │  └─────────────────┬──────────────────┘  │
                    │                    │                     │
  Defend            │  ┌─────────────────▼──────────────────┐  │
                    │  │  Every agent defends and refines   │  │
                    │  └─────────────────┬──────────────────┘  │
                    │                    │                     │
  Converge?         │  ┌─────────────────▼──────────────────┐  │
                    │  │  CONVERGED ──────────► exit loop   │  │
                    │  │  DIVERGED ───► focus areas ► loop  │  │
                    │  │  Max rounds ─────────► exit loop   │  │
                    │  └────────────────────────────────────┘  │
                    └──────────────────┬───────────────────────┘
                                       │
                                       ▼
                    ┌─────────────────────────────────────┐
                    │               ARBITER               │
                    │                                     │
                    │  • Reads full transcript            │
                    │  • Scores all agents (/60)          │
                    │  • May send back for 1 more round   │
                    │  • Synthesizes final answer         │
                    └─────────────────────────────────────┘
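
The control flow in the diagram can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the plugin's actual implementation: `run_debate`, `step`, `has_converged`, and `fake_step` are hypothetical names, `step` stands in for a single model call, and the "focus areas" hand-off and the arbiter's optional extra round are left out for brevity.

```python
from itertools import permutations


def run_debate(agents, step, has_converged, max_rounds=5):
    """Drive the phases in the diagram. `step(phase, agent, context)` is a
    hypothetical stand-in for one model call and returns that agent's text."""
    # Round 1: every agent proposes independently.
    positions = {a: step("propose", a, None) for a in agents}
    # Round 1.5: every agent attacks its own proposal and hardens it.
    positions = {a: step("self_critique", a, positions[a]) for a in agents}

    rounds = []
    for k in range(1, max_rounds + 1):
        # Cross-attack: each ordered (attacker, target) pair, N*(N-1) in total.
        critiques = {
            (attacker, target): step("attack", attacker, positions[target])
            for attacker, target in permutations(agents, 2)
        }
        # Defend: each agent answers only the critiques aimed at it.
        positions = {
            a: step("defend", a, [c for (_, t), c in critiques.items() if t == a])
            for a in agents
        }
        rounds.append({"round": k, "critiques": critiques, "positions": dict(positions)})
        if has_converged(positions):  # CONVERGED: leave the loop early
            break
    # Arbiter: reads the whole transcript and produces the final answer.
    verdict = step("arbiter", "arbiter", rounds)
    return verdict, rounds


def fake_step(phase, agent, context):
    """Stub model call so the sketch runs without any API access."""
    return f"{agent} {phase}"


verdict, rounds = run_debate(["a1", "a2", "a3"], fake_step, lambda p: False, max_rounds=2)
print(len(rounds), len(rounds[0]["critiques"]))  # 2 rounds, 6 critiques each
```

Passing the model call in as a function keeps the loop testable with a stub, which is why the example runs without network access.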

By default, agents alternate between Opus and Sonnet so the debate has genuine model diversity: different models have different reasoning patterns and blind spots. Use --model to force all agents onto the same model. The arbiter always uses Opus.
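
As a rough illustration of the alternation described above, the sketch below assigns a model to each debater. The function name `assign_models` is hypothetical, and the assumption that the first agent gets Opus is ours; the README does not say which model leads the rotation.

```python
from itertools import cycle, islice


def assign_models(n_agents, forced_model=None, rotation=("opus", "sonnet")):
    """Return one model name per debater.

    With `forced_model` set (the --model flag), every debater uses it.
    Otherwise debaters alternate through `rotation`; starting with "opus"
    is an assumption, not something the README specifies.
    """
    if forced_model is not None:
        return [forced_model] * n_agents
    return list(islice(cycle(rotation), n_agents))


print(assign_models(3))            # ['opus', 'sonnet', 'opus']
print(assign_models(3, "sonnet"))  # ['sonnet', 'sonnet', 'sonnet']
```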

Install

In Claude Code, run:

/plugin marketplace add sharifli4/crucible
/plugin install crucible@crucible-marketplace

Requires Claude Code with a valid Anthropic API key.

Usage

/crucible [--agents N] [--model sonnet|opus|haiku] [--rounds N] <your task>
| Flag | Default | Description |
|---|---|---|
| --agents | 2 | Number of debaters (2-5) |
| --model | mixed | Force all debaters to one model (sonnet, opus, or haiku). Default: agents alternate between opus and sonnet for model diversity. |
| --rounds | 5 | Max cross-attack rounds (2-5, exits early on convergence) |
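
The flag defaults and ranges listed above can be expressed as a small `argparse` sketch. This is purely illustrative: `parse_crucible_args` is a hypothetical helper, and the plugin itself may interpret its arguments quite differently. It only shows which combinations the documented ranges allow.

```python
import argparse


def parse_crucible_args(command: str) -> argparse.Namespace:
    """Parse the text after `/crucible` using the documented flags.

    Hypothetical helper: ranges and defaults are copied from the flag list,
    but this is not how the plugin is known to parse its input.
    """
    parser = argparse.ArgumentParser(prog="/crucible")
    parser.add_argument("--agents", type=int, default=2, choices=range(2, 6))
    # "mixed" is the documented default; only the three model names are
    # accepted as explicit values here.
    parser.add_argument("--model", default="mixed", choices=["sonnet", "opus", "haiku"])
    parser.add_argument("--rounds", type=int, default=5, choices=range(2, 6))
    parser.add_argument("task", nargs="+")
    args = parser.parse_args(command.split())
    args.task = " ".join(args.task)  # rejoin the free-text task
    return args


args = parse_crucible_args("--agents 3 --model opus design a distributed consensus algorithm")
print(args.agents, args.model, args.rounds, args.task)
# 3 opus 5 design a distributed consensus algorithm
```

Out-of-range values such as `--agents 9` make `argparse` exit with a usage error.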

Cost note: Each cross-attack round runs N×(N-1) critiques in parallel. With 2 agents that's 2 critiques/round; with 5 agents it's 20 critiques/round. A full 5-agent, 5-round run can exceed 80 agent calls. Start with 2-3 agents for most tasks.
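
The arithmetic behind the cost note can be checked with a few lines of Python. The function names are illustrative, and the totals count cross-attack critiques only; proposals, self-critiques, defenses, and the arbiter pass come on top.

```python
def critiques_per_round(n_agents: int) -> int:
    """Every agent critiques every other agent: N * (N - 1)."""
    return n_agents * (n_agents - 1)


def max_critiques(n_agents: int, max_rounds: int) -> int:
    """Upper bound on critiques if the debate never converges early."""
    return critiques_per_round(n_agents) * max_rounds


for n in range(2, 6):
    print(n, critiques_per_round(n), max_critiques(n, 5))
# 2 2 10
# 3 6 30
# 4 12 60
# 5 20 100
```

The quadratic growth is the reason to start small: moving from 3 to 5 agents more than triples the critiques in each round.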

/crucible implement a thread-safe LRU cache in Python
/crucible --agents 3 should we use event sourcing or CRUD for a high-write system?
/crucible --agents 3 --model opus design a distributed consensus algorithm

Structure

crucible/
├── .claude-plugin/
│   ├── plugin.json
│   └── marketplace.json
├── commands/crucible.md
├── agents/
│   ├── debater.md
│   └── arbiter.md
└── README.md

License

MIT
