claude-council

A Claude Code plugin that consults multiple AI coding agents to get diverse perspectives on coding problems.

Features

Query Gemini, OpenAI (GPT/Codex), Grok, and Perplexity simultaneously
Side-by-side comparison of responses
Extensible provider system - add new AI agents easily
Proactive suggestions for architecture decisions and debugging

Installation

From Marketplace (Recommended)

# Add the hex-plugins marketplace
/plugin marketplace add hex/claude-marketplace

# Install claude-council
/plugin install claude-council

Direct from GitHub

/plugin install hex/claude-council

Manual

# Clone to your plugins directory
git clone https://github.com/hex/claude-council.git
claude --plugin-dir /path/to/claude-council

Configuration

API Keys

Set environment variables (recommended):

export GEMINI_API_KEY="your-key"
export OPENAI_API_KEY="your-key"
export GROK_API_KEY="your-key"
export PERPLEXITY_API_KEY="your-key"

Or create .claude/claude-council.local.md in your project:

---
providers:
  gemini:
    api_key: "your-key"
  openai:
    api_key: "your-key"
  grok:
    api_key: "your-key"
  perplexity:
    api_key: "your-key"
---

Model Selection

Override default models via environment variables:

export GEMINI_MODEL="gemini-3.1-pro-preview"       # default
export OPENAI_MODEL="gpt-5.4"                       # default
export GROK_MODEL="grok-4"                          # default
export PERPLEXITY_MODEL="sonar-reasoning-pro"       # default (reasoning + search)

Use more powerful models for complex queries:

export GEMINI_MODEL="gemini-3.1-pro-preview"
export OPENAI_MODEL="gpt-5.4"
export GROK_MODEL="grok-4"
export PERPLEXITY_MODEL="sonar-reasoning-pro"      # with chain-of-thought

Response Length

Control max tokens per response (default: 2048):

export COUNCIL_MAX_TOKENS=4096  # longer responses
export COUNCIL_MAX_TOKENS=1024  # shorter, faster responses

OpenAI Reasoning Models

For OpenAI reasoning models (codex-*, *-codex, o3-*, o4-*, gpt-5.[4-9]*), the token limit is automatically increased to 8x the base value (minimum 32768). This is because these models combine reasoning tokens and output tokens into a single max_output_tokens limit.

Model Type	COUNCIL_MAX_TOKENS	Actual Limit
Standard (gpt-4o)	2048 (default)	2048
Reasoning (gpt-5.4)	2048 (default)	32768
Reasoning (gpt-5.4)	4096	32768

Control reasoning effort to balance speed vs thoroughness:

export OPENAI_REASONING_EFFORT="low"     # faster, less reasoning overhead
export OPENAI_REASONING_EFFORT="medium"  # default - balanced
export OPENAI_REASONING_EFFORT="high"    # thorough reasoning, slower

Gemini and Grok handle reasoning/thinking tokens separately, so they use the base limit directly.

Perplexity Search Features

Perplexity's sonar models are search-augmented, providing web-grounded responses with citations:

# Filter search results by recency: day, week, month, year
export PERPLEXITY_RECENCY="week"

Available models:

sonar - Fast, search-enabled
sonar-pro - More capable, search-enabled
sonar-reasoning - Chain-of-thought reasoning + search
sonar-reasoning-pro - Best reasoning + search (default)

Perplexity is useful when you need current information (latest framework versions, recent best practices) rather than just training-data knowledge.

Retry & Timeout Configuration

Automatic retry on transient failures (429 rate limits, 5xx server errors):

export COUNCIL_MAX_RETRIES=3    # default: 3 retries
export COUNCIL_RETRY_DELAY=1    # default: 1 second initial delay (doubles each retry)
export COUNCIL_TIMEOUT=60       # default: 60 seconds per request

Timeouts fail fast (no retry) to prevent blocking on hung providers.

Usage

Quick Reference

Flag	Description
`--providers=list`	Query specific providers (e.g., `gemini,openai`)
`--roles=list`	Assign roles to providers (e.g., `security,performance` or `balanced`)
`--debate`	Enable two-round debate mode
`--file=path`	Include specific file in context
`--output=path`	Export response to markdown file
`--quiet`	Show only synthesis, hide individual responses
`--agents`	Agent-enhanced analysis with subagents (slower, deeper)
`--no-cache`	Force fresh queries, skip cache
`--no-auto-context`	Disable automatic file detection

Slash Command

# Query all configured providers
/claude-council:ask "How should I structure authentication in this Express app?"

# Query specific providers
/claude-council:ask --providers=gemini,openai "What's the best approach for caching here?"

# Include a specific file for review
/claude-council:ask --file=src/auth.ts "What's wrong with this implementation?"

# Export response to markdown file
/claude-council:ask --output=docs/auth-decision.md "How should we implement authentication?"

# Quiet mode - show only synthesis
/claude-council:ask --quiet "What's the best caching strategy?"

Export to File

Save council responses as clean markdown files for documentation or sharing:

/claude-council:ask --output=docs/decision.md "Should we use REST or GraphQL?"

The exported file includes:

Metadata header (query, date, providers)
Each provider's full response
Synthesis with consensus/divergence analysis

Great for:

Documenting architectural decisions
Sharing with team members who aren't using Claude
Creating an audit trail of AI-assisted decisions

Quiet Mode

Get just the bottom line without individual provider responses:

/claude-council:ask --quiet "Should I use Redis or Memcached?"

Quiet mode still queries all providers and analyzes their responses, but only shows the synthesis with consensus/divergence analysis. Use when you want a quick answer without scrolling through multiple perspectives.

Response Caching

Responses are automatically cached to speed up repeated queries and save API costs:

# Uses cache if available (default)
/claude-council:ask "What's the best testing framework?"

# Force fresh queries, skip cache
/claude-council:ask --no-cache "What's the best testing framework?"

Cache configuration:

export COUNCIL_CACHE_DIR=".claude/council-cache"  # Cache location (default)
export COUNCIL_CACHE_TTL=3600                      # Cache lifetime in seconds (default: 1 hour)

Cached responses show cached instead of success in the status output. Cache is keyed by prompt + provider + model + role, so:

Changing models invalidates the cache
Using --roles creates separate cache entries (same prompt with different role = cache miss)
Debate mode round 2 rebuttals are never cached (they depend on round 1 content)

Auto-Context Injection

The council automatically detects and includes relevant files based on your question:

/claude-council:ask "How should I refactor the authentication flow?"
# Auto-detects and includes: src/auth/*.ts, middleware/auth.ts, etc.

Before querying, you'll see which files were auto-included:

Auto-included context (3 files):
  - src/auth/handler.ts (keyword: "auth")
  - middleware/session.ts (keyword: "session")
  - types/user.ts (keyword: "user")

To disable auto-context (for general questions not about your code):

/claude-council:ask --no-auto-context "What are best practices for API design?"

Auto-context limits:

Maximum 5 files included
Maximum ~10,000 tokens of context
Skipped if you provide --file= explicitly

Specialized Roles

Assign different perspectives to each provider for more comprehensive reviews:

# Use specific roles
/claude-council:ask --roles=security,performance,maintainability "Review this auth code"

# Use a preset
/claude-council:ask --roles=balanced "Review this implementation"

Available roles:

security - Security Auditor (vulnerabilities, OWASP Top 10)
performance - Performance Optimizer (efficiency, bottlenecks)
maintainability - Maintainability Advocate (clarity, future changes)
devil - Devil's Advocate (challenges assumptions)
simplicity - Simplicity Champion (identifies over-engineering)
scalability - Scalability Architect (growth, scaling)
dx - Developer Experience (API ergonomics)
compliance - Compliance Officer (GDPR, regulations)

Presets:

balanced - security, performance, maintainability
security-focused - security, devil, compliance
architecture - scalability, maintainability, simplicity
review - security, maintainability, dx

Roles are assigned to providers in order, ensuring each provider approaches the question from a different angle.

Debate Mode

Enable multi-round discussions where providers critique each other:

/claude-council:ask --debate "How should I structure the database schema?"

How it works:

Round 1: All providers answer the question normally
Round 2: Each provider sees the others' responses and provides rebuttals
Synthesis: Incorporates debate insights, consensus shifts, and unresolved tensions

Debate mode surfaces blind spots and stress-tests recommendations. The synthesis includes:

Strongest criticisms raised
Where providers changed positions after seeing alternatives
Genuine disagreements that remain

Combine with roles for focused debates:

/claude-council:ask --debate --roles=security,performance,simplicity "Review this architecture"

Agent-Enhanced Analysis (--agents)

For complex decisions where deeper analysis justifies the extra time and cost, --agents spawns parallel Claude subagents that each independently query, evaluate, and analyze their provider's response before the orchestrator synthesizes everything.

# Explicit flag
/claude-council:ask --agents "Should we migrate from REST to GraphQL? What are the tradeoffs?"

# Combine with other flags
/claude-council:ask --agents --roles=security,scalability --providers=gemini,openai "Review this auth architecture"

What each subagent does (beyond a simple API call):

Queries the provider
Evaluates response quality - did it actually address the question?
If the response is vague or off-topic, reformulates and retries
Asks follow-up questions to surface deeper insights
Extracts structured analysis: key recommendations, confidence level, blind spots

Enhanced synthesis includes:

Confidence-weighted consensus (high-confidence agreement weighted more)
Cross-provider blind spot analysis
Divergence with context (why providers disagree)

Natural language triggers: The command also detects complex questions automatically. If your question contains architecture, security review, tradeoff analysis, or similar signals, you'll be asked whether to enable agent mode.

Cost and performance implications: Agent mode spawns one Claude subagent per provider. This means ~4x more Claude API usage and ~15-25 seconds additional latency compared to standard mode. Use it for high-stakes decisions, not quick questions.

	Standard (default)	Agent-enhanced (--agents)
Speed	~3-5s	~15-25s
Claude API cost	1 context	1 + N providers
Provider API cost	Same	Same
Analysis depth	Raw responses + synthesis	Pre-analyzed + enhanced synthesis
Best for	Quick questions, factual queries	Architecture decisions, security reviews, complex tradeoffs

Proactive Agent

The council-advisor agent will suggest consulting the council when:

Discussing architecture or design decisions
Stuck debugging after multiple failed attempts

Adding New Providers

See the provider-integration skill for guidance on adding new AI providers.

Direct Script Usage

Use the scripts directly without Claude Code (for automation, CI, or debugging):

# Basic query - returns JSON
bash scripts/query-council.sh -- "What is dependency injection?"

# With flags
bash scripts/query-council.sh --providers=gemini,openai --roles=balanced -- "Review this pattern"

# Pipe to formatter for terminal display
bash scripts/query-council.sh --providers=gemini -- "Question" 2>/dev/null | bash scripts/format-output.sh

# Check provider status
bash scripts/check-status.sh

JSON output structure:

{
  "metadata": {
    "prompt": "...",
    "roles_used": ["security", "performance"],
    "debate_mode": false,
    "quiet_mode": false,
    "timestamp": "2025-12-18T12:00:00Z"
  },
  "round1": {
    "gemini": { "status": "success", "response": "...", "model": "...", "role": "security" },
    "openai": { "status": "success", "response": "...", "model": "...", "role": "performance" }
  },
  "round2": { ... }  // Only present if --debate
}

Requirements

curl and jq for API calls
Valid API keys for desired providers

Development

Architecture

See docs/ARCHITECTURE.md for system design, data flow diagrams, and component details.

Versioning

Bump version in .claude-plugin/plugin.json on every release:

{
  "version": "YYYY.M.PATCH"
}

Format: YYYY.M.PATCH where PATCH increments with each release. Examples: 2026.1.1, 2026.1.2, 2026.2.1.

Always bump version when:

Changing command behavior
Fixing bugs
Updating formatting/output
Any change users should pull

Testing

# Run automated tests (requires bats-core)
./tests/run_tests.sh

# Run specific test suite
bats tests/cache.bats
bats tests/roles.bats

See TESTING.md for complete test documentation including manual test procedures.

Release Checklist

Make changes
Test locally
Bump version in .claude-plugin/plugin.json
Commit and push
Users update via /plugin update claude-council

Name		Name	Last commit message	Last commit date
Latest commit History 101 Commits
.claude-plugin		.claude-plugin
.cs		.cs
agents		agents
claude-council-workspace		claude-council-workspace
commands		commands
config		config
docs		docs
scripts		scripts
skills		skills
tests		tests
.gitignore		.gitignore
AGENTS.md		AGENTS.md
LICENSE		LICENSE
README.md		README.md
TESTING.md		TESTING.md

Folders and files

Latest commit

History

Repository files navigation

claude-council

Features

Installation

From Marketplace (Recommended)

Direct from GitHub

Manual

Configuration

API Keys

Model Selection

Response Length

OpenAI Reasoning Models

Perplexity Search Features

Retry & Timeout Configuration

Usage

Quick Reference

Slash Command

Export to File

Quiet Mode

Response Caching

Auto-Context Injection

Specialized Roles

Debate Mode

Agent-Enhanced Analysis (--agents)

Proactive Agent

Adding New Providers

Direct Script Usage

Requirements

Development

Architecture

Versioning

Testing

Release Checklist

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 5

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages