An intelligent wrapper around Claude Agent SDK that provides automated problem-solving capabilities with session management, progress monitoring, and intelligent feedback loops.
- Automated Problem Solving: Describe a problem to Claude Code and get a complete solution
- Session Management: Maintains context across multiple iterations with intelligent workflow orchestration
- Progress Monitoring: Real-time tracking of Claude's progress via todo list updates and output analysis
- Intelligent Feedback Loop: LLM-powered guidance generation that analyzes Claude's work and provides specific, actionable feedback when issues arise
- Custom Tools Support (v1.1.0): Extend capabilities by providing custom tools to the coding agent, the supervisor, or both
- Plan Mode: Intelligent plan generation, review, and iterative refinement before execution (inspired by validation patterns)
- Custom Prompt Overrides: Separate customizable prompts for execution and plan mode instructions
- Dual Graph Architecture: Independent plan graph and execution graph for clean separation of concerns
- Data I/O Support: Handles various data formats (lists, dicts, CSV, DataFrames, etc.)
- Custom Prompts: Guide implementation toward specific patterns or requirements
- Test Automation: Automatically generates and runs tests for solutions
- Multiple Providers: Support for Anthropic, AWS Bedrock, and OpenAI
Claude Code Supervisor uses a two-agent architecture in which the agents work together to solve coding problems:

1. Coding Agent (Claude Code)
   - Autonomous coding assistant that writes, edits, and tests code
   - Has access to tools: Read, Write, Edit, Bash, Grep, etc.
   - Can plan its own work using todo lists
   - Executes the actual implementation

2. Supervisor Agent (LLM)
   - Orchestrates the coding process
   - Analyzes the coding agent's work
   - Generates intelligent feedback when issues arise
   - Decides when the solution is complete or needs refinement
```
┌────────────────────────────────────────────────────────────┐
│                   SUPERVISOR AGENT (LLM)                   │
│  • Receives problem description from user                  │
│  • Generates instructions for coding agent                 │
│  • Reviews results and provides feedback                   │
└─────────────────┬──────────────────────────────────────────┘
                  │ Instructions
                  ▼
┌────────────────────────────────────────────────────────────┐
│                 CODING AGENT (Claude Code)                 │
│  • Reads the instructions                                  │
│  • Plans work with todo lists                              │
│  • Writes/edits code using tools                           │
│  • Runs tests to verify solution                           │
│  • Reports results back                                    │
└─────────────────┬──────────────────────────────────────────┘
                  │ Results (code, tests, logs)
                  ▼
┌────────────────────────────────────────────────────────────┐
│                   SUPERVISOR AGENT (LLM)                   │
│  • Analyzes results and test outcomes                      │
│  • If successful: Done! ✓                                  │
│  • If failed: Generate specific feedback → repeat cycle    │
└────────────────────────────────────────────────────────────┘
```
- Separation of Concerns: Coding agent focuses on implementation, supervisor focuses on strategy
- Intelligent Feedback: Supervisor provides targeted guidance instead of generic retries
- Autonomous Operation: No human intervention needed during iterations
- Cost Effective: Use expensive models (GPT-4) for supervisor, efficient models for coding
- Extensible: Add custom tools to either agent independently
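The cycle above can be condensed into a small control loop. This is an illustrative sketch, not the library's actual implementation; `coding_agent` and `reviewer` are hypothetical callables standing in for the two agents:

```python
def supervise(problem, coding_agent, reviewer, max_iterations=3):
    """Illustrative supervisor loop: attempt, review, feed back, repeat."""
    feedback = None
    result = None
    for _ in range(max_iterations):
        # The coding agent attempts the task, optionally guided by feedback
        result = coding_agent(problem, feedback)
        # The supervisor LLM reviews the attempt
        verdict = reviewer(result)
        if verdict['solved']:
            return result
        # On failure, targeted feedback drives the next iteration
        feedback = verdict['feedback']
    return result
```

The key point is that each retry carries specific feedback, not a blind re-run of the same prompt.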
# User provides problem
agent = FeedbackSupervisorAgent()
result = agent.process("Create a sorting algorithm with tests")
# Behind the scenes:
# 1. Supervisor → Coding Agent: "Create a sorting function with comprehensive tests"
# 2. Coding Agent: Writes sort.py and test_sort.py, runs tests
# 3. Coding Agent → Supervisor: "Tests failing: IndexError on line 15"
# 4. Supervisor analyzes: "Array bounds issue detected"
# 5. Supervisor → Coding Agent: "Fix the index bounds check in line 15"
# 6. Coding Agent: Fixes the bug, re-runs tests
# 7. Coding Agent → Supervisor: "All tests passing!"
# 8. Supervisor: "Solution complete!" ✓

Install from PyPI:

pip install claude-code-supervisor

Or install from source:

git clone https://github.com/vinyluis/claude-code-supervisor.git
cd claude-code-supervisor
pip install -e .

Prerequisites:

- Claude Code CLI:

  npm install -g @anthropic-ai/claude-code

- API key for the coding agent (choose one):

  # Anthropic (default)
  export ANTHROPIC_API_KEY=<YOUR_ANTHROPIC_API_KEY>

  # AWS Bedrock
  export AWS_ACCESS_KEY_ID=<YOUR_AWS_ACCESS_KEY_ID>
  export AWS_SECRET_ACCESS_KEY=<YOUR_AWS_SECRET_ACCESS_KEY>
  export AWS_REGION=<AWS_REGION>

- LLM API key for guidance (choose one):

  # OpenAI (default)
  export OPENAI_API_KEY="your-openai-api-key"
  # Configure supervisor_config.json with "provider": "openai"

  # AWS Bedrock (for the guidance LLM)
  # Uses the AWS access keys above
  # Configure supervisor_config.json with "provider": "bedrock"
from claude_code_supervisor import SingleShotSupervisorAgent
# Initialize the agent
agent = SingleShotSupervisorAgent()
# Solve a problem
result = agent.process(
"Create a function to calculate fibonacci numbers",
solution_path='solution.py',
test_path='test_solution.py'
)
if result.is_solved:
  print(f"Solution: {agent.solution_path}")
  print(f"Tests: {agent.test_path}")

Claude Code Supervisor provides two main supervisor types for different use cases:
Iterative supervisor with intelligent feedback loops - continues refining solutions until success or max iterations:
from claude_code_supervisor import FeedbackSupervisorAgent
agent = FeedbackSupervisorAgent()
result = agent.process("Create a complex sorting algorithm")
# Will iterate with intelligent feedback until solved

Best for:
- Complex problems requiring multiple iterations
- Maximum solution quality with automated improvement
- Problems where first attempts commonly fail
- When you want intelligent error analysis and guidance
Single-execution supervisor without iteration - fast, deterministic results:
from claude_code_supervisor import SingleShotSupervisorAgent
agent = SingleShotSupervisorAgent()
result = agent.process("Create a simple utility function")
# Executes once and reports results

Best for:
- Simple problems that don't require iteration
- Fast code generation and testing
- When iteration is handled externally
- Benchmarking Claude Code capabilities
# Process data with input/output examples
result = agent.process(
"Sort this list in ascending order",
input_data=[64, 34, 25, 12, 22, 11, 90, 5],
output_data=[5, 11, 12, 22, 25, 34, 64, 90]
)

# Guide implementation style
agent = FeedbackSupervisorAgent(
append_system_prompt="Use object-oriented programming with SOLID principles"
)
result = agent.process("Create a calculator with basic operations")
# Custom prompts for both execution and plan mode
result = agent.process(
"Create a data processing pipeline",
instruction_prompt="Use functional programming with immutable data structures",
plan_mode_instruction_prompt="Focus on scalability and performance optimization",
enable_plan_mode=True
)

# Use your own LangChain LLM for guidance
from langchain_openai import ChatOpenAI
custom_llm = ChatOpenAI(model='gpt-4o-mini', temperature=0.2)
agent = FeedbackSupervisorAgent(llm=custom_llm)
result = agent.process("Create a data processing function")

# Pass configuration as a type-safe dataclass
from claude_code_supervisor import FeedbackSupervisorAgent
from claude_code_supervisor.config import openai_config
config = openai_config(model_name='gpt-4o-mini', temperature=0.1)
config.agent.max_iterations = 3
config.claude_code.max_turns = 25
agent = FeedbackSupervisorAgent(config=config)
result = agent.process(
"Create a web scraper function",
solution_path='scraper.py',
test_path='test_scraper.py'
)

# Enable plan mode with intelligent review and refinement
from claude_code_supervisor import FeedbackSupervisorAgent
from claude_code_supervisor.config import plan_mode_config
config = plan_mode_config(
max_plan_iterations=3,
plan_auto_approval_threshold=0.8,
plan_review_enabled=True
)
agent = FeedbackSupervisorAgent(config=config)
result = agent.process(
"Create a comprehensive calculator module with advanced operations",
solution_path='calculator.py',
test_path='test_calculator.py',
enable_plan_mode=True # Enable plan mode for this specific task
)
# Alternative: Pass enable_plan_mode directly without config changes
agent = FeedbackSupervisorAgent()
result = agent.process(
"Create a web scraper with error handling",
enable_plan_mode=True,
plan_mode_instruction_prompt="Focus on robustness and rate limiting"
)
# Plan mode workflow:
# 1. Generates execution plan using Claude Code's plan mode
# 2. LLM reviews plan and scores quality (0.0-1.0)
# 3. Iteratively refines plan based on feedback (if needed)
# 4. Executes approved plan with full implementation

Extend supervisor capabilities by providing custom tools to either the coding agent (Claude Code) or the supervisor LLM, or both:
from claude_agent_sdk import tool
from claude_code_supervisor import FeedbackSupervisorAgent
from claude_code_supervisor.config import SupervisorConfig, ClaudeCodeConfig
# Define a custom tool using @tool decorator
@tool(
name='validate_code_style',
description='Validates code follows project style guidelines',
input_schema={'code': str, 'language': str}
)
async def validate_code_style(args: dict) -> dict:
  '''Custom validation logic for your project'''
  code = args['code']
  language = args.get('language', 'python')
  # Your validation logic here
  is_valid = True  # ... validation code ...
  return {
    'content': [{
      'type': 'text',
      'text': f'Code style is {"valid" if is_valid else "invalid"}'
    }]
  }
# Scenario 1: Tools for coding agent only (extends Claude Code capabilities)
config = SupervisorConfig(
claude_code=ClaudeCodeConfig(
coding_agent_tools=[validate_code_style]
)
)
agent = FeedbackSupervisorAgent(config=config)
result = agent.process('Create a Python module with proper style')
# Scenario 2: Tools for both agents (shared capabilities)
@tool('analyze_complexity', 'Analyze code complexity', {'code': str})
async def analyze_complexity(args: dict) -> dict:
  # Complexity analysis logic
  return {'content': [{'type': 'text', 'text': 'Complexity: O(n)'}]}
config = SupervisorConfig(
claude_code=ClaudeCodeConfig(
shared_tools=[analyze_complexity] # Available to both supervisor and coding agent
)
)
agent = FeedbackSupervisorAgent(config=config)
# Scenario 3: Separate tools for each agent
from langchain_core.tools import tool as langchain_tool
# Define a LangChain tool for supervisor
@langchain_tool
def review_architecture(code: str) -> str:
  '''Reviews code architecture and provides feedback'''
  # Supervisor-specific analysis
  return 'Architecture looks good'
config = SupervisorConfig(
claude_code=ClaudeCodeConfig(
coding_agent_tools=[validate_code_style], # SDK tools for Claude Code
supervisor_agent_tools=[review_architecture], # LangChain tools for supervisor LLM
shared_tools=[analyze_complexity] # SDK tools for Claude Code (both agents coming soon)
)
)

Tool Types:

- coding_agent_tools: SDK tools (using the @tool decorator from claude_agent_sdk) for the Claude Code agent
- supervisor_agent_tools: LangChain tools (using @tool from langchain_core.tools) for the supervisor LLM
- shared_tools: SDK tools for the Claude Code agent (supervisor support coming in a future release)
Benefits of Custom Tools:
- Extend capabilities per project without modifying supervisor code
- Reusable tools across different projects
- Type-safe tool definitions with appropriate decorators
- Clean separation: tools can target specific agents
- Full access to SDK's MCP features for coding agent
- LangChain's tool ecosystem for supervisor LLM
Use Cases:
- Project-specific validation (coding style, architecture patterns)
- Domain-specific analysis (database queries, API contracts)
- Integration with external services (linters, formatters, CI/CD)
- Custom test frameworks or validation logic
- Data processing and transformation utilities
See SDK_UPGRADE_PLAN.md Appendix for more detailed examples.
# Use structured, type-safe configuration with dataclasses
from claude_code_supervisor import FeedbackSupervisorAgent
from claude_code_supervisor.config import (
SupervisorConfig, AgentConfig, ClaudeCodeConfig,
development_config, openai_config, bedrock_config
)
# Method 1: Use convenience functions
config = development_config() # Pre-configured for development
agent = FeedbackSupervisorAgent(config=config)
# Method 2: Use builder functions with customization
config = openai_config(model_name='gpt-4o-mini', temperature=0.2)
config.agent.max_iterations = 5
agent = FeedbackSupervisorAgent(config=config)
# Method 3: Build from scratch with type safety
config = SupervisorConfig(
agent=AgentConfig(
model_name='gpt-4o',
temperature=0.1,
provider='openai',
max_iterations=3,
test_timeout=60
),
claude_code=ClaudeCodeConfig(
max_turns=20,
use_bedrock=False,
tools=['Read', 'Write', 'Edit', 'Bash', 'TodoWrite'] # Custom tool set
)
)
agent = FeedbackSupervisorAgent(config=config)
result = agent.process(
"Create a validation function",
solution_path='validator.py',
test_path='test_validator.py'
)

# Use dataclass config + custom LLM together
from langchain_aws import ChatBedrockConverse
from claude_code_supervisor import FeedbackSupervisorAgent
from claude_code_supervisor.config import SupervisorConfig, AgentConfig
# Custom LLM for guidance
guidance_llm = ChatBedrockConverse(
model='anthropic.claude-3-haiku-20240307-v1:0',
temperature=0.1,
)
# Type-safe configuration (model settings in custom LLM are ignored when llm is provided)
config = SupervisorConfig(
agent=AgentConfig(max_iterations=2, test_timeout=45)
)
agent = FeedbackSupervisorAgent(config=config, llm=guidance_llm)
result = agent.process(
"Create a file parser",
solution_path='parser.py',
test_path='test_parser.py'
)

The supervisor supports various data formats:

- Lists: [1, 2, 3, 4]
- Dictionaries: {"name": "Alice", "age": 30}
- Pandas DataFrames: for data analysis tasks
- NumPy Arrays: for numerical computations
- Strings: for text processing tasks
- CSV Data: for business logic and data processing
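For illustration, all of these input formats boil down to a prompt-friendly text form before the coding agent sees them. This is a sketch of the general idea; `serialize_input` is a hypothetical helper, not part of the library:

```python
import json

def serialize_input(data) -> str:
    """Hypothetical helper: render supported input formats as prompt text."""
    # DataFrames expose to_csv, numpy arrays expose tolist;
    # duck-typing avoids importing pandas/numpy here
    if hasattr(data, 'to_csv'):   # pandas DataFrame
        return data.to_csv(index=False)
    if hasattr(data, 'tolist'):   # numpy array
        return json.dumps(data.tolist())
    if isinstance(data, (list, dict)):
        return json.dumps(data)
    return str(data)              # plain strings and everything else
```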
Check out the examples directory for detailed usage examples:

- Basic Usage (basic_usage.py): simple problem solving without I/O
- Plan Mode (plan_mode_example.py): intelligent planning with review and refinement
- Data Processing:
  - list_sorting_example.py: working with lists and numbers
  - dictionary_processing_example.py: processing employee dictionaries
  - csv_processing_example.py: complex inventory data processing
- Custom Prompts:
  - oop_prompt_example.py: object-oriented programming patterns
  - performance_prompt_example.py: performance-optimized implementations
  - data_science_prompt_example.py: data science best practices with pandas
SupervisorAgent uses type-safe dataclass configuration for better IDE support and validation:
from claude_code_supervisor import FeedbackSupervisorAgent
from claude_code_supervisor.config import openai_config, bedrock_config
# OpenAI configuration
config = openai_config(model_name='gpt-4o-mini', temperature=0.2)
agent = FeedbackSupervisorAgent(config=config)
# AWS Bedrock configuration
config = bedrock_config(
model_name='anthropic.claude-3-haiku-20240307-v1:0',
)
agent = FeedbackSupervisorAgent(config=config)

from claude_code_supervisor import FeedbackSupervisorAgent
from claude_code_supervisor.config import SupervisorConfig, AgentConfig, ClaudeCodeConfig
# Build custom configuration
config = SupervisorConfig(
agent=AgentConfig(
model_name='gpt-4o',
temperature=0.1,
provider='openai',
max_iterations=5,
test_timeout=60
),
claude_code=ClaudeCodeConfig(
max_turns=25,
use_bedrock=False,
max_thinking_tokens=8000
)
)
agent = FeedbackSupervisorAgent(config=config)

from claude_code_supervisor import FeedbackSupervisorAgent
from claude_code_supervisor.config import (
development_config, production_config,
plan_mode_config, plan_mode_development_config
)
# Development environment (uses gpt-4o-mini, higher iterations)
dev_config = development_config()
dev_agent = FeedbackSupervisorAgent(config=dev_config)
# Production environment (uses gpt-4o, optimized settings)
prod_config = production_config()
prod_agent = FeedbackSupervisorAgent(config=prod_config)
# Plan mode configurations
# Thorough plan review for complex tasks
plan_config = plan_mode_config(
max_plan_iterations=5,
plan_auto_approval_threshold=0.9
)
plan_agent = FeedbackSupervisorAgent(config=plan_config)
# Plan mode optimized for development
dev_plan_config = plan_mode_development_config()
dev_plan_agent = FeedbackSupervisorAgent(config=dev_plan_config)

Claude Code has access to various tools. By default, all tools are enabled, but you can customize which tools are available:
from claude_code_supervisor import FeedbackSupervisorAgent
from claude_code_supervisor.config import SupervisorConfig, ClaudeCodeConfig
from claude_code_supervisor.utils import ToolsEnum
# All tools (default)
config = SupervisorConfig(
claude_code=ClaudeCodeConfig(tools=ToolsEnum.all())
)
# Custom tool set
config = SupervisorConfig(
claude_code=ClaudeCodeConfig(
tools=['Read', 'Write', 'Edit', 'Bash', 'TodoWrite', 'LS', 'Grep']
)
)
# Minimal tools for simple tasks
from claude_code_supervisor.config import minimal_tools_config
config = minimal_tools_config()
# Notebook-focused tools
from claude_code_supervisor.config import notebook_config
config = notebook_config()

Available Tools:

- Read, Write, Edit, MultiEdit: file operations
- Bash: command execution
- LS, Glob, Grep: file system navigation and search
- TodoWrite: task management
- NotebookRead, NotebookEdit: Jupyter notebook support
- WebFetch, WebSearch: web access
- Agent: delegate tasks to other agents
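A custom tool set can be sanity-checked against the names above before building the config. This is illustrative only; the `ALL_TOOLS` set here is transcribed from this README rather than imported from the library (which exposes its own canonical list via ToolsEnum.all()):

```python
# Tool names as listed in this README (illustrative transcription)
ALL_TOOLS = {
    'Read', 'Write', 'Edit', 'MultiEdit', 'Bash', 'LS', 'Glob', 'Grep',
    'TodoWrite', 'NotebookRead', 'NotebookEdit', 'WebFetch', 'WebSearch',
    'Agent',
}

def check_tool_set(tools: list[str]) -> list[str]:
    """Return any tool names that are not recognized."""
    return [t for t in tools if t not in ALL_TOOLS]
```

Catching a typo like 'Browse' here is cheaper than debugging a silently degraded agent later.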
The supervisor now uses two independent LangGraph workflows:
- Plan Graph: Handles plan generation, review, and refinement (when plan mode is enabled)
- Execution Graph: Handles the main implementation workflow
This separation provides cleaner architecture, better error isolation, and more focused workflows.
The _claude_run method now supports both PlanState and WorkflowState directly, eliminating unnecessary state conversions and improving performance.
Core utility functions are organized in claude_code_supervisor.utils:
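For illustration, such a text check might amount to simple marker matching. This is an assumed sketch under a hypothetical name; the library's actual `is_quota_error` logic may differ:

```python
# Assumed sketch of a quota/credit error check; markers are guesses,
# not the library's actual list.
QUOTA_MARKERS = ('quota', 'rate limit', 'credit balance', 'billing')

def looks_like_quota_error(text: str) -> bool:
    """Heuristically detect API quota/credit errors in an error message."""
    lowered = text.lower()
    return any(marker in lowered for marker in QUOTA_MARKERS)
```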
from claude_code_supervisor.utils import is_quota_error, node_encountered_quota_error

# Check for API quota/credit errors in text
if is_quota_error(error_message):
  print("API quota exceeded")

# Check if workflow state indicates quota errors
if node_encountered_quota_error(workflow_state):
  print("Node encountered quota error")

Run the test suite:
# Run all tests
pytest
# Run with coverage
pytest --cov=claude_code_supervisor
# Run specific test categories
pytest -m "unit"
pytest -m "integration"

We welcome contributions! Please see our Contributing Guidelines for details.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Claude Agent SDK for the core Claude Code integration
- LangGraph for workflow orchestration
- LangChain for LLM integrations
For detailed usage examples, see the examples directory and the configuration examples above.
Found a bug? Have a feature request? Please open an issue.
Made with ❤️ by Vinícius Trevisan