SwiftSolve is a multi-agent framework that synthesizes functionally correct and computationally efficient C++ code from natural language problem statements. Unlike traditional code generation systems that focus solely on correctness, SwiftSolve co-optimizes for both functional correctness and Big-O efficiency through an iterative feedback loop between planning, coding, profiling, and analysis agents.
- Python 3.12+ (required for modern type hints and performance)
- API Keys: OpenAI GPT-4 and Anthropic Claude API access
```bash
# Clone the repository, then enter it
cd swiftsolve

# Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set required environment variables
export OPENAI_API_KEY="your_openai_api_key_here"
export ANTHROPIC_API_KEY="your_anthropic_api_key_here"

# Set Python path for imports
export PYTHONPATH="${PWD}/src"

# Test basic functionality with a simple problem
python src/swiftsolve/main.py --task_json src/swiftsolve/test.json
```

SwiftSolve employs a multi-agent pipeline that iteratively refines code solutions:
```mermaid
graph TD
    A[Natural Language Problem] --> B[Planner Agent<br/>Claude 4 Opus]
    B --> C[Static Pruner<br/>Efficiency Heuristics]
    C --> D[Coder Agent<br/>GPT-4.1]
    D --> E[Profiler<br/>Sandbox]
    E --> F[Analyst Agent<br/>Complexity Analysis]
    F --> G{Solution<br/>Acceptable?}
    G -->|No| H[Feedback Loop]
    H --> D
    G -->|Yes| I[Final Solution<br/>+ Performance Profile]
```
- Planner Agent (Claude 4 Opus): Converts natural language into structured algorithmic plans
- Static Pruner: Filters out obviously inefficient approaches using regex and AST analysis
- Coder Agent (GPT-4.1): Generates C++ code from approved plans
- Profiler: Compiles and benchmarks code with real performance measurements
- Analyst Agent: Evaluates efficiency using heuristics and LLM fallback for complex cases
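The loop these agents form can be sketched as follows. This is an illustrative sketch only: the function names, agent signatures, and the `max_replans` budget are assumptions, not SwiftSolve's actual API.

```python
# Hypothetical sketch of the plan -> code -> profile -> analyze loop;
# names and signatures are illustrative, not the real SwiftSolve interfaces.
from dataclasses import dataclass

@dataclass
class Verdict:
    accepted: bool
    feedback: str

def solve(problem, planner, coder, profiler, analyst, max_replans=3):
    plan = planner(problem)
    feedback = None
    for _ in range(max_replans):
        code = profilee = None
        code = coder(plan, feedback)   # regenerate using analyst feedback
        profile = profiler(code)       # compile + benchmark in the sandbox
        verdict = analyst(profile)     # heuristics, LLM fallback
        if verdict.accepted:
            return code, profile
        feedback = verdict.feedback    # close the feedback loop
    return None, None                  # budget exhausted
```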
Start the FastAPI server for programmatic access:
```bash
# Start the server
PYTHONPATH=src uvicorn swiftsolve.main:app --host 127.0.0.1 --port 8000 --reload

# Server will be available at http://localhost:8000
# API documentation at http://localhost:8000/docs
```

```bash
curl -X POST "http://localhost:8000/solve" \
  -H "Content-Type: application/json" \
  -d '{
    "task_id": "example_1",
    "prompt": "Find the maximum element in an array",
    "constraints": {"runtime_limit": 2000, "memory_limit": 512},
    "unit_tests": [
      {"input": "5\n3 1 4 1 5", "output": "5"},
      {"input": "3\n10 20 30", "output": "30"}
    ]
  }'
```

```bash
# Full evaluation: 225 iterations (25 tasks × 3 seeds × 3 replans)
curl -X POST "http://localhost:8000/research/evaluate" \
  -H "Content-Type: application/json" \
  -d '{
    "seeds": [42, 123, 456],
    "max_workers": 4,
    "output_dir": "research_results"
  }'

# Mini evaluation: 18 iterations (2 tasks × 3 seeds × 3 replans)
curl -X POST "http://localhost:8000/mini-research/evaluate" \
  -H "Content-Type: application/json" \
  -d '{
    "seeds": [42, 123, 456],
    "max_workers": 2,
    "output_dir": "mini_results"
  }'
```

For direct command-line usage:
```bash
# Solve a single problem from a JSON file
python src/swiftsolve/main.py --task_json datasets/bigobench/task_bigobench_1718.json

# Run batch evaluation
python -m src.swiftsolve.evaluation.batch_runner --benchmark --seeds 42 123 456

# Test mini-run functionality
python test_mini_run.py

# Run built-in test suite
python dry_run_batch.py --tasks 5

# Run unit tests
python -m pytest src/swiftsolve/tests/
```

- BigO(Bench): 16 algorithmic tasks with known complexity requirements
- Codeforces: 10 competitive programming problems with strict constraints
- pass@k: Functional correctness across k attempts
- eff@k_runtime: Efficiency-optimized success rate for runtime constraints
- eff@k_memory: Efficiency-optimized success rate for memory constraints
- TLE/MLE rates: Time/Memory limit exceeded frequencies
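pass@k is conventionally computed with the unbiased estimator introduced with the HumanEval benchmark (Chen et al.); whether SwiftSolve uses exactly this estimator is an assumption, but the standard form is:

```python
# Unbiased pass@k estimator: given n samples per task of which c passed,
# estimate the probability that a random size-k subset of the n samples
# contains at least one passing sample. Illustrative; SwiftSolve's exact
# implementation may differ.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Compute 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)
```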
| Mode | Tasks | Iterations | Duration | Use Case |
|---|---|---|---|---|
| Mini-Run | 2 tasks | 18 iterations | ~5 minutes | Setup testing, development |
| Full Run | 25 tasks | 225 iterations | ~60 minutes | Research benchmarking |
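The iteration counts in the table follow directly from tasks × seeds × replans (3 seeds and 3 replans, as used by the evaluation endpoints above):

```python
# Iteration budget = tasks x seeds x replan attempts.
def iterations(tasks: int, seeds: int = 3, replans: int = 3) -> int:
    return tasks * seeds * replans

print(iterations(2))   # mini-run: 18
print(iterations(25))  # full run: 225
```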
```bash
# Quick setup verification
python test_mini_run.py

# Full research evaluation
curl -X POST "http://localhost:8000/research/evaluate" \
  -H "Content-Type: application/json" \
  -d '{"max_workers": 4}'
```

| Variable | Purpose | Required |
|---|---|---|
| `OPENAI_API_KEY` | GPT-4.1 access | Yes |
| `ANTHROPIC_API_KEY` | Claude 4 Opus access | Yes |
| `LOG_LEVEL` | Logging verbosity (DEBUG/INFO/WARNING/ERROR) | No |
| `PYTHONPATH` | Python import path | Yes |
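A minimal fail-fast check for the required variables in the table might look like this (illustrative only; SwiftSolve's own configuration handling may differ):

```python
# Report which of the required environment variables are unset or empty.
import os

REQUIRED = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "PYTHONPATH"]

def missing_vars(env=None):
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]

# Example: fail fast before starting the server
# if missing_vars():
#     raise SystemExit(f"Missing: {', '.join(missing_vars())}")
```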
SwiftSolve compiles and executes generated C++ code directly on your system, enforcing runtime and memory limits.
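One common way to enforce such limits on POSIX systems is to set resource limits in the child process before it executes. This is an illustrative sketch, not SwiftSolve's actual sandbox code:

```python
# POSIX-only sketch: run a command under CPU-time and address-space limits.
import resource
import subprocess

def run_limited(cmd, cpu_seconds=2, memory_mb=512):
    def apply_limits():
        # Applied in the child between fork and exec
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        limit = memory_mb * 1024 * 1024
        resource.setrlimit(resource.RLIMIT_AS, (limit, limit))
    return subprocess.run(
        cmd, preexec_fn=apply_limits,
        capture_output=True, text=True, timeout=cpu_seconds + 5,
    )

result = run_limited(["echo", "hello"])
```

A CPU-bound loop run this way is killed once it exceeds the CPU limit, which is how TLE-style failures can be detected.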
```bash
# Verify a C++ compiler is available
g++ --version
# or on macOS with Xcode
clang++ --version

# Adjust worker count based on your system
export MAX_WORKERS=4  # Default: 4

# Enable debug logging for troubleshooting
export LOG_LEVEL=DEBUG
```

```
swiftsolve/
├── src/swiftsolve/        # Core framework
│   ├── agents/            # Multi-agent components
│   │   ├── planner.py     # Claude 4 Opus planning
│   │   ├── coder.py       # GPT-4.1 code generation
│   │   ├── profiler.py    # Performance profiling
│   │   └── analyst.py     # Efficiency analysis
│   ├── api/               # FastAPI endpoints
│   ├── controller/        # Pipeline orchestration
│   ├── evaluation/        # Benchmarking infrastructure
│   ├── sandbox/           # Code execution environment
│   └── schemas/           # Data models and validation
├── datasets/              # Evaluation datasets
│   ├── bigobench/         # BigO(Bench) tasks
│   └── codeforces/        # Codeforces problems
├── baseline_test_api/     # Baseline evaluation results
├── test_*.py              # Test scripts
└── requirements.txt       # Dependencies
```
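For reference, the task JSON consumed by `--task_json` and `/solve` can be modeled roughly as below. The real data models live in `src/swiftsolve/schemas/` and may differ; the field names here are taken from the request examples above.

```python
# Rough model of the task format shown in the /solve examples; illustrative,
# not the actual schemas in src/swiftsolve/schemas/.
from dataclasses import dataclass, field

@dataclass
class UnitTest:
    input: str   # stdin fed to the compiled program
    output: str  # expected stdout

@dataclass
class Task:
    task_id: str
    prompt: str
    constraints: dict            # e.g. {"runtime_limit": 2000, "memory_limit": 512}
    unit_tests: list = field(default_factory=list)

    @classmethod
    def from_json(cls, data: dict) -> "Task":
        tests = [UnitTest(**t) for t in data.get("unit_tests", [])]
        return cls(data["task_id"], data["prompt"],
                   data.get("constraints", {}), tests)
```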
```bash
# Test single problem solving
python src/swiftsolve/main.py --task_json src/swiftsolve/test.json

# Start the server in the background
PYTHONPATH=src uvicorn swiftsolve.main:app --host 127.0.0.1 --port 8000 &

# Test health endpoint
curl http://localhost:8000/healthz

# Test solve endpoint
curl -X POST "http://localhost:8000/solve" \
  -H "Content-Type: application/json" \
  -d '{"task_id": "test", "prompt": "Add two numbers", "constraints": {"runtime_limit": 1000}, "unit_tests": [{"input": "2 3", "output": "5"}]}'

# Run mini evaluation (18 iterations, ~5 minutes)
python test_mini_run.py
```

1. Import Errors
```bash
# Ensure PYTHONPATH is set correctly
export PYTHONPATH="${PWD}/src"
```

2. C++ Compiler Issues

```bash
# On macOS, install Xcode command line tools
xcode-select --install

# On Ubuntu/Debian
sudo apt-get install build-essential

# On CentOS/RHEL
sudo yum groupinstall "Development Tools"
```

3. API Key Issues

```bash
# Verify API keys are set
echo $OPENAI_API_KEY
echo $ANTHROPIC_API_KEY
```

4. Port Already in Use

```bash
# Use a different port
PYTHONPATH=src uvicorn swiftsolve.main:app --host 127.0.0.1 --port 8001
```

5. Memory Issues During Evaluation

```bash
# Reduce worker count
curl -X POST "http://localhost:8000/research/evaluate" \
  -H "Content-Type: application/json" \
  -d '{"max_workers": 2}'
```

Enable detailed logging for troubleshooting:
```bash
export LOG_LEVEL=DEBUG
PYTHONPATH=src uvicorn swiftsolve.main:app --host 127.0.0.1 --port 8000
```

- Mini-Run (18 iterations): ~5 minutes
- Full Evaluation (225 iterations): ~60 minutes
- Single Problem: ~15-30 seconds
- Success Rate: 60-80% (varies by dataset and constraints)
- CPU: 4+ cores recommended for parallel evaluation
- RAM: 8GB+ recommended
- Storage: 2GB+ for datasets and results
- Network: Stable internet for API calls
- Setup Environment: Follow Quick Start guide
- Run Mini Evaluation: Verify setup with `python test_mini_run.py`
- Full Evaluation: Run the complete benchmark with the research endpoint
- Analyze Results: Check generated JSON files in output directory
```bash
# Custom task evaluation
python -m src.swiftsolve.evaluation.batch_runner \
  --custom-tasks datasets/my_tasks/ \
  --seeds 42 123 456 789 \
  --output-dir custom_results

# Run baseline evaluation
curl -X POST "http://localhost:8000/baseline/evaluate" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "max_workers": 4}'
```

If you use SwiftSolve in your research, please cite:
```bibtex
@article{swiftsolve2025,
  title={SwiftSolve: Multi-Agent Code Generation with Efficiency Optimization},
  author={Your Name and Collaborators},
  journal={NeurIPS},
  year={2025}
}
```

We welcome contributions! Please see our contributing guidelines and submit pull requests for any improvements.
This project is licensed under the MIT License - see the LICENSE file for details.
For questions, issues, or support:
- Check the troubleshooting section above
- Review existing issues on GitHub
- Create a new issue with detailed information about your setup and problem
Note: This framework requires API access to OpenAI GPT-4 and Anthropic Claude. Ensure you have valid API keys and sufficient credits before running evaluations.