A comprehensive algorithmic trading platform combining AI-powered strategy generation with automated data quality assurance and QuantConnect LEAN integration.
LeanBacktester transforms natural language trading strategy descriptions into executable C# or Python algorithms, validates data quality, and runs backtests through QuantConnect LEAN.
- Convert text descriptions into complete QuantConnect algorithms (C# or Python)
- Automatic compilation verification and iterative error fixing (C#)
- Data requirements analysis and extraction
- Professional project structure creation
- Support for both C# and Python algorithm generation
- AI-powered analysis of downloaded market data
- Statistical validation and integrity checks
- Automated quality scoring and recommendations
- Support for CSV, Parquet, HDF5, and other formats
- Multi-source data acquisition (Alpaca, Binance, Polygon, Databento, etc.)
- Interactive command-line interface
- Automatic LEAN format conversion
- Rate limiting and error handling
- Full QuantConnect LEAN integration
- C# and Python algorithm support
- Automated project setup
- Performance analysis and reporting
- Clone the repository:
git clone https://github.com/arithmax-research/LeanBacktester.git
cd LeanBacktester
- Install dependencies:
pip install -r requirements.txt
cd data_pipeline && pip install -r requirements.txt && cd ..
- Install LEAN CLI:
pip install lean
- Configure Gemini API:
# Copy the example file
cp .env.example .env
# Edit .env and add your Gemini API key:
GEMINI_API_KEY=your_gemini_api_key_here
Getting a Gemini API Key:
- Visit Google AI Studio
- Sign in with your Google account
- Click "Create API Key"
- Copy the key and add it to your .env file
Note: The Gemini API offers a free tier, which is enough for generating trading strategies.
Create a strategy from a text description in either Python or C#:
Generate Python Algorithm:
# Create strategy file
echo "Buy SPY when RSI < 30, sell when RSI > 70" > strategy_prompts/my_strategy.txt
# Generate Python algorithm
python rag_agent.py strategy_prompts/my_strategy.txt --lang python
Generate C# Algorithm (default):
# Generate C# algorithm (default behavior)
python rag_agent.py strategy_prompts/my_strategy.txt --lang csharp
# or simply
python rag_agent.py strategy_prompts/my_strategy.txt
The system will:
- Generate complete Python or C# code
- Extract data requirements
- Verify compilation (C# only)
- Create LEAN project structure
- Validate Python syntax (Python only)
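The Python-only syntax validation step can be approximated with the standard library alone. The sketch below is an assumption about how such a check might look, not the code used in rag_agent.py; the helper name `check_python_syntax` is hypothetical.

```python
import ast


def check_python_syntax(source):
    """Return (True, None) if the source parses, else (False, message).

    Hypothetical helper: it only checks that the code is syntactically
    valid Python, not that it is a working LEAN algorithm.
    """
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return False, f"line {exc.lineno}: {exc.msg}"
    return True, None


if __name__ == "__main__":
    ok, error = check_python_syntax("def initialize(self):\n    pass\n")
    print(ok, error)
```

A check like this catches malformed code early but says nothing about runtime errors, which only surface during a backtest.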
Use the interactive data pipeline:
# Launch data downloader
python data_pipeline/interactive.py
# Or download directly
cd data_pipeline
python main.py --source alpaca --equity-symbols AAPL MSFT --resolution daily
Analyze downloaded data:
# Quality check all data files
python data_quality_checker.py
Execute strategies in LEAN:
# Navigate to generated strategy
cd arithmax-strategies/YourStrategyName
# Run backtest
lean backtest
# Generate report
lean report
LeanBacktester/
├── rag_agent.py # AI strategy generation
├── deep_seek_coder.py # DeepSeek API integration
├── data_quality_checker.py # Data quality analysis
├── arithmax-strategies/ # Generated strategies
├── data_pipeline/ # Data acquisition system
│ ├── interactive.py # Interactive downloader
│ ├── main.py # Command-line pipeline
│ └── [downloaders]/ # Data source modules
├── data/ # Downloaded market data
├── strategy_prompts/ # Strategy descriptions
├── requirements.txt # Python dependencies
└── README.md # This file
Required for full functionality:
- DEEP_SEEK_API: AI strategy generation
- ALPACA_API_KEY/SECRET: US equity data
- BINANCE_API_KEY/SECRET: Cryptocurrency data
Initialize LEAN environment:
lean init
lean login
Python Strategy Generation:
# Generate Python RSI mean reversion strategy
python rag_agent.py strategy_prompts/simple_rsi_python.txt --lang python
# Generate Python EMA crossover strategy
python rag_agent.py strategy_prompts/ema_crossover_python.txt --lang python
# Custom Python strategy
echo "Buy when price > 200-day MA, sell when price < 200-day MA" > momentum.txt
python rag_agent.py momentum.txt --lang python
C# Strategy Generation (default):
# Generate C# momentum strategy (default behavior)
echo "Buy when price > 200-day MA, sell when price < 200-day MA" > momentum.txt
python rag_agent.py momentum.txt --lang csharp
# or simply
python rag_agent.py momentum.txt
# Download tech stocks
python data_pipeline/interactive.py
# Select: alpaca -> AAPL, MSFT, GOOGL -> daily
# Check all downloaded data
python data_quality_checker.py
# Review AI-generated quality reports
# Run generated Python strategy
cd arithmax-strategies/RsiMeanReversionStrategy
lean backtest
# Run generated C# strategy (from the repository root)
cd arithmax-strategies/MomentumStrategy
lean backtest --start 20200101 --end 20241231
The repository includes example strategy descriptions to help you get started:
Python Examples:
- simple_rsi_python.txt - RSI Mean Reversion Strategy
- Trades SPY based on RSI(14) signals
- Buys when RSI < 30 (oversold)
- Sells when RSI > 70 (overbought)
- Simple mean reversion approach
python rag_agent.py strategy_prompts/simple_rsi_python.txt --lang python
- ema_crossover_python.txt - EMA Crossover Momentum Strategy
- Trades QQQ using EMA crossovers
- Fast EMA: 20-period, Slow EMA: 50-period
- Buys on bullish crossover (fast > slow)
- Sells on bearish crossover (fast < slow)
- Classic trend-following strategy
python rag_agent.py strategy_prompts/ema_crossover_python.txt --lang python
These examples demonstrate different trading approaches and serve as templates for creating your own strategies.
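To make the crossover rule described for ema_crossover_python.txt concrete, here is a minimal plain-Python sketch of the signal logic. It is an illustration under stated assumptions, not the algorithm the generator produces: the function names are hypothetical, the EMA is seeded with the first price (LEAN's own indicator warm-up may differ), and no orders are placed.

```python
def ema(values, period):
    """Exponential moving average with smoothing factor 2 / (period + 1).

    Assumption: the series is seeded with its first value.
    """
    alpha = 2.0 / (period + 1)
    out = [values[0]]
    for value in values[1:]:
        prev = out[-1]
        out.append(prev + alpha * (value - prev))
    return out


def crossover_signals(prices, fast=20, slow=50):
    """Return (index, 'buy' | 'sell') for bars where the fast EMA crosses the slow EMA."""
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    signals = []
    for i in range(1, len(prices)):
        prev_diff = fast_ema[i - 1] - slow_ema[i - 1]
        diff = fast_ema[i] - slow_ema[i]
        if prev_diff <= 0 < diff:
            signals.append((i, "buy"))   # bullish crossover: fast moves above slow
        elif prev_diff >= 0 > diff:
            signals.append((i, "sell"))  # bearish crossover: fast moves below slow
    return signals


if __name__ == "__main__":
    prices = [10, 10, 10, 10, 10, 11, 12, 13]
    print(crossover_signals(prices, fast=2, slow=4))
```

The defaults of 20 and 50 periods mirror the example description; the short periods in the demo only keep the sample series small.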
The strategy generator supports both Python and C# for QuantConnect/LEAN algorithms. Choose the language that best fits your workflow:
Python advantages:
- Simpler syntax, easier to read and modify
- No compilation step (faster generation)
- Rich ecosystem of data science libraries
- Ideal for rapid prototyping and research
Folder Structure:
strategy_name/
├── main.py # Algorithm entry point
├── config.json # LEAN configuration
└── research.ipynb # Jupyter notebook
Generation:
python rag_agent.py strategy_prompts/my_strategy.txt --lang python
C# advantages:
- Better performance for complex strategies
- Stronger type safety
- Compilation catches errors early
- More examples in QuantConnect documentation
Folder Structure:
strategy_name/
├── Main.cs # Algorithm entry point
├── strategy_name.csproj # .NET project file
├── config.json # LEAN configuration
├── Research.ipynb # Jupyter notebook
├── bin/ # Build output
└── obj/ # Build intermediates
Generation (default):
python rag_agent.py strategy_prompts/my_strategy.txt --lang csharp
# or simply
python rag_agent.py strategy_prompts/my_strategy.txt
Note: C# is the default language for backward compatibility with existing scripts.
- Python 3.8+
- .NET 6.0+ SDK (for C# strategies)
- QuantConnect LEAN CLI
- API keys for data sources
Licensed under the terms in the LICENSE file.
For educational and research purposes only. Not financial advice.
- Set up LEAN CLI (follow LEAN CLI Setup above)
- Set up the data pipeline:
cd data_pipeline
chmod +x setup.sh
./setup.sh
- Configure API keys (create a .env file in the data_pipeline directory):
# Alpaca API keys (required for equity data)
ALPACA_API_KEY=your_alpaca_api_key
ALPACA_SECRET_KEY=your_alpaca_secret_key
# Binance API keys (optional for public crypto data)
BINANCE_API_KEY=your_binance_api_key
BINANCE_SECRET_KEY=your_binance_secret_key
# Polygon API key (premium market data)
POLYGON_API_KEY=your_polygon_api_key
# Databento API key (institutional-grade data)
DATABENTO_API_KEY=your_databento_api_key
- Test the setup:
python test_setup.py
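If you want a quick check of your own before running the test script, the sketch below reports which keys are missing. It is not the contents of test_setup.py; the helper name and the required/optional split are assumptions based on the comments in the .env example above, and it presumes the variables are already exported or loaded into the environment.

```python
import os

# Assumption: only the Alpaca keys are treated as required, following the
# ".env" comments; the other providers are treated as optional here.
REQUIRED_KEYS = ("ALPACA_API_KEY", "ALPACA_SECRET_KEY")
OPTIONAL_KEYS = (
    "BINANCE_API_KEY",
    "BINANCE_SECRET_KEY",
    "POLYGON_API_KEY",
    "DATABENTO_API_KEY",
)


def missing_keys(env, keys):
    """Return the names in keys that are absent or blank in the env mapping."""
    return [key for key in keys if not env.get(key, "").strip()]


if __name__ == "__main__":
    for name in missing_keys(os.environ, REQUIRED_KEYS):
        print(f"missing required key: {name}")
    for name in missing_keys(os.environ, OPTIONAL_KEYS):
        print(f"optional key not set: {name}")
```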
- Visit Alpaca Markets
- Create a free account
- Navigate to your dashboard and generate API keys
- Use paper trading keys for testing purposes
- Visit Binance
- Create an account
- Generate API keys in your account settings
- Note: Public data can be accessed without API keys
- Visit Polygon.io
- Sign up for an account (free tier available)
- Navigate to your dashboard to get your API key
- Supports stocks, options, forex, and crypto data
- Free tier: 5 API calls per minute, paid plans for higher limits
- Visit Databento
- Create an account (requires approval for institutional use)
- Generate API credentials in your account settings
- Provides tick-level data for equities, futures, and options
- Subscription-based pricing for professional data feeds
Download equity data from Alpaca:
cd data_pipeline
python main.py --source alpaca --equity-symbols AAPL GOOGL MSFT --resolution daily
Download cryptocurrency data from Binance:
python main.py --source binance --crypto-symbols BTCUSDT ETHUSDT --resolution minute
Download from Polygon:
python main.py --source polygon --equity-symbols AAPL TSLA --resolution minute
Download from Databento:
python main.py --source databento --equity-symbols SPY QQQ --resolution tick
Download from multiple sources:
python main.py --source all --start-date 2023-01-01 --end-date 2023-12-31
Initialize a new algorithm project:
# Create a new algorithm project
lean create-project --language python "MyAlgorithm"
# or for C#
lean create-project --language csharp "MyAlgorithm"
Download sample data (free datasets):
# Download sample equity data
lean data download --dataset "US Equity Security Master"
# Download sample crypto data (if available)
lean data download --dataset "Crypto Price Data"
Backtest an algorithm:
# Backtest a Python algorithm
cd Sample_Strategies/Python_Algorithms/DiversifiedLeverage
lean backtest
# Backtest a C# algorithm
cd Sample_Strategies/C#_Algorithms/DiversifiedLeverage
lean backtest
Live trading setup (requires QuantConnect subscription):
# Deploy algorithm for live trading
lean live deploy
Research environment:
# Start Jupyter research environment
lean research
Python Strategies:
Diversified Leverage Strategy:
cd Sample_Strategies/Python_Algorithms/DiversifiedLeverage
lean backtest
# Optional: specify custom config
lean backtest --config custom-config.json
Bollinger Bands Mean Reversion Strategy:
cd Sample_Strategies/Python_Algorithms/BollingerBandsMeanReversion
lean backtest
# Uses Bollinger Bands and RSI for mean reversion signals
Cryptocurrency Momentum Strategy:
cd Sample_Strategies/Python_Algorithms/CryptoMomentum
lean backtest
# Multi-indicator momentum system for cryptocurrencies
C# Strategies:
Diversified Leverage Strategy:
cd Sample_Strategies/C#_Algorithms/DiversifiedLeverage
lean backtest
Momentum Mean Reversion Strategy:
cd Sample_Strategies/C#_Algorithms/MomentumMeanReversion
lean backtest
# Combines momentum and mean reversion signals
Pairs Trading Strategy:
cd Sample_Strategies/C#_Algorithms/PairsTrading
lean backtest
# Statistical arbitrage using correlated asset pairs
Sector Rotation Strategy:
cd Sample_Strategies/C#_Algorithms/SectorRotation
lean backtest
# Tactical asset allocation rotating between sector ETFs
Market Making Strategy:
cd Sample_Strategies/C#_Algorithms/MarketMaking
lean backtest
# Liquidity provision strategy that captures bid-ask spreads
Custom data integration:
# Use your downloaded data with LEAN
lean backtest --data-folder ./data
Optimization:
# Run parameter optimization
lean optimize --target "Sharpe Ratio" --target-direction max
Cloud backtesting (requires QuantConnect subscription):
# Run backtest in the cloud with more data
lean cloud backtest
Results analysis:
# Generate detailed backtest report
lean report
Launch the interactive visualizer:
chmod +x launch_visualizer.sh
./launch_visualizer.sh
This will start a Streamlit web application at http://localhost:8501 where you can:
- Upload and analyze backtest results
- View interactive charts and performance metrics
- Compare multiple strategies
- Export analysis reports
# Clone and setup AlgoForge
git clone https://github.com/FranklineMisango/AlgoForge.git
cd AlgoForge
# Install and configure LEAN CLI
pip install lean
lean login # Optional but recommended
lean init
# Setup data pipeline
cd data_pipeline
./setup.sh
# Configure API keys in data_pipeline/.env
# Download market data
python main.py --source all --start-date 2020-01-01 --end-date 2024-12-31
# Verify data format for LEAN
ls -la data/ # Check downloaded data
# Start with a sample strategy
cd Sample_Strategies/Python_Algorithms/DiversifiedLeverage
# Or create a new algorithm
lean create-project --language python "MyStrategy"
cd MyStrategy
# Run backtest
lean backtest --verbose
# Generate detailed report
lean report
# Visualize results
cd ../../../
./launch_visualizer.sh
# Optimize parameters
lean optimize --target "Sharpe Ratio" --target-direction max
# Test with different datasets
lean backtest --start 20180101 --end 20201231 # Bear market period
lean backtest --start 20200301 --end 20211231 # Bull market period
# Deploy for paper trading first
lean live deploy --environment "paper"
# Deploy for live trading (requires broker integration)
lean live deploy --environment "live"
AlgoForge/
├── README.md # This file
├── LICENSE # Project license
├── backtest_visualizer.py # Interactive visualization tool
├── launch_visualizer.sh # Visualizer launcher script
├── data_pipeline/ # Data acquisition and processing
│ ├── main.py # Main pipeline script
│ ├── alpaca_downloader.py # Alpaca data downloader
│ ├── binance_downloader.py # Binance data downloader
│ ├── polygon_downloader.py # Polygon data downloader
│ ├── databento_downloader.py # Databento data downloader
│ ├── config.py # Configuration settings
│ ├── setup.sh # Automated setup script
│ └── requirements.txt # Python dependencies
└── Sample_Strategies/ # Example trading strategies
    ├── Python_Algorithms/ # Python-based strategies
    │   ├── DiversifiedLeverage/ # Diversified leverage ETF strategy
    │   ├── BollingerBandsMeanReversion/ # Bollinger Bands mean reversion strategy
    │   └── CryptoMomentum/ # Cryptocurrency momentum trading strategy
    └── C#_Algorithms/ # C#-based strategies
        ├── DiversifiedLeverage/ # Diversified leverage ETF strategy
        ├── MomentumMeanReversion/ # Combined momentum and mean reversion strategy
        ├── PairsTrading/ # Statistical arbitrage pairs trading strategy
        ├── SectorRotation/ # Tactical asset allocation sector rotation strategy
        └── MarketMaking/ # Liquidity provision market making strategy
Edit data_pipeline/config.py to customize:
- Default symbols to download
- Date ranges
- Data resolutions
- Output paths
- API rate limits
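As an illustration of what an API rate limit setting controls, the sketch below implements a simple rolling-window limiter. It is not the pipeline's actual implementation; the class name and parameters are hypothetical, and the clock and sleep functions are injectable only so the behavior can be tested without waiting.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most max_calls within any rolling window of period seconds.

    Hypothetical helper, not part of data_pipeline/config.py.
    """

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self._clock()
        # Drop timestamps that have left the rolling window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))
            now = self._clock()
            self._calls.popleft()
        self._calls.append(now)


if __name__ == "__main__":
    limiter = RateLimiter(max_calls=5, period=60)
    limiter.wait()  # call this before each API request
```

With `max_calls=5` and `period=60`, this matches the 5-calls-per-minute figure quoted for the Polygon free tier; other providers would need their own values.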
lean.json: Main LEAN configuration file created by lean init
{
  "algorithm-type-name": "BasicTemplateAlgorithm",
  "algorithm-language": "Python",
  "algorithm-location": "main.py",
  "data-folder": "./data",
  "debugging": false,
  "debugging-method": "PTVSD",
  "log-handler": "ConsoleLogHandler",
  "messaging-handler": "QueueHandler",
  "job-queue-handler": "JobQueue",
  "api-handler": "LocalDiskApiHandler",
  "map-file-provider": "LocalDiskMapFileProvider",
  "factor-file-provider": "LocalDiskFactorFileProvider",
  "data-provider": "DefaultDataProvider",
  "alpha-handler": "DefaultAlphaHandler",
  "object-store": "LocalObjectStore",
  "data-channel-provider": "DataChannelProvider"
}
config.json: Algorithm-specific configuration for each strategy
{
  "algorithm-language": "Python",
  "parameters": {
    "custom-parameter": "value"
  },
  "description": "Strategy description",
  "local-id": 123456789
}
Each sample strategy includes a config.json file where you can modify:
Common Parameters:
- Portfolio weights and rebalancing frequency
- Risk management parameters (stop-loss, take-profit)
- Technical indicator periods and thresholds
- Position sizing and exposure limits
- Backtesting date ranges
Strategy-Specific Examples:
- Diversified Leverage: Portfolio weights, rebalancing periods
- Momentum Mean Reversion: RSI periods, moving average lengths, volatility thresholds
- Pairs Trading: Correlation thresholds, z-score entry/exit levels, lookback periods
- Sector Rotation: Momentum periods, sector allocation limits, market regime parameters
- Bollinger Bands: Band periods, standard deviations, volume confirmation settings
- Crypto Momentum: MACD parameters, momentum thresholds, trailing stop percentages
- Market Making: Spread targets, inventory limits, quote refresh intervals, volatility adjustments
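When tuning these values outside of LEAN (for example in a notebook or a helper script), the "parameters" object of a config.json can be read with the standard library. The sketch below is illustrative only: the function name and the parameter names in the demo are hypothetical, and inside a LEAN algorithm parameters are normally read through the algorithm's own GetParameter method rather than by parsing the file.

```python
import json


def load_parameters(config_text, defaults=None):
    """Merge the 'parameters' object of a config.json document over defaults.

    Hypothetical helper: values are returned exactly as stored in the file
    (strings stay strings), so callers convert types themselves.
    """
    config = json.loads(config_text)
    merged = dict(defaults or {})
    merged.update(config.get("parameters", {}))
    return merged


if __name__ == "__main__":
    text = '{"algorithm-language": "Python", "parameters": {"rsi-period": "14"}}'
    params = load_parameters(text, defaults={"rsi-period": "10", "stop-loss": "0.05"})
    print(int(params["rsi-period"]), float(params["stop-loss"]))
```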
Data Pipeline (.env):
# Alpaca API credentials
ALPACA_API_KEY=your_key_here
ALPACA_SECRET_KEY=your_secret_here
# Binance API credentials
BINANCE_API_KEY=your_key_here
BINANCE_SECRET_KEY=your_secret_here
# Polygon API credentials
POLYGON_API_KEY=your_key_here
# Databento API credentials
DATABENTO_API_KEY=your_key_here
LEAN Environment:
# QuantConnect credentials (optional)
QC_USER_ID=your_user_id
QC_API_TOKEN=your_api_token
# Custom data paths
LEAN_DATA_FOLDER=/path/to/data
LEAN_RESULTS_FOLDER=/path/to/results
The pipeline converts data to LEAN's standard format:
- Equity data: YYYYMMDD HH:mm,open,high,low,close,volume
- Crypto data: YYYYMMDD HH:mm,open,high,low,close,volume
- Tick data: YYYYMMDD HH:mm:ss.fff,price,size (from Polygon/Databento)
- Options data: YYYYMMDD HH:mm,open,high,low,close,volume,open_interest
- Compression: Automatic ZIP compression for storage efficiency
- Timezone: UTC for consistency across markets
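The equity and crypto row layout listed above can be sketched as a small formatting helper. This is an illustration of the stated layout, not the pipeline's converter: the function name is hypothetical, and the exact columns LEAN expects may vary by asset class and resolution.

```python
from datetime import datetime, timedelta, timezone


def to_bar_line(ts, open_, high, low, close, volume):
    """Format one OHLCV bar as 'YYYYMMDD HH:mm,open,high,low,close,volume' in UTC.

    Hypothetical helper; requires a timezone-aware timestamp so the
    conversion to UTC is unambiguous.
    """
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    stamp = ts.astimezone(timezone.utc).strftime("%Y%m%d %H:%M")
    return f"{stamp},{open_},{high},{low},{close},{volume}"


if __name__ == "__main__":
    # 09:30 at UTC-5 becomes 14:30 UTC in the output row.
    eastern = timezone(timedelta(hours=-5))
    bar_time = datetime(2023, 1, 3, 9, 30, tzinfo=eastern)
    print(to_bar_line(bar_time, 130.28, 130.9, 124.17, 125.07, 112117500))
```

Rejecting naive timestamps is a deliberate choice here: silently assuming a timezone is a common source of misaligned bars.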
python main.py [options]
Options:
--source {alpaca,binance,polygon,databento,all} Data source selection
--equity-symbols SYMBOL [SYMBOL ...] Equity symbols to download
--crypto-symbols SYMBOL [SYMBOL ...] Crypto symbols to download
--start-date YYYY-MM-DD Start date for data download
--end-date YYYY-MM-DD End date for data download
--resolution {tick,minute,hour,daily} Data resolution
--test Run with test data
Project Management:
lean create-project --language python "ProjectName" # Create new project
lean create-project --language csharp "ProjectName" # Create C# project
lean delete-project "ProjectName" # Delete project
Data Management:
lean data download # Download sample data
lean data download --dataset "dataset-name" # Download specific dataset
lean data clear-cache # Clear data cache
lean data list # List available datasets
Backtesting:
lean backtest # Run backtest
lean backtest --verbose # Verbose logging
lean backtest --config config.json # Custom config
lean backtest --start 20200101 --end 20231231 # Custom date range
lean backtest --output results.json # Save results
Optimization:
lean optimize # Run optimization
lean optimize --target "Sharpe Ratio" # Optimize specific metric
lean optimize --config optimization.json # Custom optimization config
Research:
lean research # Start Jupyter research environment
lean research --port 8888 # Custom port
Live Trading (requires subscription):
lean live deploy # Deploy for live trading
lean live stop # Stop live trading
lean live status # Check status
Cloud Features (requires subscription):
lean cloud backtest # Cloud backtest
lean cloud optimize # Cloud optimization
lean cloud live deploy # Cloud live trading
Configuration:
lean config list # Show current config
lean config set key value # Set config value
lean config get key # Get config value
lean init # Initialize configuration
lean login # Login to QuantConnect
lean logout # Logout
Utilities:
lean --version # Show version
lean --help # Show help
lean doctor # Diagnose issues
lean logs # Show logs
lean report # Generate report from results
- API Key Errors: Ensure your API keys are correctly set in the .env file
- Rate Limiting: If you encounter rate limits, increase the delay in config.py
- Data Format Issues: Verify that downloaded data matches LEAN's expected format
- Permission Errors: Make sure scripts have execute permissions (chmod +x)
- LEAN CLI Issues: Run lean doctor to diagnose common setup problems
- Docker Issues: Ensure Docker is running for LEAN backtesting
- .NET Issues: Verify .NET 6.0 SDK is installed for C# algorithms
LEAN not found:
# Reinstall LEAN CLI
pip uninstall lean
pip install lean
# Verify installation
lean --version
Docker issues:
# Check Docker status
docker --version
docker ps
# Pull LEAN Docker image manually if needed
docker pull quantconnect/lean:latest
Configuration issues:
# Reset LEAN configuration
lean init --reset
# Check current configuration
lean config list
Data issues:
# Clear LEAN cache
lean data clear-cache
# Re-download sample data
lean data download
Backtest failures:
# Run with verbose logging
lean backtest --verbose
# Check logs
lean logs
- Check the detailed documentation in data_pipeline/README.md
- Review the setup guide in data_pipeline/SETUP_GUIDE.md
- Run the test setup script to verify your configuration
- Check the sample strategies for implementation examples
- Visit the QuantConnect Documentation for LEAN-specific help
- Use lean --help for command-specific documentation
Contributions are welcome! Please feel free to submit pull requests, report bugs, or suggest new features.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the terms specified in the LICENSE file.
This software is for educational and research purposes only. Past performance does not guarantee future results. Always test strategies thoroughly before using real capital.
- QuantConnect for the LEAN algorithmic trading engine
- Alpaca Markets for providing free equity data API
- Binance for cryptocurrency market data
- The open-source community for various dependencies and tools