- Research and production toolkit for ML-based trading strategies
- Designed for quants, data scientists, and ML engineers
- Time‑aware evaluation, configurable features, reproducible experiments
- 📊 Data: YFinance/AlphaVantage loaders, caching, validation
- 🧪 Features: technical indicators, custom signals, optional LLM sentiment
- 🤖 Models: ensemble VotingClassifier (LR, RF, XGBoost) with Optuna tuning
- 📈 Backtesting: execution costs, slippage, liquidity limits, adaptive impact, intrabar fills, borrow fees
- 🛡️ Risk management: drawdown and turnover guards
- 📟 Live trading: real-time position manager with intraday risk controls
- 🔌 Streaming: plugin-ready provider framework with automatic discovery and health monitoring
- 🛠️ CLI: end‑to‑end pipeline, evaluation, and model backtest in one place
- Install

  ```bash
  git clone https://github.com/AKKI0511/QuantTradeAI.git
  cd QuantTradeAI
  poetry install
  ```

- Run the pipeline

  ```bash
  # Show commands
  poetry run quanttradeai --help

  # Fetch OHLCV for configured symbols
  poetry run quanttradeai fetch-data -c config/model_config.yaml

  # Train (features → CV tuning → model → artifacts)
  poetry run quanttradeai train -c config/model_config.yaml
  ```

- Evaluate and backtest a saved model

  ```bash
  # Evaluate a trained model on current data
  # (use a path under models/experiments/<timestamp>/<SYMBOL> or your own models/trained/<SYMBOL>)
  poetry run quanttradeai evaluate -m models/experiments/<timestamp>/<SYMBOL> -c config/model_config.yaml

  # Backtest a saved model on the configured test window (with execution costs and optional risk halts)
  poetry run quanttradeai backtest-model -m models/experiments/<timestamp>/<SYMBOL> -c config/model_config.yaml -b config/backtest_config.yaml --risk-config config/risk_config.yaml
  ```

Artifacts are written to:

- `models/experiments/<timestamp>/` (models + `results.json`)
- `reports/backtests/<timestamp>/<SYMBOL>/` (`metrics.json`, `equity_curve.csv`, `ledger.csv`)
- `config/model_config.yaml`: symbols, date ranges, caching, training, trading
- `config/features_config.yaml`: pipeline steps, indicators, selection, sentiment
- `config/backtest_config.yaml`: execution costs, slippage, liquidity
- `config/risk_config.yaml`: drawdown protection and turnover limits
- `config/streaming.yaml`: providers, auth, subscriptions (optional)
- `config/position_manager.yaml`: live position tracking and impact parameters
Pass `--risk-config path/to/risk.yaml` to `poetry run quanttradeai backtest-model` to enforce the configured drawdown guard during CLI backtests. If the file is omitted or missing, the backtest proceeds without halts.
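The drawdown guard halts trading once equity falls too far below its running peak. A minimal sketch of that check (a hypothetical helper for illustration; the actual guard lives in the risk module and is driven by `config/risk_config.yaml`):

```python
def should_halt(equity_curve: list[float], max_drawdown: float = 0.2) -> bool:
    """Return True when the latest equity point breaches the drawdown limit.

    Hypothetical illustration of a drawdown guard: drawdown is measured
    against the running peak of the equity curve.
    """
    peak = equity_curve[0]
    for equity in equity_curve:
        peak = max(peak, equity)
    drawdown = 1.0 - equity_curve[-1] / peak
    return drawdown >= max_drawdown
```

With `max_drawdown: 0.2`, an equity path of 100 → 120 → 95 breaches the guard (about 21% off the peak), while 100 → 110 → 105 does not.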
Time‑aware evaluation rules:
- If `data.test_start` and `data.test_end` are set: train = dates < `test_start`; test = `test_start` ≤ dates ≤ `test_end`
- If only `data.test_start` is set: train = dates < `test_start`; test = dates ≥ `test_start`
- Otherwise: the last `training.test_size` fraction is used chronologically (no shuffle)
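The split rules above can be sketched as a small pandas helper (a hypothetical illustration of the logic, not the package's actual implementation):

```python
import pandas as pd


def time_aware_split(df: pd.DataFrame, test_start=None, test_end=None,
                     test_size: float = 0.2):
    """Chronological train/test split mirroring the rules above (sketch)."""
    df = df.sort_index()
    if test_start is not None and test_end is not None:
        train = df[df.index < test_start]
        test = df[(df.index >= test_start) & (df.index <= test_end)]
    elif test_start is not None:
        train = df[df.index < test_start]
        test = df[df.index >= test_start]
    else:
        # Fallback: last `test_size` fraction, in time order, no shuffling
        split = int(len(df) * (1 - test_size))
        train, test = df.iloc[:split], df.iloc[split:]
    return train, test
```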
See docs for details: Configuration Guide, Quick Reference.
```bash
poetry run quanttradeai fetch-data -c config/model_config.yaml    # Download + cache data
poetry run quanttradeai train -c config/model_config.yaml         # End-to-end training pipeline
poetry run quanttradeai evaluate -m <model_dir> -c config/model_config.yaml   # Evaluate a saved model
poetry run quanttradeai backtest -c config/backtest_config.yaml   # CSV backtest (uses data_path)
poetry run quanttradeai backtest-model -m <model_dir> -c config/model_config.yaml -b config/backtest_config.yaml --risk-config config/risk_config.yaml
poetry run quanttradeai validate-config                           # Preflight validation for all YAML configs
poetry run quanttradeai live-trade --url wss://example -c config/model_config.yaml
```

```python
from quanttradeai import DataLoader, DataProcessor, MomentumClassifier

loader = DataLoader("config/model_config.yaml")
processor = DataProcessor("config/features_config.yaml")
clf = MomentumClassifier("config/model_config.yaml")

data = loader.fetch_data()
df = processor.process_data(data["AAPL"])  # feature pipeline
df = processor.generate_labels(df)         # forward returns → labels

X, y = clf.prepare_data(df)
clf.train(X, y)
```

- Copy `.env.example` to `.env` and fill values:

  ```bash
  cp .env.example .env
  ```

Required vs optional keys:
- Required only if the related feature/provider is used.
- Examples:
  - LLM Sentiment (provider-dependent):
    - `OPENAI_API_KEY` (required if `provider: openai`)
    - `ANTHROPIC_API_KEY` (required if `provider: anthropic`)
    - `HUGGINGFACE_API_KEY` (required if `provider: huggingface`)
  - Streaming providers with `auth_method: api_key`:
    - Alpaca: `ALPACA_API_KEY`, `ALPACA_API_SECRET` (required if used)
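The "required only if used" convention amounts to a provider-to-variable lookup, sketched below (a hypothetical helper for illustration; the package resolves these variables internally):

```python
import os


def resolve_api_key(provider: str) -> str:
    """Look up the API key env var for a configured sentiment provider.

    Hypothetical sketch of the provider → env-var mapping described above.
    """
    env_vars = {
        "openai": "OPENAI_API_KEY",
        "anthropic": "ANTHROPIC_API_KEY",
        "huggingface": "HUGGINGFACE_API_KEY",
    }
    var = env_vars[provider]
    key = os.environ.get(var)
    if key is None:
        raise RuntimeError(f"{var} is required when provider: {provider} is configured")
    return key
```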
Configure in `config/features_config.yaml`:

```yaml
sentiment:
  enabled: true
  provider: openai
  model: gpt-3.5-turbo
  api_key_env_var: OPENAI_API_KEY
```

Export the key and run the pipeline. A `sentiment_score` column is added when a text column exists. See docs/llm-sentiment.md.
- CLI:

  ```bash
  poetry run quanttradeai live-trade -m models/experiments/<run>/AAPL --config config/model_config.yaml --streaming-config config/streaming.yaml
  ```

- YAML‑driven gateway via `config/streaming.yaml`:

  ```yaml
  streaming:
    symbols: ["AAPL"]
    providers:
      - name: "alpaca"
        websocket_url: "wss://stream.data.alpaca.markets/v2/iex"
        auth_method: "api_key"
        subscriptions: ["trades", "quotes"]
        buffer_size: 1000
        reconnect_attempts: 3
        health_check_interval: 30
  ```

  ```python
  from quanttradeai.streaming import StreamingGateway

  gw = StreamingGateway("config/streaming.yaml")
  gw.subscribe_to_trades(["AAPL"], lambda m: print("TRADE", m))
  # gw.start_streaming()  # blocking
  ```

- Provider adapters are discovered dynamically via `quanttradeai.streaming.providers.ProviderDiscovery`, validated with `ProviderConfigValidator`, and monitored through `ProviderHealthMonitor`. See docs/api/streaming.md for detailed provider configuration and health tooling.
- The live trading pipeline (`quanttradeai.streaming.live_trading.LiveTradingEngine`) combines the streaming gateway, feature generation, model inference, risk controls, and an optional health API. Use `--health-api true` to expose `/health` and `/metrics` while streaming.
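A callback passed to `subscribe_to_trades` receives each message as it arrives. A minimal sketch of a buffering callback (`TradeBuffer` is a hypothetical helper, and the message schema is assumed to include a `price` field):

```python
from collections import deque


class TradeBuffer:
    """Collects recent trade messages for downstream feature generation."""

    def __init__(self, maxlen: int = 1000):
        # Bounded buffer: oldest trades are dropped once maxlen is reached
        self.trades: deque = deque(maxlen=maxlen)

    def on_trade(self, message: dict) -> None:
        # Pass this method as the callback, e.g.
        # gw.subscribe_to_trades(["AAPL"], buf.on_trade)
        self.trades.append(message)

    def last_price(self):
        return self.trades[-1].get("price") if self.trades else None
```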
Enable advanced monitoring by adding a `streaming_health` section to your config and, optionally, starting the embedded REST server:
```yaml
streaming_health:
  monitoring:
    enabled: true
    check_interval: 5
  metrics:
    enabled: true
    host: "0.0.0.0"
    port: 9000
  thresholds:
    max_latency_ms: 100
    min_throughput_msg_per_sec: 50
    max_queue_depth: 5000
  alerts:
    enabled: true
    channels: ["log", "metrics"]
    escalation_threshold: 3
  api:
    enabled: true
    host: "0.0.0.0"
    port: 8000
```

Query live status while streaming:

```bash
curl http://localhost:8000/health    # readiness probe
curl http://localhost:8000/status    # detailed metrics + incidents
curl http://localhost:8000/metrics   # Prometheus scrape
```

Common patterns:
- Tune `escalation_threshold` to control alert promotion.
- Increase `max_queue_depth` in high-volume environments.
- Set `circuit_breaker_timeout` to avoid thrashing unstable providers.
- Run the standalone metrics exporter (default `0.0.0.0:9000`) when you want Prometheus scraping without enabling the FastAPI health server; if both are enabled on the same host/port, the health API continues to serve `/metrics` and the exporter stays disabled to avoid port collisions.
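The threshold checks behind these knobs amount to a simple comparison pass, sketched below (a hypothetical illustration mirroring the `thresholds` keys above; the real monitor also handles escalation and alert channels):

```python
def breaches(metrics: dict, thresholds: dict) -> list[str]:
    """Return the names of threshold breaches for one metrics sample.

    Hypothetical sketch mirroring the streaming_health thresholds section.
    """
    out = []
    if metrics["latency_ms"] > thresholds["max_latency_ms"]:
        out.append("latency")
    if metrics["throughput_msg_per_sec"] < thresholds["min_throughput_msg_per_sec"]:
        out.append("throughput")
    if metrics["queue_depth"] > thresholds["max_queue_depth"]:
        out.append("queue_depth")
    return out
```

An alert would then be promoted only after `escalation_threshold` consecutive non-empty results.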
```text
quanttradeai/   # Core package
├─ data/        # Data sources, loader, processor
├─ features/    # Technical & custom features
├─ models/      # MomentumClassifier & utilities
├─ backtest/    # Vectorized backtester + metrics
├─ trading/     # Risk & portfolio management
├─ streaming/   # WebSocket gateway
├─ utils/       # Config schemas, metrics, viz
config/         # YAML configs (model, features, backtest, streaming)
docs/           # Guides, API, examples
tests/          # Pytest suite mirroring package
```
```bash
poetry install --with dev
make format   # Black
make lint     # flake8
make test     # pytest
```

Contribution guide: CONTRIBUTING.md
- Streaming hardening: reconnection, health checks, metrics, provider adapters
- Backtesting realism: market impact knobs, borrow fees, intrabar fills, richer ledgers
- Risk & portfolio: position sizing improvements, exposure/turnover limits, drawdown guards
- Multi‑timeframe groundwork: intraday + daily pipelines with safe, time‑aware resampling
- Automated feature discovery/selection; regime detection and cross‑asset features
- Enhanced labeling and calibration; probability thresholds and evaluation curves
- LLM sentiment expansion: provider templates, caching, news/transcripts ingestion
- Parallel/distributed training and backtests across assets; memory efficiency
- Artifact management & experiment tracking; fully reproducible pipelines
- Low‑latency inference path for streaming signals
- Broker/exchange adapters, order routing, pre‑trade risk checks, fail‑safes
- Paper trading sandbox, live dashboards, incident logging & alerting
- Containerized jobs & scheduling, remote storage, model registry
- Service decomposition for streaming, inference, and backtesting
- GPU acceleration for training loops
- Reinforcement learning strategy research
- Multi‑modal data sources
MIT © Contributors — see LICENSE