AI-powered cryptocurrency futures trading bot with strict anti-lookahead measures.
# Install
pip install -r requirements.txt
# Test Binance data (no API key needed!)
python main.py test-data
# Quick demo
python main.py demo
# Train model
python main.py train --candles 10000 --symbol BTCUSDT
# Backtest
python main.py backtest
# Full pipeline
python main.py full# Setup environment (first time only)
conda create -n crypto_bot python=3.10 -y
conda activate crypto_bot
pip install -r requirements.txt
# Submit job
mkdir -p logs
sbatch --nodes=1 train.slurm # 1 node = 4 GPUs
sbatch --nodes=2 train.slurm # 2 nodes = 8 GPUs
sbatch --nodes=12 train.slurm # Max scale = 48 GPUs
# Monitor
tail -f logs/crypto_train_*.out
squeue -u $USER
# Cancel if needed
scancel <JOB_ID>Real Binance data - no API keys required!
- Symbols:
BTCUSDT,ETHUSDT,BNBUSDT, etc. - Intervals:
1m,5m,15m,1h,4h,1d, etc. - Fetches via public endpoints
- Auto-pagination for 10k+ candles
- Fallback to synthetic data if API fails
- Causal Masking: Position i can ONLY see positions ≤ i (no future leakage)
- Realistic Execution: Trades at NEXT candle open + slippage + fees
- Temporal Splitting: Train → Val → Test (no overlap in time)
- HPC Ready: Multi-node distributed training with auto-resume
Edit config.py:
# Data
symbol = "BTCUSDT"
interval = "1h"
lookback_window = 168 # 7 days
use_real_data = True
# Model
d_model = 128
num_encoder_layers = 4
causal = True # KEEP THIS TRUE!
# Trading
initial_capital = 10000.0
position_size = 0.1 # 10% per trade
stop_loss_pct = 0.02 # 2%
min_confidence = 0.6| Metric | Random | Trained |
|---|---|---|
| Accuracy | ~33% | 40-50% |
| Win Rate | ~50% | 50-60% |
| Sharpe | ~0 | 0.5-1.5 |
# Custom training config
export N_CANDLES=20000
export BATCH_SIZE=128
export EPOCHS=100
export SYMBOL=ETHUSDT
sbatch train.slurm├── config.py # All settings
├── data.py # Binance fetcher + preprocessing
├── model.py # Causal Transformer
├── train.py # Training (DDP + AMP)
├── backtest.py # Backtesting engine
├── main.py # Entry point
├── train.slurm # HPC job script
└── requirements.txt # Dependencies
- ✅ Real Binance data (no API key)
- ✅ Multi-node distributed training
- ✅ Mixed precision (2x speedup)
- ✅ Auto-resume after preemption
- ✅ Causal masking verified
- ✅ Realistic backtesting
"No data fetched"
# Use fake data as fallback
python main.py train --fake-data"CUDA out of memory"
# Reduce batch size in config.py
batch_size = 32 # instead of 64"Job keeps failing on HPC"
# Check logs
cat logs/crypto_train_*.err
# Verify environment
conda activate crypto_bot
python -c "import torch; print(torch.cuda.is_available())"- Not financial advice - This is educational/research code
- No real trading - Use paper trading to validate
- Past performance ≠ future results
- API keys - Never commit to git (use env vars)
- Risk management - Always use stop losses
- Test locally:
python main.py demo - Train small:
python main.py train --candles 2000 - Validate backtest results
- If good → scale up on HPC
- Paper trade before considering real money
Causal Masking: Prevents model from "seeing the future"
# Position 5 can only see [0,1,2,3,4,5]
# Position 5 CANNOT see [6,7,8,...]Temporal Split: Data ordered by time
Train: Jan-Jun → Val: Jul-Aug → Test: Sep-Oct
Realistic Execution:
Signal at candle[i] → Execute at candle[i+1].open
| Issue | Solution |
|---|---|
| Import errors | pip install -r requirements.txt |
| No CUDA | Add --device cpu to commands |
| NCCL errors on HPC | Check SLURM logs, nodes may be down |
| Low accuracy | Normal! 40-50% is good for trading |
| Overfitting | Reduce model size or add dropout |
For speed:
- Enable AMP:
use_amp = True - Increase batch size
- Use multiple nodes
For accuracy:
- More data:
--candles 20000 - Tune hyperparameters
- Try different symbols/intervals
- Ensemble models
MIT - Use at your own risk
Ready to start?
python main.py test-data # Verify setup
python main.py demo # Quick test
python main.py train # Full training