July Backtester

A professional-grade Python engine for stress-testing US equity strategies with Monte Carlo noise and Walk-Forward Analysis.

Feature	Description
📥 Multi-Source Data	Polygon, Norgate, Yahoo Finance, & native CSV support
⚙️ Strategy Engine	Modular plugin architecture with 35+ pre-built signals
📊 Advanced Risk Analytics	PDF tearsheets with SQN, R-Multiple, and Underwater plots

What This Tool Does

The backtester takes a strategy that you create (e.g., "buy when the 20-day SMA crosses above the 50-day SMA"), simulates it against historical price data for one or many stocks, and produces a performance report. You can run a single strategy or sweep many strategies simultaneously to find what works.

Two modes:

Single-Asset Mode — Tests all strategies against one or a small list of specific tickers (e.g., just AAPL or BITB). Good for deep-diving a specific stock. Set symbols_to_test in config.py and run python main.py.
Portfolio Mode — Tests strategies against an entire index or portfolio (e.g., every stock in the Nasdaq 100). Runs in parallel across all your CPU cores. This is the primary research tool. Set portfolios in config.py and run python main.py.

Both modes are accessed through the single entry point main.py. Portfolio mode is the default.

What you get out:

Total P&L %, Max Drawdown, Sharpe Ratio, Calmar Ratio, Win Rate, Profit Factor
Performance vs SPY Buy & Hold and QQQ Buy & Hold
Monte Carlo robustness score (1,000 simulations per strategy to test if results are due to luck)
Per-run output folder with logs, trade CSVs, and analyzer-ready files
Optional: detailed PDF/Markdown reports via report.py, S3 uploads

Prerequisites

Before starting, you will need:

Python 3.10 or higher — Download here
A Polygon.io account — Sign up here. A paid plan is required for full historical data (the free tier is limited to 2 years of daily data). The Stocks Starter plan (~$29/month) covers most use cases.

For Norgate users: If you have a Norgate Data subscription and the Norgate Data Updater installed locally, you can use Norgate as the data provider instead of Polygon. See Data Provider Settings below.

Installation

Step 1 — Clone the Repository

git clone <repository-url>
cd july-backtester

Step 2 — Create a Python Virtual Environment

A virtual environment keeps this project's dependencies isolated from your system Python. This is strongly recommended.

# Create the virtual environment
python -m venv venv

# Activate it — macOS / Linux
source venv/bin/activate

# Activate it — Windows (Command Prompt)
venv\Scripts\activate.bat

# Activate it — Windows (PowerShell)
venv\Scripts\Activate.ps1

You should see (venv) appear at the start of your terminal prompt. Every time you open a new terminal to use this tool, you need to activate the virtual environment again before running any commands.

Step 3 — Install Dependencies

pip install -r requirements.txt

This installs: pandas, numpy, tqdm, boto3 (S3 uploads only), requests, python-dotenv, pandas-ta, orjson, pyarrow, and yfinance (Yahoo Finance provider only).

API Key Setup

The backtester reads your Polygon.io API key directly from a .env file or environment variable — no AWS configuration required.

Get your Polygon.io API key from https://polygon.io/dashboard/api-keys
Copy .env.example to .env in the project root:
```
cp .env.example .env
```
Open .env and add your key:
```
POLYGON_API_KEY=your_key_here
```
That's it. The backtester reads POLYGON_API_KEY at runtime. No changes to config.py are needed for the default setup.

Your .env file is gitignored and will never be committed. If you prefer not to use a .env file, you can also set POLYGON_API_KEY as a standard system environment variable and the tool will pick it up automatically.

(Optional) Set Up an S3 Bucket for Reports

If you want reports automatically uploaded to S3 after each run:

Create an S3 bucket in the AWS Console (e.g., my-backtester-reports) and make note of its name.
Ensure your environment has AWS credentials configured (via ~/.aws/credentials or environment variables AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY).
Update config.py:

"upload_to_s3": True,
"s3_reports_bucket": "my-backtester-reports",

S3 uploads are entirely optional. If upload_to_s3 is False or s3_reports_bucket is empty, all output is saved locally and no AWS connection is attempted. boto3 is only used for this S3 upload step — it is not involved in API key management.

Configuration

All settings live in one file: config.py. Open it in any text editor before running. The file is organized into labeled sections with comments explaining each setting.

Quick Setup Checklist

Add POLYGON_API_KEY to your .env file (copy .env.example to get started)
Set upload_to_s3 and s3_reports_bucket if you want S3 uploads (optional)
Choose data_provider: "polygon", "norgate", "yahoo", or "csv"
Set start_date and initial_capital
For portfolio mode: uncomment the portfolios you want in the portfolios dict
For single-asset mode: set symbols_to_test

Data Provider Settings

Four data providers are supported. Set data_provider in config.py:

"data_provider": "polygon",   # Polygon.io — API key via .env (default)
# "data_provider": "norgate", # Norgate Data — requires local Norgate installation
# "data_provider": "yahoo",   # Yahoo Finance via yfinance (free, no API key needed)
# "data_provider": "csv",     # Local CSV files (see CSV Data Provider section below)

Polygon.io (default)

Requires a Polygon.io account and API key set in .env as POLYGON_API_KEY. A paid plan is needed for full historical data. See API Key Setup above.

Norgate Data

Requires a Norgate Data subscription and the Norgate Data Updater installed locally. No API key needed.

Yahoo Finance

Uses the yfinance library. No API key or account required. Provides free adjusted daily data for most US equities and ETFs.

"data_provider": "yahoo",

yfinance is already included in requirements.txt, so no additional setup is needed. Note that Yahoo Finance data quality and availability vary; it is best suited for exploratory backtests rather than production research.

Index ticker translation. Polygon and Norgate use an I: prefix for index symbols (e.g. I:VIX, $I:VIX). Yahoo Finance uses ^ (e.g. ^VIX). The service translates these automatically — you do not need to change anything in config.py or your portfolio lists. The mapping for the most common indices is:

Norgate/Polygon symbol	Yahoo Finance symbol	Description
`I:VIX` / `$I:VIX`	`^VIX`	CBOE Volatility Index
`I:TNX` / `$I:TNX`	`^TNX`	10-Year Treasury Yield
`I:TYX` / `$I:TYX`	`^TYX`	30-Year Treasury Yield
`I:IRX` / `$I:IRX`	`^IRX`	13-Week Treasury Bill
`I:SPX` / `$I:SPX`	`^GSPC`	S&P 500 Index
`I:NDX` / `$I:NDX`	`^NDX`	Nasdaq 100 Index
`I:DJI` / `$I:DJI`	`^DJI`	Dow Jones Industrial Average
`I:RUT` / `$I:RUT`	`^RUT`	Russell 2000

For any unmapped I:XYZ symbol the service falls back to ^XYZ automatically.
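
The translation rule above can be sketched in a few lines, which may help when predicting which Yahoo symbol a given portfolio entry resolves to. The names `YAHOO_INDEX_MAP` and `to_yahoo_symbol` are hypothetical and are not taken from the project's source; only the mapping values and the `^XYZ` fallback come from this section.

```python
# Mapping copied from the table above. Names are illustrative only.
YAHOO_INDEX_MAP = {
    "I:VIX": "^VIX",
    "I:TNX": "^TNX",
    "I:TYX": "^TYX",
    "I:IRX": "^IRX",
    "I:SPX": "^GSPC",
    "I:NDX": "^NDX",
    "I:DJI": "^DJI",
    "I:RUT": "^RUT",
}


def to_yahoo_symbol(symbol: str) -> str:
    """Translate an I: or $I: index symbol to Yahoo's ^ form."""
    core = symbol[1:] if symbol.startswith("$") else symbol
    if not core.startswith("I:"):
        return symbol  # ordinary equity or ETF ticker, left unchanged
    # Known indices use the table; anything else falls back to ^XYZ.
    return YAHOO_INDEX_MAP.get(core, "^" + core[2:])


print(to_yahoo_symbol("I:SPX"))   # ^GSPC
print(to_yahoo_symbol("$I:VIX"))  # ^VIX
print(to_yahoo_symbol("I:XYZ"))   # ^XYZ
```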

CSV Data Provider

Reads historical OHLCV data from local CSV files. Useful for custom feeds, proprietary data, or offline use.

"data_provider": "csv",
"csv_data_dir": "csv_data",   # folder containing CSV files (relative to project root)

File naming: one file per symbol, named {SYMBOL}.csv (case-insensitive). Example: csv_data/AAPL.csv or csv_data/aapl.csv.

Symbols with special characters: Windows does not allow colons or other special characters in filenames. Symbols that contain illegal characters (e.g. I:VIX, $I:TNX) have those characters replaced with underscores when constructing the filename. The mapping rule is: replace each of \ / : * ? " < > | with _. Examples:

Symbol passed to the backtester	Expected CSV filename
`I:VIX`	`I_VIX.csv`
`I:TNX`	`I_TNX.csv`
`$I:VIX`	`$I_VIX.csv`
`AAPL`	`AAPL.csv` (unchanged)
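
To check what filename a symbol maps to before creating your CSVs, the replacement rule above can be reproduced as a short sketch. The function name `csv_filename_for` is hypothetical; the character set is the one listed in this section.

```python
import re

# The nine characters listed above: \ / : * ? " < > |
_ILLEGAL_CHARS = re.compile(r'[\\/:*?"<>|]')


def csv_filename_for(symbol: str) -> str:
    """Replace each illegal filename character with an underscore."""
    return _ILLEGAL_CHARS.sub("_", symbol) + ".csv"


for sym in ("I:VIX", "I:TNX", "$I:VIX", "AAPL"):
    print(sym, "->", csv_filename_for(sym))
```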

Required CSV schema (column names are case-insensitive):

Column	Aliases accepted	Notes
`Date`	`date`, `datetime`, `timestamp`, `time`	Any pandas-parseable date or datetime string
`Open`	`open`	Numeric
`High`	`high`	Numeric
`Low`	`low`	Numeric
`Close`	`close`, `close/last`, `adj close`, `adjusted close`	Numeric. `Adj Close` and `Close/Last` (Nasdaq format) are silently treated as `Close`.
`Volume`	`volume`	Numeric

Extra columns (e.g. VWAP, Turnover) are silently ignored. The date column may be a named column or the CSV index. Multiple date formats are supported (ISO YYYY-MM-DD, US MM/DD/YYYY, datetime strings with time components, etc.).

Nasdaq-format CSVs are natively supported. Files downloaded directly from nasdaq.com use Close/Last as the price column header and prefix all price values with $ (e.g. $264.72). Both are handled automatically: Close/Last is remapped to Close, and $ signs and , thousands separators are stripped before numeric conversion. No pre-processing of the file is required.
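
The header aliasing and `$` / `,` stripping described above can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the provider's actual loader: the names `ALIASES`, `to_number`, and `read_ohlcv` are hypothetical, the sample row is made up, and date parsing is left out (dates stay as strings here).

```python
import csv
import io

# Lower-cased header -> canonical column, following the schema table above.
ALIASES = {
    "date": "Date", "datetime": "Date", "timestamp": "Date", "time": "Date",
    "open": "Open", "high": "High", "low": "Low",
    "close": "Close", "close/last": "Close",
    "adj close": "Close", "adjusted close": "Close",
    "volume": "Volume",
}


def to_number(text: str) -> float:
    """Strip $ signs and thousands separators, then convert to float."""
    return float(text.replace("$", "").replace(",", "").strip())


def read_ohlcv(handle):
    """Return a list of dicts with canonical column names."""
    rows = []
    for raw in csv.DictReader(handle):
        row = {}
        for header, value in raw.items():
            name = ALIASES.get((header or "").strip().lower())
            if name is None or value is None:
                continue  # extra columns (VWAP, Turnover, ...) are ignored
            row[name] = value.strip() if name == "Date" else to_number(value)
        rows.append(row)
    return rows


# Made-up sample in the nasdaq.com layout.
sample = io.StringIO(
    "Date,Close/Last,Volume,Open,High,Low\n"
    '03/06/2026,$264.72,"1,234,567",$260.10,$265.00,$259.80\n'
)
bars = read_ohlcv(sample)
print(bars[0]["Close"], bars[0]["Volume"])
```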

Mandatory benchmark files: The backtester fetches four symbols at startup — before any portfolio simulation begins — to calculate SPY/QQQ buy-and-hold baselines, VIX regime filters, and TNX data used by certain strategies. These files must be present in your csv_data_dir regardless of which portfolio or single ticker you are testing. Missing any one of them causes an immediate fatal crash at startup.

Required file	Symbol	Purpose
`SPY.csv`	`SPY`	SPY buy-and-hold benchmark + regime reference
`QQQ.csv`	`QQQ`	QQQ buy-and-hold benchmark
`I_VIX.csv`	`I:VIX`	VIX regime filter (strategies that use `vix` dependency)
`I_TNX.csv`	`I:TNX`	10-Year Treasury Yield (strategies that use `tnx` dependency)

Minimum data requirements (crucial for CSVs): When supplying your own CSV files there are two hard limits that will cause a symbol to be silently skipped if not met.

250-bar minimum. The backtester has a built-in safety check that automatically skips any symbol whose CSV contains fewer than 250 bars (roughly one calendar year of daily data). No error is raised — the symbol simply produces no results. If you notice a symbol missing from your output, a CSV that is too short is the most likely cause.
Indicator warm-up. Even if a CSV passes the 250-bar check, long-lookback strategies need additional bars just to calculate their first signal. The default 50d/200d SMA Crossover strategy, for example, cannot fire a single trade until at least 200 daily bars have accumulated. A CSV that only just clears the 250-bar minimum will pass the check but leaves roughly 50 bars in which a crossover can occur, so it will usually produce few or zero trades.

Recommendation: When downloading historical data from Nasdaq, Yahoo Finance, or any other source, always request at least 3–5 years of daily bars. This gives every default strategy enough runway to warm up its indicators and execute a meaningful number of simulated trades.

Backtest Period

"start_date": "2004-01-01",                            # How far back to test
"end_date": datetime.now().strftime('%Y-%m-%d'),        # Defaults to today

Setting start_date to a date earlier than the provider's available data is fine — the tool will use whatever the earliest available bar is.

Capital and Position Sizing

"initial_capital": 100000.0,   # Simulated account size in dollars
"allocation_per_trade": 0.10,  # 10% of equity per new position (allows up to 10 concurrent)

What to Test

Single-Asset Mode:

"symbols_to_test": ['AAPL'],                      # One ticker
"symbols_to_test": ['AAPL', 'TSLA', 'NVDA'],      # Several tickers

Portfolio Mode — edit the portfolios dictionary in config.py. Comment out entries you do not want to run:

"portfolios": {
    "Nasdaq 100": "nasdaq_100.json",                      # Pre-built index list (~100 symbols)
    # "Nasdaq": "nasdaq.json",                            # Full Nasdaq (~3,000 symbols — slow)
    # "SP 500": "sp-500.json",                            # S&P 500
    "My Watchlist": ["AAPL", "MSFT", "GOOGL", "AMZN"],   # A manual list
},

Start small. Running the full Nasdaq can take 30–60 minutes on the first run (data fetching). Use "Nasdaq 100" to validate your setup first.

Norgate Watchlists: If you use Norgate as your data provider, you can reference watchlists by name directly without creating a JSON file:

"portfolios": {
    "Nasdaq Biotechnology": "norgate:Nasdaq Biotechnology",
},

Stop Loss Configuration

"stop_loss_configs": [
    {"type": "none"},                                   # No stop — hold until signal reverses
    # {"type": "percentage", "value": 0.05},            # 5% fixed stop below entry
    # {"type": "atr", "period": 14, "multiplier": 3.0}, # 3x ATR trailing stop
],

Include multiple entries to test each strategy with each stop type in the same run. Be aware that each additional stop type multiplies the total number of simulations.
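
As a rough worked example of that multiplication: the run summary printed at startup counts tasks as symbols x strategies x stop configs. The helper name `total_tasks` below is hypothetical; the inputs are the Nasdaq 100 figures used in the run summary example in this document.

```python
def total_tasks(n_symbols: int, n_strategies: int, n_stop_configs: int) -> int:
    """Symbols x strategies x stop configs, as shown in the run summary."""
    return n_symbols * n_strategies * n_stop_configs


# One stop config versus all three stop types enabled.
print(total_tasks(101, 22, 1))  # 2222
print(total_tasks(101, 22, 3))  # 6666
```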

Slippage and Commission

"slippage_pct": 0.0005,          # 0.05% flat slippage per fill (5 basis points)
"commission_per_share": 0.002,   # $0.002 per share commission
"max_pct_adv": 0.05,             # cap position at 5% of 20-day average daily volume
"volume_impact_coeff": 0.0,      # market impact coefficient (0.0 = disabled)

There are three independent cost controls:

Control	Key	Models
Flat slippage	`slippage_pct`	Bid/ask spread cost on every fill, regardless of size
Position size cap	`max_pct_adv`	Prevents unrealistically large orders by capping shares at X% of ADV — does not change cost
Market impact	`volume_impact_coeff`	Square-root price impact: `coeff × sqrt(shares / adv_20)`. Larger orders relative to ADV incur more slippage. `0.0` = disabled (default). `0.1` = mild (institutional). `0.5` = aggressive (small-cap).

The market impact formula — a square-root model widely used in academic market microstructure — recognises that consuming 5% of ADV moves the price more than consuming 0.1% of ADV. Example at coeff=0.1: 1% ADV order → +1 bp impact; 5% ADV order → +2.2 bp impact.
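
The two order-level controls can be sketched as follows. The function names are hypothetical, and the unit of the raw impact value is an inference: `0.1 × sqrt(0.01)` evaluates to 0.01, so the 1 bp and 2.2 bp figures above correspond to reading that value as a percentage. Treat that reading as an assumption rather than a confirmed detail of the engine.

```python
import math


def cap_shares(desired_shares: int, adv_20: float, max_pct_adv: float) -> int:
    """Cap an order at max_pct_adv of 20-day average daily volume."""
    return min(desired_shares, math.floor(adv_20 * max_pct_adv))


def impact_raw(shares: float, adv_20: float, coeff: float) -> float:
    """Square-root impact: coeff * sqrt(shares / adv_20). 0.0 disables it."""
    if coeff == 0.0 or adv_20 <= 0:
        return 0.0
    return coeff * math.sqrt(shares / adv_20)


adv = 1_000_000
small = impact_raw(10_000, adv, 0.1)   # 1% of ADV
large = impact_raw(50_000, adv, 0.1)   # 5% of ADV
print(round(small, 5), round(large, 5), round(large / small, 3))
print(cap_shares(80_000, adv, 0.05))   # capped at 5% of ADV
```

The ratio between the two impacts is sqrt(5), roughly 2.24, which matches the 1 bp versus 2.2 bp example regardless of how the raw value is scaled.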

Output Filters

These control what appears in the printed summary table and which trade logs get saved. Setting any to -9999 effectively disables that filter (shows everything).

"mc_score_min_to_show_in_summary": 3,       # Only show strategies with MC score >= 3
"min_pandl_to_show_in_summary": 5.0,        # Only show strategies with P&L >= 5%
"max_acceptable_drawdown": 0.30,            # Only show strategies with max DD <= 30%
"min_performance_vs_spy": 0.0,              # Only show strategies that beat SPY buy-and-hold
"save_only_filtered_trades": False,         # If True, only saves trades for filtered strategies

Running the Backtester

Make sure your virtual environment is activated (source venv/bin/activate or the Windows equivalent) before running.

First time? Run the setup wizard before anything else:
python main.py --init
The wizard walks you through provider selection, API key setup, capital/dates, and symbol choice, then writes a ready-to-use config_starter.py. Rename it to config.py and you're ready to run.

CLI Flags

Flag	Description
(none)	Full backtest run
`--init`	Launch the first-time setup wizard
`--dry-run`	Validate config and print run summary without fetching data
`--name <label>`	Prefix the output folder with a custom label

Portfolio Mode (Primary Use)

Tests all active strategies in custom_strategies/ against all uncommented portfolios in config.py. Uses all CPU cores.

python main.py

With an optional run name (added as a prefix to the output folder):

python main.py --name "nasdaq-sma-sweep"

Dry Run — Validate Without Fetching Data

Runs all startup checks (API key, config validation) and prints the run summary without fetching any market data or running simulations. Use this to confirm your configuration — portfolio sizes, strategy count, and total task estimate — before committing to a long run.

python main.py --dry-run

# Combine with --name to preview the run ID that will be used
python main.py --dry-run --name "my-next-run"

Single-Asset Mode

Tests strategies against the symbols listed in symbols_to_test in config.py. Update that list first, then run the same entry point:

python main.py

To use single-asset mode, set symbols_to_test in config.py and make sure the portfolios dict only contains the symbols you want (or wrap them as a portfolio entry like "My Tickers": ["AAPL", "TSLA"]).

First Run Tips

Validate with a small list first. Set "portfolios": {"Test": ["SPY", "QQQ"]} and confirm the run completes without errors before testing larger lists.
Watch for API key errors. If you see Could not find 'POLYGON_API_KEY' in the terminal, your .env file is missing or the variable name doesn't match.
The first run is the slowest. Data is fetched from Polygon and cached locally. Subsequent runs within 24 hours load from disk and are much faster.

Understanding the Output

Run Output Folder

Every backtest creates a timestamped folder under output/runs/:

output/
└── runs/
    └── <run_id>/                        # e.g. 2026-03-02_15-12-32 or myname_2026-03-02_15-12-32
        ├── logs/                        # Execution log: run_<timestamp>.log
        ├── raw_trades/                  # Per-portfolio raw trade CSVs (when save_individual_trades=True)
        │   └── <Portfolio_Name>/
        ├── analyzer_csvs/               # Renamed + column-mapped CSVs ready for report.py
        │   └── <Portfolio_Name>/
        ├── detailed_reports/            # PDFs / Markdown generated by report.py
        ├── config_snapshot.json         # Copy of config.py settings used for this run
        ├── overall_portfolio_summary.csv
        └── ml_features.parquet          # ML-ready trade export (when export_ml_features=True)

The entire output/ directory is gitignored. Each run is isolated in its own folder — previous runs are never overwritten. S3 uploads (if configured) mirror this same <run_id>/ structure as the key prefix.

Run Summary Box

At startup, the backtester prints a summary box to the log / terminal before fetching any data:

============================================================
  RUN SUMMARY
============================================================
  Run ID            : 2026-03-10_14-22-01
  Data provider     : yahoo
  Period Selected   : 2004-01-01 -> 2026-03-10
  Timeframe         : D x 1
  Strategies        : 22
  Stop configs      : 1
------------------------------------------------------------
  Portfolio         : Nasdaq 100 (101 symbols)
------------------------------------------------------------
  Total symbols     : 101
  Total tasks       : 2222  (symbols x strategies x stop configs)
============================================================

After benchmark data (SPY) has been fetched, a second line is logged:

  Actual Data Period : 2004-01-02 -> 2026-03-07  (via SPY)

Period Selected is exactly what you configured in config.py. Actual Data Period is the real date range returned by the data provider for SPY — this is the ground truth for how far back your strategy results are calculated. The two values differ whenever:

The data provider does not have data going back to your requested start_date (e.g. free-tier API limits, or a ticker that was listed later)
The end_date falls on a weekend or holiday, so the last available bar is a trading day before it

Terminal Summary Table

After each portfolio finishes, a results table is printed to the terminal:

Strategy                      P&L (%)  vs. SPY  Max DD  Calmar  Sharpe  Win Rate  Trades  MC Verdict   MC Score
SMA Crossover (20d/50d)        +89.4%   +21.3%   41.5%    1.12    0.71     48.9%     287   Mod. Tail Risk     2
SMA Crossover (50d/200d)       +74.1%   +12.1%   38.2%    0.98    0.65     46.1%     214   Good               3

Column definitions:

Column	Meaning
P&L (%)	Total return over the full backtest period
vs. SPY / vs. QQQ	Outperformance vs buy-and-hold of those indices
Max DD	Largest peak-to-trough decline during the period
Max Rcvry (d)	Longest calendar-day gap from any drawdown trough back to the prior equity peak. `N/A` if the curve ends in an open drawdown.
Avg Rcvry (d)	Mean calendar days across all completed recoveries. `N/A` if the curve ends in an open drawdown.
Calmar	Annualized return divided by max drawdown (higher = better risk-adjusted return)
Sharpe	Risk-adjusted return relative to volatility (above 1.0 is generally considered good; above 2.0 is strong)
Roll.Sharpe(avg)	Mean of all 126-day rolling Sharpe windows — regime-averaged quality
Roll.Sharpe(min)	Worst 126-day rolling Sharpe — reveals if there was a prolonged losing streak even when overall Sharpe looks healthy
Roll.Sharpe(last)	Most recent 126-day rolling Sharpe — current momentum signal
Win Rate	Percentage of trades that were profitable
Trades	Total number of completed trades
Expectancy (R)	Average R-Multiple per trade — how many R the strategy earns per unit risked (see below)
SQN	System Quality Number — statistical confidence in the edge (see below)
WFA Verdict	Single-split Walk-Forward Analysis pass/fail verdict
Rolling WFA	Rolling k-fold WFA verdict — `Pass (K/N)`, `Fail (K/N)`, or `N/A`. Only present when `wfa_folds` is set.
MC Verdict	Robustness classification from Monte Carlo analysis
MC Score	Numeric robustness score (see below)
VolumeImpact_bps	Total market impact cost in basis points (entry + exit). Only present in trade CSVs when `volume_impact_coeff > 0`.

Core Metrics Glossary

A quick-reference for the key derived metrics produced by the engine. Detailed explanations follow in the sections below.

Metric	Definition	Why it's useful
Expectancy (R)	Average R-Multiple per trade.	Answers: "On average, how many units of risk do I earn per trade?"
SQN	System Quality Number — `(Expectancy / StdDev(R)) × √N`.	A score from Van Tharp measuring system quality; 2.5+ is Good, 3.0+ is Excellent.
Annual Turnover	`(Σ(entry_price × shares) / initial_capital) / years × 100`.	Measures how many times the full portfolio is recycled per year.
After-Tax CAGR	CAGR calculated after a flat 30% tax haircut on net profits.	Provides a realistic "take-home" performance comparison against gross benchmarks.

Additional Metrics in the PDF Report

The Overall Performance Metrics page of the PDF tearsheet includes two additional derived metrics not shown in the terminal table:

Annual Turnover % — (Σ(entry_price × shares) / initial_capital) / years × 100. Measures how many times the full portfolio is recycled per year. A turnover of 200% means the equivalent of the entire account was deployed twice over. Requires Price and Shares columns in the trade data; shows N/A otherwise.
Estimated After-Tax CAGR (30% tax) — applies a flat 30% short-term capital gains rate to any net profit before computing CAGR. Formula: after_tax_equity = initial_capital + max(net_profit, 0) × 0.70 + min(net_profit, 0). Losses are carried through unchanged (no tax benefit assumed). Placed directly below the standard CAGR line for easy comparison.
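
Both formulas above are simple enough to reproduce by hand. The sketch below follows them term by term; the function names are hypothetical, and the CAGR step uses the standard `(final / initial) ** (1 / years) - 1` definition, which this document does not spell out.

```python
def annual_turnover_pct(trades, initial_capital: float, years: float) -> float:
    """(sum(entry_price * shares) / initial_capital) / years * 100."""
    deployed = sum(price * shares for price, shares in trades)
    return deployed / initial_capital / years * 100


def after_tax_cagr(initial_capital: float, net_profit: float,
                   years: float, tax_rate: float = 0.30) -> float:
    """Tax only net gains; losses pass through with no tax benefit."""
    after_tax_equity = (
        initial_capital
        + max(net_profit, 0) * (1 - tax_rate)
        + min(net_profit, 0)
    )
    return (after_tax_equity / initial_capital) ** (1 / years) - 1


# Made-up trades: (entry_price, shares). $400k deployed over 2 years on $100k.
trades = [(100.0, 1000), (50.0, 2000), (200.0, 1000)]
print(annual_turnover_pct(trades, 100_000, 2))      # 200.0
print(round(after_tax_cagr(100_000, 100_000, 10), 4))
```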

Monte Carlo Score Explained

Every strategy with 50+ trades is stress-tested with 1,000 simulations that randomly reshuffle the historical trade sequence. This reveals whether results depend on lucky ordering or are genuinely robust.

Sampling methods (controlled by mc_sampling in config):

Method	Config value	Description
i.i.d. (default)	`"iid"`	Each trade is resampled independently. Fast and statistically standard. Assumes no autocorrelation between trades.
Block-bootstrap	`"block"`	Consecutive blocks of trades are sampled as a unit, preserving win/loss streaks and regime clustering. Recommended when the Regime Heatmap shows the strategy only loses in one VIX bucket. Auto block size = `floor(sqrt(N))` (Politis-Romano rule of thumb).
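
The difference between the two sampling methods can be sketched as follows. This is an illustration, not the engine's implementation: the function names are hypothetical, the trade P&Ls are made up, and it assumes resampling with replacement and randomly placed (possibly overlapping) blocks.

```python
import math
import random


def resample_iid(trade_pnls, rng):
    """Draw len(trade_pnls) trades independently, with replacement."""
    return [rng.choice(trade_pnls) for _ in trade_pnls]


def resample_block(trade_pnls, rng, block_size=None):
    """Stitch together randomly chosen runs of consecutive trades."""
    n = len(trade_pnls)
    size = block_size or max(1, math.isqrt(n))  # floor(sqrt(N)) by default
    sample = []
    while len(sample) < n:
        start = rng.randrange(n - size + 1)
        sample.extend(trade_pnls[start:start + size])
    return sample[:n]  # trim the final block so lengths match


rng = random.Random(42)  # seeded so the sketch is repeatable
history = [120.0, -40.0, 75.0, -60.0, 30.0, 90.0, -20.0, 15.0, -35.0]
iid_path = resample_iid(history, rng)
block_path = resample_block(history, rng)  # block size 3 for 9 trades
print(iid_path)
print(block_path)
```

In the block version, any losing streak in the original sequence stays intact inside its block, which is why it suits strategies whose losses cluster in one regime.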

Score	Verdict	What It Means
5	Robust	Consistent across simulations. Results are likely genuine.
3–4	Good	Solid with minor concerns. Worth investigating further.
1–2	Moderate	Some robustness concerns. Proceed with caution.
≤ 0	Weak / Perf. Outlier	Results may be overfitted or luck-dependent.

Warning flags in MC Verdict:

Perf. Outlier — The historical return was worse than 95% of simulations, meaning the actual results are below what random sampling would expect. Investigate why.
DD Understated — Historical drawdowns were better than median simulations. The backtest period may have been unusually favorable.
Moderate Tail Risk — Worst-case simulations show 50–80% drawdown potential.
High Tail Risk — Worst-case simulations show >80% drawdown potential. High risk of ruin.

Walk-Forward Analysis (WFA)

Important

Why WFA? A strategy optimised on the same data it's being measured on is like studying the answer key before a test — it will look great, but fail in the real world. WFA holds back the most recent slice of data during strategy development, then checks if the edge still holds on that unseen period. A strategy that passes both IS and OOS is far more likely to be genuinely robust.

Every strategy result includes two WFA columns alongside the Monte Carlo output:

Column	Meaning
OOS P&L (%)	Total P&L earned in the Out-of-Sample window as a percentage of initial capital
WFA Verdict	Pass / Likely Overfitted / N/A

How the split works: The backtester uses the actual data period (as reported by SPY) — not the configured start_date — to compute the IS/OOS boundary. With the default wfa_split_ratio: 0.80, the first 80% of that period is In-Sample (IS) and the final 20% is Out-of-Sample (OOS). A strategy tested over 20 years of data would have 16 years of IS history and 4 years of OOS history.

Verdict logic:

Pass — OOS performance does not show signs of overfitting.
Likely Overfitted — Either the IS period is profitable but the OOS period is a net loss (sign flip), or the OOS annualised return has degraded by more than 75% relative to the IS annualised return.
N/A — WFA is disabled (wfa_split_ratio is None or 0), or the OOS window contains fewer than 5 completed trades (insufficient data for a meaningful verdict).

Disabling WFA: Set "wfa_split_ratio": None (or 0) in config.py. Both OOS P&L (%) and WFA Verdict will show N/A for all strategies.
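
The split and verdict rules above can be written out as a short sketch. Function and argument names are hypothetical, and the degradation test (`OOS < 25% of IS` when IS is positive) is one reading of "degraded by more than 75%"; the engine's exact comparison may differ.

```python
from datetime import date, timedelta


def wfa_split_date(actual_start: date, actual_end: date,
                   split_ratio: float = 0.80) -> date:
    """IS/OOS boundary, measured on the actual data period."""
    span_days = (actual_end - actual_start).days
    return actual_start + timedelta(days=round(span_days * split_ratio))


def wfa_verdict(is_pnl, oos_pnl, is_ann_return, oos_ann_return,
                oos_trades, split_ratio=0.80):
    """Return 'Pass', 'Likely Overfitted', or 'N/A'."""
    if not split_ratio or oos_trades < 5:
        return "N/A"  # WFA disabled, or too few OOS trades
    if is_pnl > 0 and oos_pnl < 0:
        return "Likely Overfitted"  # sign flip
    if is_ann_return > 0 and oos_ann_return < 0.25 * is_ann_return:
        return "Likely Overfitted"  # more than 75% degradation
    return "Pass"


start = date(2000, 1, 1)
end = start + timedelta(days=1000)
print(wfa_split_date(start, end))                 # 800 days after start
print(wfa_verdict(50.0, 8.0, 0.20, 0.10, 12))     # Pass
print(wfa_verdict(50.0, 2.0, 0.20, 0.04, 12))     # Likely Overfitted
```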

Rolling Multi-Fold WFA (opt-in)

For a more rigorous overfitting check, enable rolling k-fold WFA by setting wfa_folds to an integer ≥ 2 in config.py. This is independent of wfa_split_ratio — both can be active simultaneously.

"wfa_folds": 5,           # divide the period into 5 equal OOS windows
"wfa_min_fold_trades": 5, # skip folds with fewer than 5 OOS trades

How it works: The full period is split into k equal-width windows. For fold i, the IS window is everything before that fold's start date and the OOS window is the fold itself. Each fold is scored independently using the same Pass / Likely Overfitted logic as the single-split WFA. A fold with fewer than wfa_min_fold_trades OOS trades is skipped (not counted).

Rolling WFA column verdict:

Verdict	Meaning
`Pass (K/N)`	≥ 60% of scorable folds pass individually (K = passing folds, N = total scorable folds)
`Fail (K/N)`	< 60% of scorable folds pass
`N/A`	Fewer than 2 folds had enough trades to score, or `wfa_folds` is not set
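
A minimal sketch of the fold layout and the aggregate verdict, assuming per-fold verdicts have already been computed and that skipped folds are represented as `None`. The function names and that representation are hypothetical.

```python
from datetime import date, timedelta


def fold_windows(start: date, end: date, k: int):
    """Split [start, end] into k equal-width (oos_start, oos_end) windows."""
    total_days = (end - start).days
    edges = [start + timedelta(days=total_days * i // k) for i in range(k + 1)]
    return list(zip(edges[:-1], edges[1:]))


def rolling_wfa_verdict(fold_verdicts, pass_share: float = 0.60) -> str:
    """Aggregate per-fold verdicts into Pass (K/N), Fail (K/N), or N/A."""
    scorable = [v for v in fold_verdicts if v is not None]
    n = len(scorable)
    if n < 2:
        return "N/A"  # fewer than 2 folds had enough trades
    k = sum(1 for v in scorable if v == "Pass")
    label = "Pass" if k / n >= pass_share else "Fail"
    return f"{label} ({k}/{n})"


start = date(2000, 1, 1)
windows = fold_windows(start, start + timedelta(days=1000), 5)
print(windows[0])
print(rolling_wfa_verdict(["Pass", "Pass", "Pass", "Likely Overfitted", None]))
```

For each window, everything before its start date would serve as the In-Sample period, as described above.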

R-Multiple, Expectancy, and SQN

Tip

Why SQN? While P&L tells you how much you made, SQN tells you how much you can trust your strategy's consistency. It penalises volatile strategies and rewards consistent ones: a system earning 1R every trade scores higher than one that randomly earns 10R then loses 8R. Expectancy answers the simpler question: "On average, how many units of risk do I earn per trade?" A strategy with a 40% win rate, an average winner of 3R, and an average loser of −1R has an expectancy of 0.4 × 3 − 0.6 × 1 = +0.6R. A 60% winner that earns only 0.5R per win has an expectancy of 0.6 × 0.5 − 0.4 × 1 = −0.1R (assuming the same −1R average loser), so the strategy with the lower win rate is far superior.

How R-Multiple is calculated:

Initial Risk (per share) — captured at trade entry: entry_price − initial_stop_loss_price
- If no stop loss is configured, a 1% proxy is used: entry_price × 0.01
- The initial stop is frozen at entry; trailing-stop updates do not affect it.
R-Multiple — calculated at trade close: net_pnl / (initial_risk_per_share × shares)
- A trade that earns exactly 1× the amount risked = 1R
- A trade that loses the full stop = −1R
- Both InitialRisk and RMultiple are written to every row of the trade CSV.

Expectancy (Avg. R per trade):

Expectancy = mean(all R-Multiples)

This answers: "On average, how many R do I earn per trade?" Positive is good. A value above 0.5R is generally considered a solid edge.

SQN (System Quality Number):

SQN = (Expectancy / StdDev(R-Multiples)) × √N

Developed by Van Tharp. It normalises expectancy by the consistency of the R distribution and scales with sample size. Rule of thumb:

SQN	Quality
< 1.6	Poor — not tradeable
1.6 – 1.9	Below average
2.0 – 2.4	Average
2.5 – 2.9	Good
3.0 – 5.0	Excellent
> 5.0	Holy Grail (verify for overfitting)

Both Expectancy (R) and SQN show N/A for strategies with fewer than 2 completed trades.
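
The three calculations above can be reproduced from a list of trades with the standard library. This is a sketch, not the engine's code: the function names are hypothetical, and it assumes the sample standard deviation (`statistics.stdev`), since the document does not say whether sample or population deviation is used.

```python
import math
import statistics


def initial_risk_per_share(entry_price, stop_price=None):
    """entry - stop, or a 1% proxy when no stop loss is configured."""
    if stop_price is None:
        return entry_price * 0.01
    return entry_price - stop_price


def r_multiple(net_pnl, entry_price, shares, stop_price=None):
    """net_pnl / (initial_risk_per_share * shares)."""
    return net_pnl / (initial_risk_per_share(entry_price, stop_price) * shares)


def expectancy_and_sqn(r_multiples):
    """Return (expectancy, sqn); None stands in for N/A."""
    n = len(r_multiples)
    if n < 2:
        return None, None
    expectancy = statistics.fmean(r_multiples)
    spread = statistics.stdev(r_multiples)  # assumption: sample std dev
    if spread == 0:
        return expectancy, None  # SQN undefined when every R is identical
    return expectancy, expectancy / spread * math.sqrt(n)


print(r_multiple(500.0, entry_price=100.0, shares=100, stop_price=95.0))  # 1.0
print(r_multiple(-100.0, entry_price=100.0, shares=100))                  # -1.0
expectancy, sqn = expectancy_and_sqn([3.0, -1.0, -1.0, 3.0, -1.0])
print(round(expectancy, 2), round(sqn, 2))
```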

PDF report: when a strategy CSV contains an RMultiple column, the detailed report includes a purple histogram of the full R distribution with a red breakeven line (0R) and a green expectancy line annotated with Expectancy, SQN, and trade count.

Price Noise Injection (Stress Testing)

Important

Why stress test? A strategy that only works on perfectly clean historical prices is brittle. Real-world data contains bid/ask spread noise, stale quotes, and data-vendor rounding errors. Injecting a small amount of random noise before running the backtest reveals whether your edge survives minor perturbations — a robust strategy should still pass WFA and Monte Carlo checks even with ±1–2% noise applied per bar.

Enable noise injection in config.py:

"noise_injection_pct": 0.01,  # ±1% uniform noise per OHLC cell per bar

How it works:

For each bar and each of Open, High, Low, and Close, an independent multiplier drawn from Uniform(1 - noise_pct, 1 + noise_pct) is applied. After perturbation, High and Low are reconstructed as the row-wise maximum and minimum across all four price columns — this guarantees that no candlestick becomes invalid (High < Low or price order violations). Volume and the date index are never touched.
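
The perturbation and High/Low reconstruction described above can be sketched per bar as follows. The function name `inject_noise` and the dict-based bar layout are hypothetical; the engine presumably works on whole DataFrames rather than one row at a time.

```python
import random

PRICE_COLS = ("Open", "High", "Low", "Close")


def inject_noise(bar: dict, noise_pct: float, rng: random.Random) -> dict:
    """Apply an independent Uniform(1 - p, 1 + p) multiplier to each price."""
    noisy = dict(bar)  # Volume and any other fields are copied untouched
    for col in PRICE_COLS:
        noisy[col] = bar[col] * rng.uniform(1 - noise_pct, 1 + noise_pct)
    # Rebuild High and Low so the candlestick stays valid.
    prices = [noisy[col] for col in PRICE_COLS]
    noisy["High"] = max(prices)
    noisy["Low"] = min(prices)
    return noisy


rng = random.Random(7)  # seeded so the sketch is repeatable
bar = {"Open": 100.0, "High": 100.4, "Low": 99.8, "Close": 100.2, "Volume": 5000}
noisy_bar = inject_noise(bar, 0.01, rng)
print(noisy_bar)
```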

Terminal warning when enabled:

************************************************************
  [STRESS TEST MODE] Injecting 1.0% random noise into OHLC price data
  High/Low bounds are enforced after noise — no invalid candlesticks
************************************************************

Typical usage:

`noise_injection_pct`	Meaning
`0.0` (default)	Disabled — clean historical prices used
`0.005`	±0.5% noise — very mild perturbation
`0.01`	±1% noise — recommended starting point for robustness checks
`0.02`	±2% noise — aggressive; strategies with thin edges will fail

What to look for: If a strategy's WFA verdict flips from Pass → Likely Overfitted with noise enabled, it was curve-fitted to specific price levels. Discard it or widen entry/exit conditions.

Strategy Correlation Matrix

Tip

Why check correlation? Running two highly correlated strategies is effectively doubling your position in a single edge — they will win and lose together, offering no diversification benefit. The correlation analysis automatically surfaces these overlaps after every portfolio run so you can prune redundant strategies before live trading.

After each portfolio finishes, the backtester computes pairwise Pearson correlations between all strategies based on their daily realised P&L. The result is saved as a CSV, an Avg. Corr column appears in the terminal summary table, and any pairs above the threshold trigger a prominent alert.

How it works: Each strategy's trade log is aggregated into a daily P&L series (trades grouped by exit date, profits summed). These series are aligned into a date x strategy matrix, with missing dates filled with 0. Pearson correlations are then computed on that combined matrix.

Output file location:

output/runs/<run_id>/<Portfolio_Name>_strategy_correlation.csv

For example: output/runs/2026-03-10_14-22-01/Nasdaq_100_strategy_correlation.csv

The CSV has strategy names as both row index and column headers, values rounded to 4 decimal places.

Avg. Corr column in the summary table:

Each strategy shows its mean absolute Pearson correlation against all other strategies in the run. Strategies with any pairwise correlation above the threshold are flagged with * (e.g. 0.81*).

How to Interpret:

Range	Meaning	Action
0.70 - 1.00	High Overlap (Red Flag)	Strategies enter/exit at nearly the same times. Remove one unless they differ in risk/sizing.
0.30 - 0.70	Moderate Overlap	Some shared signal; acceptable if each strategy has independent edge.
0.00 - 0.30	High Diversification (Goal)	Strategies behave independently -- ideal portfolio composition.

Terminal alert example:

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  HIGH CORRELATION ALERT  |  Portfolio: Nasdaq 100
  Threshold: |r| > 0.70 -- strategies below may overlap significantly
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    'SMA Crossover (20d/50d)' <-> 'EMA Crossover (Unfiltered)'  r=+0.91  [HIGH OVERLAP -- consider removing one]
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

The default threshold is 0.70 (absolute value). Pairs with |r| > 0.70 are flagged as HIGH CORRELATION ALERT warnings.

When the matrix is not generated: If fewer than 2 strategies have completed trades in a portfolio, correlation analysis is silently skipped and no CSV is written.

Parameter Sensitivity Sweep

Important

Why sweep parameters? Backtests are vulnerable to p-hacking — an intern (or an experienced analyst) who tweaks a single parameter until the equity curve looks great has almost certainly overfit to historical noise. The sensitivity sweep automatically varies every numeric param in a strategy's definition across a grid and reports what fraction of variants are profitable. A genuine edge survives parameter perturbations; a curve-fitted edge does not.

Enable in config.py:

"sensitivity_sweep_enabled": True,
"sensitivity_sweep_pct": 0.20,    # ±20% per step
"sensitivity_sweep_steps": 2,     # 2 steps each side → 5 values per param
"sensitivity_sweep_min_val": 2,   # floor (prevents e.g. SMA period = 0)

How it works:

For a strategy registered with params={"fast": 20, "slow": 50}, the sweep generates values:

fast: [12, 16, 20, 24, 28] (5 values at ±20% steps)
slow: [30, 40, 50, 60, 70] (5 values at ±20% steps)

This produces a 25-point cartesian grid (5 × 5). Each grid point runs as an independent simulation — they appear as separate rows in all summary tables. Non-numeric params (strings, bools) are carried through unchanged in every variant.
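The grid construction can be reproduced with a short sketch. The helper names `sweep_values` and `param_grid` are hypothetical, and two details are assumptions: integer params are rounded back to integers, and the `min_val` floor is applied to every numeric param, including floats.

```python
from itertools import product


def sweep_values(base, pct=0.20, steps=2, min_val=2):
    """Values at base * (1 + k*pct) for k in -steps..steps, floored at min_val."""
    values = set()
    for k in range(-steps, steps + 1):
        v = base * (1 + k * pct)
        if isinstance(base, int):
            v = round(v)   # keep integer params (e.g. periods) integral
        values.add(max(v, min_val))
    return sorted(values)


def param_grid(params, pct=0.20, steps=2, min_val=2):
    """Cartesian grid over numeric params; other params pass through unchanged."""
    def is_numeric(x):
        return isinstance(x, (int, float)) and not isinstance(x, bool)

    numeric = {k: sweep_values(v, pct, steps, min_val)
               for k, v in params.items() if is_numeric(v)}
    fixed = {k: v for k, v in params.items() if not is_numeric(v)}
    return [{**fixed, **dict(zip(numeric, combo))}
            for combo in product(*numeric.values())]
```

Calling `param_grid({"fast": 20, "slow": 50})` yields 25 dictionaries, one of which is the base pair. Collecting values in a set means that params clamped by the floor do not create duplicate variants.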

Strategy naming in results:

Strategy name in output	Meaning
`SMA Crossover (20d/50d) [(base)]`	Base parameter values
`SMA Crossover (20d/50d) [fast=16]`	Only `fast` changed from base
`SMA Crossover (20d/50d) [fast=16 slow=40]`	Both params changed

Sensitivity report (printed after the run):

======================================================================
  PARAMETER SENSITIVITY REPORT
======================================================================

  SMA Crossover (20d/50d)
  Robust — profitable in 72% of variants (18/25)

  Variant                             P&L    Sharpe   Max DD   MC Score
  ----------------------------------------------------------------------
  (base)                            14.2%      1.42   18.3%        72 <-- base
  fast=16                           12.8%      1.31   19.1%        65
  fast=24                           11.4%      1.19   20.5%        58
  fast=16 slow=40                    9.7%      1.08   22.3%        51
  fast=12 slow=30                    2.1%      0.21   31.4%        12
  fast=28 slow=70                   -3.4%      0.00   38.2%        -8
  ...
======================================================================

Fragility verdict thresholds:

% of variants profitable	Verdict
≥ 30%	`Robust — profitable in X% of variants (Y/Z)`
< 30%	`* FRAGILE — profitable in only X% of variants *`

Performance note: With 2 numeric params and steps=2, the sweep creates 25× more tasks. With 3 params it's 125×. Keep sensitivity_sweep_enabled: False for normal runs; enable only for targeted fragility checks on candidate strategies.

No-regression guarantee: When sensitivity_sweep_enabled: False (default), the task-building loop is identical to pre-sweep behaviour — param_variants = [base_params], one task per strategy.

Local Report Files

Location	Contents
`output/runs/<run_id>/overall_portfolio_summary.csv`	All results across all portfolios, sorted by MC Score. The first 5 columns are run metadata (`run_id`, `data_provider`, `start_date`, `end_date`, `timeframe`) so results are self-describing when combined across runs.
`output/runs/<run_id>/<Portfolio>_strategy_correlation.csv`	Pearson correlation matrix of daily P&L across all strategies for that portfolio
`output/runs/<run_id>/analyzer_csvs/<Portfolio>/`	Column-mapped CSVs ready to pass into `report.py`
`output/runs/<run_id>/raw_trades/<Portfolio>/`	Per-symbol, per-strategy raw trade logs (when `save_individual_trades=True`)
`output/runs/<run_id>/logs/`	Full execution log for the run
`output/runs/<run_id>/ml_features.parquet`	ML-ready consolidated trade feature file (when `export_ml_features=True`)

ML Feature Export

Enable in config.py:

"export_ml_features": True,   # requires: pip install pyarrow

After the run, ml_features.parquet will contain one row per completed trade across all strategies and portfolios, with the following schema:

Column	Type	ML Role
`Strategy`, `Portfolio`, `Symbol`	str	grouping keys
`EntryDate`, `ExitDate`	Timestamp	temporal features
`HoldDuration`	int	hold-period feature
`EntryPrice`, `ExitPrice`, `Profit`, `ProfitPct`, `Shares`	float	trade economics
`is_win`	int8	classification target (1 = profit, 0 = loss)
`RMultiple`, `MAE_pct`, `MFE_pct`	float	risk/reward features
`ExitReason`, `InitialRisk`	str/float	context features
`entry_RSI_14`, `entry_ATR_14_pct`, `entry_SMA200_dist_pct`, `entry_Volume_Spike`	float	price-action features at entry
`entry_SPY_RSI_14`, `entry_SPY_SMA200_dist_pct`	float	market regime features
`entry_VIX_Close`, `entry_TNX_Close`	float	macro features

The internal Trade counter column is excluded. If pyarrow is not installed, a .csv fallback is written automatically.
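A minimal loading sketch for downstream modelling follows. The function names are hypothetical, and the fallback file name `ml_features.csv` is an assumption (the text above only states that a .csv fallback is written).

```python
from pathlib import Path

import pandas as pd


def load_ml_features(run_dir: str) -> pd.DataFrame:
    """Load exported trade features, preferring Parquet over the CSV fallback."""
    run = Path(run_dir)
    parquet_path = run / "ml_features.parquet"
    csv_path = run / "ml_features.csv"   # assumed name of the .csv fallback
    if parquet_path.exists():
        return pd.read_parquet(parquet_path)   # needs pyarrow
    return pd.read_csv(csv_path, parse_dates=["EntryDate", "ExitDate"])


def split_features_target(df: pd.DataFrame):
    """Return (X, y): entry-time features and the is_win classification target."""
    feature_cols = [c for c in df.columns if c.startswith("entry_")]
    return df[feature_cols], df["is_win"].astype(int)
```

Restricting the feature matrix to the `entry_` columns is deliberate: columns such as `ExitPrice`, `Profit`, or `MFE_pct` are only known after the trade closes and would leak the outcome into a classifier.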

S3 Reports (if configured)

All output files are also uploaded to s3://<your-bucket>/<run_id>/. Each run uses its timestamped folder as the S3 key prefix, so previous results are never overwritten.

Generating Detailed Reports

After a run completes, you can generate a detailed PDF or Markdown report for any individual strategy using report.py. This produces equity curves, drawdown charts, trade distribution analysis, and more.

The PDF tearsheet includes an Underwater Plot positioned immediately below the equity curve — a short, wide red-filled chart that visualises both the depth and duration of every drawdown period throughout the backtest.

Single-File Usage

python report.py output/runs/<run_id>/analyzer_csvs/<Portfolio>/<Strategy>.csv

The report is automatically saved to output/runs/<run_id>/detailed_reports/. No --output-dir flag is needed when working with files inside a run folder.
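The auto-detection can be pictured as walking up from the CSV until an `analyzer_csvs` folder is found. The sketch below is an assumption about how such detection might work, with a hypothetical function name; it is not taken from report.py.

```python
from pathlib import Path
from typing import Optional


def detect_report_dir(csv_path: str) -> Optional[Path]:
    """Return <run_dir>/detailed_reports if csv_path sits under analyzer_csvs/."""
    path = Path(csv_path)
    for parent in path.parents:
        if parent.name == "analyzer_csvs":
            return parent.parent / "detailed_reports"
    return None   # outside a run folder: no automatic location
```

For a CSV at `output/runs/<run_id>/analyzer_csvs/<Portfolio>/<Strategy>.csv`, this resolves to `output/runs/<run_id>/detailed_reports`.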

Batch Mode — Generate All Reports for a Run

To generate reports for every strategy in a run at once, pass the run directory with --all:

python report.py --all output/runs/<run_id>

This finds every .csv file recursively under <run_id>/analyzer_csvs/ and generates one report per file. All reports are saved to <run_id>/detailed_reports/. A summary line is printed when complete:

Generated 14 reports in output/runs/2026-03-02_15-12-32/detailed_reports

Examples

# Generate a report for a specific strategy from your last run
python report.py output/runs/2026-03-02_15-12-32/analyzer_csvs/Nasdaq_100/SMA_Crossover_20d_50d.csv

# Generate a report from a named run
python report.py output/runs/nasdaq-sweep_2026-03-02_15-12-32/analyzer_csvs/Nasdaq_100/SMA_Crossover_20d_50d.csv

# Generate all reports for an entire run at once
python report.py --all output/runs/2026-03-02_15-12-32

# Override where the report is saved
python report.py path/to/strategy.csv --output-dir /path/to/custom/folder

# Set a custom name for the report file and folder
python report.py path/to/strategy.csv --report-name "my-strategy-deep-dive"

# Override the initial equity used for equity curve calculations
python report.py path/to/strategy.csv --equity 250000

All report.py Options

Flag	Default	Description
`csv_path`	(required, or use `--all`)	Path to a single backtester-generated CSV
`--all RUN_DIR`	—	Path to a run directory; generates reports for all CSVs under `analyzer_csvs/`
`--output-dir`	Auto-detected	Root directory for report output. Auto-detected when the CSV is inside `analyzer_csvs/`.
`--equity`	100000	Initial equity for equity curve calculations
`--report-name`	CSV filename	Custom name for the generated report file and its parent folder (single-file mode only)

csv_path and --all are mutually exclusive — use one or the other.

Available Strategies

Strategies are loaded automatically from the custom_strategies/ plugin directory. No file outside that directory needs to be edited to add, remove, or rename a strategy.

Currently Active (plugin files)

Plugin file	Registration name	Description
`sma_crossovers.py`	SMA Crossover (20d/50d)	Buy when 20-bar SMA crosses above 50-bar SMA
`sma_crossovers.py`	SMA Crossover (50d/200d)	Classic "golden cross" — 50-bar SMA crosses above 200-bar SMA

Quick-Scan Strategy Matrix

New to the library? Use this table to find a starting point based on your experience level. Then see the full catalogue below for parameters and dependencies.

Category	Example Plugins	Risk / Complexity
Trend Following	SMA Crossover (50/200), MACD Crossover, Donchian Breakout	Low
Mean Reversion	RSI (14/30), Bollinger Band Fade, Stochastic, CMF	Medium
Volatility / Breakout	BB Squeeze, Keltner Breakout, ATR Trailing Stop	High
Scalping (Sub-Daily)	1m EMA Scalp, 1m RSI Extreme Fade, 1m BB Squeeze	Very High
Calendar / Regime	Weekend Hold, Hold the Week, Daily Overnight Hold	Low

Strategy Library — Full Plugin Catalogue

All strategies below are pre-built in custom_strategies/ and inactive by default. To activate any strategy, copy the relevant .py file into custom_strategies/ (if not already present); the engine discovers it automatically on the next run. Individual strategies within a file can be disabled by commenting out or removing their @register_strategy decorator.

RSI Strategies (`custom_strategies/rsi_strategies.py`)

Registration name	Key params	Description
`RSI Mean Reversion (14/30)`	length=14d, oversold=30, exit=50	Buy when RSI crosses back above 30; exit above 50
`RSI Mean Reversion (7/20)`	length=7d, oversold=20, exit=50	Aggressive short-period RSI with tight oversold threshold
`RSI (14d) w/ SMA200 Filter`	rsi=14d, sma=200d	RSI mean reversion, only taken when price is above 200-bar SMA
`1m RSI Extreme Fade (14/20/80)`	rsi=14min, levels=20/80	Sub-daily extreme RSI fade; requires `timeframe = "MIN"`

MACD & EMA Strategies (`custom_strategies/macd_strategies.py`)

Registration name	Key params	Dependencies	Description
`MACD Crossover (12/26/9)`	fast=12d, slow=26d, signal=9d	—	Buy when MACD line crosses above signal line
`MACD+RSI Confirmation`	macd=12/26/9d, rsi=14d	—	MACD crossover gated by RSI > 50
`EMA Crossover (Unfiltered)`	fast=20d, slow=50d	—	Pure EMA crossover, no regime filter
`EMA Crossover w/ SPY-Only Filter`	fast=20d, slow=50d	`spy`	EMA crossover, buys gated by SPY above 200-bar SMA
`EMA Crossover w/ VIX-Only Filter`	fast=20d, slow=50d	`vix`	EMA crossover, buys gated by VIX below 30
`EMA Crossover w/ SPY+VIX Filter`	fast=20d, slow=50d	`spy`, `vix`	EMA crossover, full "Bull-Quiet" regime filter
`1m EMA Scalp (5/15/50)`	emas=5/15/50min	—	Sub-daily EMA scalp; requires `timeframe = "MIN"`

Mean Reversion & Other Strategies (`custom_strategies/mean_reversion.py`)

Registration name	Key params	Dependencies	Description
`Bollinger Band Fade (20d/2.0)`	length=20d, std=2.0	—	Buy below lower band; exit at middle SMA
`Bollinger Band Fade (20d/2.5)`	length=20d, std=2.5	—	Wider-band fade; fewer but more extreme entries
`Bollinger Band Breakout (20d)`	length=20d, std=2	—	Buy above upper band; momentum breakout direction
`Bollinger Band Squeeze (20d/40d)`	length=20d, squeeze=40d	—	Enter breakout after low-volatility squeeze period
`Bollinger Band Fade w/ SPY Trend Filter (20d/2.0)`	length=20d, std=2.0	`spy`	BB fade gated by SPY above 200-bar SMA
`1m BB Squeeze (10/2.0) / 20-period squeeze`	length=10min, squeeze=20min	—	Sub-daily BB squeeze; requires `timeframe = "MIN"`
`1m BB Squeeze (20/2.0) / 50-period squeeze`	length=20min, squeeze=50min	—	Sub-daily BB squeeze; longer lookback variant
`Stochastic Oscillator (14d)`	length=14d, oversold=20, exit=50	—	Buy when %K crosses above 20; exit above 50
`Chaikin Money Flow (10d)`	length=10d, buy=0.0, sell=-0.05	—	Enter on CMF crossover above 0
`Chaikin Money Flow (20d/0.05/0.05)`	length=20d, buy=0.05, sell=-0.05	—	Symmetric CMF thresholds, slower signal
`OBV Trend (20d MA)`	ma=20d	—	Long when On-Balance Volume is above its 20-bar SMA
`MA Bounce (20d)`	ma=20d, filter=2 bars	—	Buy on 20-bar MA touch-and-recover pattern
`SMA 200 Trend Filter (200d)`	ma=200d	—	Long when Close > 200-bar SMA; flat otherwise
`MA Confluence (Full Stack)`	fast=10d, mid=20d, slow=50d	—	Enter on bullish MA stack; exit on bearish stack
`MA Confluence (Fast Entry & Exit)`	fast=10d, mid=20d, slow=50d	—	Aggressive entry AND aggressive exit
`MA Confluence (Fast MA Exit)`	fast=10d, mid=20d, slow=50d	—	Conservative entry; fast-MA exit
`MA Confluence (Fast Entry)`	fast=10d, mid=20d, slow=50d	—	Fast entry; conservative bearish-stack exit
`MA Confluence (Medium MA Exit)`	fast=10d, mid=20d, slow=50d	—	Conservative entry; medium-MA exit
`MA Confluence (Full Stack) w/ Regime Filter`	fast=10d, mid=20d, slow=50d	`spy`, `vix`	MA Confluence + full SPY+VIX regime filter
`Donchian Breakout (20d/10d)`	entry=20d, exit=10d	—	Buy on 20-bar high; exit on 10-bar low
`Keltner Channel Breakout (20d)`	ema=20d, atr=20d, mult=2.0	—	Buy above Keltner upper channel
`ATR Trailing Stop (14/3)`	atr=14d, mult=3.0	—	SMA-200 breakout entry with ATR trailing stop
`ATR Trailing Stop w/ Trend Filter`	entry=20d, atr=14d, sma=200d	—	Donchian breakout entry + ATR trailing stop + SMA filter
`Hold The Week (Tue-Fri)`	—	—	Calendar: buy Mon close, sell Fri close
`Weekend Hold (Fri-Mon)`	—	—	Calendar: buy Fri close, sell Mon close
`Daily Overnight Hold (weekdays) w/ VIX Filter`	—	`vix`	Weekday overnight hold when VIX < 20

Configuration Reference

Setting	Default	Description
`data_provider`	`"polygon"`	`"polygon"` or `"norgate"`
`upload_to_s3`	`False`	Enable S3 uploads of output files
`s3_reports_bucket`	—	S3 bucket name. Requires `upload_to_s3: True`.
`start_date`	`"2004-01-01"`	Backtest start date (YYYY-MM-DD)
`end_date`	Today	Backtest end date
`initial_capital`	`100000.0`	Starting account size in dollars
`timeframe`	`"D"`	Bar frequency: `"D"` daily, `"H"` hourly, `"MIN"` minute, `"W"` weekly, `"M"` monthly
`timeframe_multiplier`	`1`	For sub-daily bars only — e.g., `5` with `"MIN"` gives 5-minute bars
`price_adjustment`	`"total_return"`	`"total_return"` (dividend-adjusted) or `"none"`
`benchmark_symbol`	`"SPY"`	Primary benchmark ticker
`symbols_to_test`	`['BITB']`	Tickers for single-asset mode
`portfolios`	(see config)	Portfolios dict for portfolio mode
`allocation_per_trade`	`0.10`	Fraction of equity per new position (0.10 = 10%)
`execution_time`	`"open"`	Fill at next-day open price
`stop_loss_configs`	`[{"type": "none"}]`	List of stop-loss configurations to test
`slippage_pct`	`0.0005`	Flat slippage as fraction of price applied to every fill (0.0005 = 5 basis points)
`commission_per_share`	`0.002`	Commission in dollars per share
`max_pct_adv`	`0.05`	Position size cap: no order may exceed this fraction of 20-day average daily volume
`volume_impact_coeff`	`0.0`	Square-root market impact added on top of `slippage_pct`. `0.0` = disabled. See note below.
`min_trades_for_mc`	`50`	Minimum trades required to run Monte Carlo
`num_mc_simulations`	`1000`	Number of Monte Carlo simulations per strategy
`mc_sampling`	`"iid"`	MC sampling method: `"iid"` (independent, default) or `"block"` (block-bootstrap, preserves streaks)
`mc_block_size`	`None`	Block size for `"block"` sampling. `None` = auto (`floor(sqrt(N))`)
`save_individual_trades`	`True`	Save per-trade CSV logs to `raw_trades/`
`save_only_filtered_trades`	`False`	If True, only save logs for strategies passing the display filters
`mc_score_min_to_show_in_summary`	`-9999`	Minimum MC score to include in output table
`min_pandl_to_show_in_summary`	`-9999`	Minimum P&L % to include in output table
`max_acceptable_drawdown`	`1.0`	Maximum drawdown (as a decimal) to include in output table
`min_performance_vs_spy`	`-9999`	Minimum outperformance vs SPY to include in output table
`min_performance_vs_qqq`	`-9999`	Minimum outperformance vs QQQ to include in output table
`show_qqq_losers`	`False`	If False, hides strategies that underperform QQQ
`wfa_split_ratio`	`0.80`	Walk-Forward Analysis IS/OOS split. `0.80` = first 80% of data is In-Sample, last 20% is Out-of-Sample. Set to `None` or `0` to disable.
`wfa_folds`	`None`	Rolling multi-fold WFA. `None` = disabled; integer ≥ 2 = number of equal-width OOS folds. Adds a `Rolling WFA` column to all summary tables.
`wfa_min_fold_trades`	`5`	Minimum OOS trades required to score a fold in rolling WFA. Folds with fewer trades are skipped.
`export_ml_features`	`False`	When `True`, writes `ml_features.parquet` (one row per trade, all strategies) after the run. Requires `pip install pyarrow`. Falls back to `.csv` if pyarrow is absent.
`roc_thresholds`	`[0.0, 0.5]`	Rate-of-change thresholds for ROC Momentum strategy
`strategies`	`"all"`	`"all"` runs every plugin; a list of exact strategy names runs only those (see Strategy Selection)
`sensitivity_sweep_enabled`	`False`	Opt-in parameter sensitivity sweep — varies each numeric param ±pct across ±steps steps
`sensitivity_sweep_pct`	`0.20`	Fractional step size (0.20 = ±20% per step)
`sensitivity_sweep_steps`	`2`	Steps each side of base value (2 steps → 5 values per param)
`sensitivity_sweep_min_val`	`2`	Floor for generated values (prevents e.g. SMA period = 0)
`rolling_sharpe_window`	`126`	Rolling Sharpe window in trading days (~6 months). Set to `0` or `None` to disable.
`htb_rate_annual`	`0.02`	Annual Hard-To-Borrow rate (2% = easy-to-borrow large cap; 10% = HTB small/mid cap). Debited daily while a short position is held. Set to `0.0` to disable borrow cost.

Short Selling

The engine supports short positions via the -2 signal convention. All existing strategies use 1/0/-1 and are fully backward-compatible.

Signal	Meaning
`1`	Enter long
`0`	No change
`-1`	Exit long or cover short
`-2`	Enter short

How it works: When a strategy emits -2 for a symbol, a short position is opened at the next bar's Open (or Close, depending on execution_time). The short seller receives the proceeds into cash. Each subsequent day, a Hard-To-Borrow fee is debited: notional × ((1 + htb_rate_annual)^(1/252) - 1). When the strategy emits -1, the position is covered and the borrow cost is netted against the P&L.
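The borrow-fee formula translates directly into code. The function name `daily_htb_fee` is hypothetical; the arithmetic is the formula quoted above.

```python
def daily_htb_fee(notional: float, htb_rate_annual: float,
                  trading_days: int = 252) -> float:
    """Borrow fee for one day: notional * ((1 + annual_rate)^(1/252) - 1)."""
    daily_rate = (1.0 + htb_rate_annual) ** (1.0 / trading_days) - 1.0
    return notional * daily_rate


# Example: a $10,000 short held for 10 trading days at 2% p.a.
fee_per_day = daily_htb_fee(10_000, 0.02)
total_fee = 10 * fee_per_day   # assumes the notional stays at $10,000 each day
```

At the default 2% rate this comes to roughly $0.79 per day on $10,000 of notional. Because the daily rate is a geometric root, compounding it over 252 days recovers the annual rate exactly.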

Short trades in the output: Short trades appear in trade CSVs and summary tables with ExitReason: "Short Cover". They are included in all P&L, Sharpe, and Monte Carlo calculations alongside long trades.

Configuring borrow cost:

"htb_rate_annual": 0.02,  # 2% p.a. (easy-to-borrow, e.g. large-cap S&P 500)
"htb_rate_annual": 0.10,  # 10% p.a. (hard-to-borrow, e.g. high-short-interest small cap)
"htb_rate_annual": 0.0,   # disabled — no borrow cost modelled

Regime Heatmap

After each strategy run, the engine prints a VIX Regime Heatmap — a year × volatility-bucket P&L table that shows whether a strategy's edge is regime-dependent.

VIX buckets:

Bucket	VIX range
Low (<15)	VIX below 15 — calm, low-fear environment
Mid (15–25)	VIX 15 to 25 — normal / moderate volatility
High (>25)	VIX above 25 — elevated fear / stress

Each trade's entry date is classified into a bucket using the VIX close on that date (forward-filled from the prior trading day for weekends and holidays). P&L is expressed as a fraction of initial capital.
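The classification and aggregation steps can be sketched with pandas as follows. Function names and the `EntryDate` / `Profit` column names are assumptions for illustration; boundary handling (15 and 25 both fall in Mid) follows the bucket table above.

```python
import pandas as pd

BUCKETS = ["Low (<15)", "Mid (15-25)", "High (>25)"]


def vix_bucket(vix_close):
    """Map a VIX close to one of the three regime buckets."""
    if pd.isna(vix_close):
        return None   # no VIX available: the trade is left out of the table
    if vix_close < 15:
        return BUCKETS[0]
    if vix_close <= 25:
        return BUCKETS[1]
    return BUCKETS[2]


def regime_heatmap(trades: pd.DataFrame, vix_close: pd.Series,
                   initial_capital: float) -> pd.DataFrame:
    """Year x VIX-bucket table of P&L as a fraction of initial capital."""
    entry = pd.to_datetime(trades["EntryDate"])
    # Forward-fill so entries on non-trading dates use the prior VIX close.
    vix_at_entry = vix_close.sort_index().reindex(entry, method="ffill")
    table = pd.DataFrame({
        "Year": entry.dt.year.to_numpy(),
        "Bucket": vix_at_entry.map(vix_bucket).to_numpy(),
        "PnL": trades["Profit"].to_numpy() / initial_capital,
    })
    heat = table.pivot_table(index="Year", columns="Bucket", values="PnL",
                             aggfunc="sum", fill_value=0.0)
    return heat.reindex(columns=BUCKETS, fill_value=0.0)
```

`vix_close` is expected to be a Series indexed by date. Reindexing the result to the fixed bucket list keeps all three columns present even when a strategy never traded in one regime.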

Example terminal output:

--- REGIME HEATMAP: MA Crossover ---
  Year        Low (<15)    Mid (15-25)     High (>25)
----------------------------------------------------
  2022           +0.0%         -3.4%         +1.2%
  2023           +5.1%         +2.8%          0.0%
  2024           +3.3%         +1.6%         -0.5%
----------------------------------------------------
  TOTAL          +8.4%         +1.0%         +0.7%

Interpretation: A strategy that shows strong positive returns only in Low (<15) and flat or negative in High (>25) is regime-dependent — it may struggle in volatile markets. A robust strategy should show consistent positive contribution across all three buckets.

Configuration: The heatmap uses the VIX data that the engine already loads (held in vix_df_global). No extra config keys are required; the output appears automatically whenever VIX data is available and the trade log is non-empty.

Data Caching

Downloaded price data is cached locally in data_cache/ as Parquet files with a 24-hour TTL.

First run for a date range fetches every symbol from Polygon — this is slow for large portfolios (30–60 minutes for Nasdaq-level runs).
Subsequent runs within 24 hours load from disk — typically seconds per symbol.
To force a fresh fetch, delete the data_cache/ folder or individual .parquet files inside it.
data_cache/ is excluded from git via .gitignore and should never be committed.

Cache files are named using the pattern {symbol}_{start}_{end}_{timeframe}_{multiplier}.parquet. Symbols with special characters (e.g., I:VIX) are sanitized for safe filenames.

Stale cache warning: At the start of each run, the backtester scans data_cache/ for Parquet files older than 7 days and logs a warning if any are found. This is a prompt to delete the folder if your strategies need fresh data — the 24-hour TTL governs in-memory freshness, but on-disk files are not automatically removed after that window.
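The naming pattern and the 7-day scan can be illustrated with a short sketch. The function names are hypothetical, and the sanitisation rule (replace anything outside letters, digits, dot, and hyphen with an underscore) is an assumption, since the text only says special characters are sanitized.

```python
import re
import time
from pathlib import Path

STALE_AFTER_SECONDS = 7 * 24 * 3600   # 7-day warning window from the text


def cache_filename(symbol, start, end, timeframe, multiplier):
    """Build {symbol}_{start}_{end}_{timeframe}_{multiplier}.parquet."""
    # Assumed sanitisation rule: anything outside [A-Za-z0-9.-] becomes "_".
    safe_symbol = re.sub(r"[^A-Za-z0-9.\-]", "_", symbol)
    return f"{safe_symbol}_{start}_{end}_{timeframe}_{multiplier}.parquet"


def find_stale_files(cache_dir, now=None):
    """Return cached Parquet files whose modification time is over 7 days old."""
    now = time.time() if now is None else now
    return sorted(p for p in Path(cache_dir).glob("*.parquet")
                  if now - p.stat().st_mtime > STALE_AFTER_SECONDS)
```

Under this assumed rule, `I:VIX` with daily bars from 2004-01-01 to 2026-03-10 would be stored as `I_VIX_2004-01-01_2026-03-10_D_1.parquet`. Because the date range is part of the file name, changing `start_date` or `end_date` produces a new cache entry rather than reusing the old one.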

Adding Custom Strategies (Plugin System)

No core file edits required. Drop a .py file into custom_strategies/, decorate your function, and the engine discovers it automatically on the next run.

The backtester uses a strategy plugin system built around helpers/registry.py. The @register_strategy decorator stores a strategy's name, logic function, dependencies, and parameters. load_strategies("custom_strategies") is called at startup and imports every .py file in that directory, triggering the decorators.

custom_strategies/          <-- The "Drop-Zone"
├── my_new_signal.py        <-- 1. Create a file here
├── rsi_strategies.py
└── mean_reversion.py

[ 2. The engine auto-discovers and registers it at runtime ]
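For readers curious how decorator-based discovery works in general, here is a minimal sketch of the pattern. It reuses the names `REGISTRY`, `register_strategy`, and `load_strategies` for readability, but it is an illustrative re-implementation under assumptions, not the contents of helpers/registry.py.

```python
import importlib.util
from pathlib import Path

REGISTRY = {}   # strategy name -> {"func", "dependencies", "params"}


def register_strategy(name, dependencies=None, params=None):
    """Decorator that records a strategy function under a display name."""
    def decorator(func):
        REGISTRY[name] = {
            "func": func,
            "dependencies": list(dependencies or []),
            "params": dict(params or {}),
        }
        return func   # the function itself is left unchanged
    return decorator


def load_strategies(directory):
    """Import every .py file in directory so its decorators run."""
    loaded = []
    for path in sorted(Path(directory).glob("*.py")):
        if path.name.startswith("_"):
            continue   # skip private files such as __init__.py
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)   # executing the module fires decorators
        loaded.append(path.stem)
    return loaded
```

The key point is that registration is a side effect of importing the file: nothing has to list the strategies centrally, which is why dropping a new file into the directory is enough.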

Skeleton Strategy — copy and customise

# custom_strategies/my_strategy.py

from helpers.registry import register_strategy
from helpers.timeframe_utils import get_bars_for_period
from config import CONFIG

_TF  = CONFIG.get("timeframe", "D")
_MUL = CONFIG.get("timeframe_multiplier", 1)

# Optional: import a logic helper from helpers/indicators.py
# from helpers.indicators import my_logic_function

@register_strategy(
    name="My Strategy Name",           # shown in all reports and CSVs
    dependencies=[],                   # add "spy" and/or "vix" if needed
    params={
        "length": get_bars_for_period("20d", _TF, _MUL),
        # add more params here — they are merged into **kwargs at runtime
    },
)
def my_strategy(df, **kwargs):
    """One-line description shown in the strategy docstring."""
    length = kwargs["length"]

    # --- your signal logic here ---
    # df['Signal'] must be populated before returning:
    #   1  = enter / hold long
    #  -1  = exit / go flat
    #   0  = no change (carry previous signal — forward-fill where needed)
    df['Signal'] = 0  # replace with real logic
    return df
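As an example of what the signal logic placeholder might be replaced with, the sketch below fills `df['Signal']` for a simple SMA crossover. It is a standalone illustration (no registry import) that assumes a `Close` column; the function name is hypothetical and this is not the code in sma_crossovers.py.

```python
import pandas as pd


def sma_crossover_signals(df: pd.DataFrame, fast: int, slow: int) -> pd.DataFrame:
    """Set df['Signal'] to 1 on a bullish cross, -1 on a bearish cross, else 0."""
    fast_sma = df["Close"].rolling(fast).mean()
    slow_sma = df["Close"].rolling(slow).mean()
    above = fast_sma > slow_sma                  # False while SMAs are warming up
    prev_above = above.shift(1, fill_value=False)
    # Only compare bars where the slow SMA exists on this bar and the previous one.
    valid = slow_sma.notna() & slow_sma.shift(1).notna()

    df["Signal"] = 0
    df.loc[valid & above & ~prev_above, "Signal"] = 1    # fast crossed above slow
    df.loc[valid & ~above & prev_above, "Signal"] = -1   # fast crossed below slow
    return df
```

Requiring a valid previous bar avoids emitting a spurious entry on the first bar where the slow SMA becomes available. Inside a plugin, the body of this function would sit under the `@register_strategy` decorator with `fast` and `slow` read from `kwargs`.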

Step-by-step: add a new strategy

1. (Optional) Add shared signal logic to helpers/indicators.py if you want it reusable across multiple plugins. You can also write logic inline in the plugin file.

2. Create a .py file in custom_strategies/ using the skeleton above. Any filename works as long as it doesn't start with _.

3. Run the backtester — no other files need touching:

python main.py --dry-run   # verify "Strategies: N" increased by 1
python main.py

That's it. The engine imports every .py file in custom_strategies/ at startup, the decorator fires, and the strategy name appears in every summary table and CSV.

Signal convention

Value	Meaning
`1`	Enter / hold long
`-1`	Exit / go flat
`0`	No change (carry previous signal)

Strategies that need SPY or VIX data

Declare dependencies=["spy"], dependencies=["vix"], or dependencies=["spy", "vix"]. The engine automatically injects spy_df and/or vix_df into **kwargs at runtime:

@register_strategy(
    name="EMA w/ SPY Filter",
    dependencies=["spy"],
    params={
        "fast": get_bars_for_period("20d", _TF, _MUL),
        "slow": get_bars_for_period("50d", _TF, _MUL),
    },
)
def ema_spy_filter(df, **kwargs):
    spy_df = kwargs["spy_df"]   # injected automatically by main.py
    fast   = kwargs["fast"]
    slow   = kwargs["slow"]
    # ... use spy_df in your logic
    df['Signal'] = 0  # placeholder: replace with real logic before returning
    return df

Timeframe-agnostic bar counts

Always use get_bars_for_period("20d", _TF, _MUL) instead of raw integers. This converts a human-readable period string into the correct bar count for whatever timeframe is configured — the same strategy works on daily, hourly, or minute bars without any code changes.
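To show the kind of conversion involved, here is a rough sketch for day-based periods. It is not the implementation in helpers/timeframe_utils.py: the function name, the 6.5-hour US session (390 one-minute bars per day), and the round-up rule are all assumptions, and only the `"D"`, `"H"`, and `"MIN"` timeframes are handled.

```python
import math

# Assumed bars per US trading day for each base timeframe (6.5-hour session).
_BARS_PER_DAY = {"D": 1, "H": 6.5, "MIN": 390}


def bars_for_period_sketch(period: str, timeframe: str = "D",
                           multiplier: int = 1) -> int:
    """Convert a period such as '20d' into a bar count (illustrative only)."""
    if not period.endswith("d"):
        raise ValueError("this sketch only handles day periods such as '20d'")
    days = int(period[:-1])
    if timeframe not in _BARS_PER_DAY:
        raise ValueError(f"unsupported timeframe in this sketch: {timeframe}")
    bars = days * _BARS_PER_DAY[timeframe] / multiplier
    return max(1, math.ceil(bars))
```

Under these assumptions a "20d" lookback is 20 bars on daily data but 1,560 bars on 5-minute data, which is why hard-coding `20` would silently change a strategy's meaning when the timeframe changes.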

Enabling or disabling a strategy without deleting it

Comment out or remove the @register_strategy(...) decorator. The function still exists in the file but will not be registered. Alternatively, delete the file entirely to remove all strategies in it — load_strategies silently skips missing files.

Running a specific subset of strategies (`config.py`)

Set the "strategies" key in config.py to run only a named subset of installed plugins without touching any plugin files:

# config.py — run every registered strategy (default)
"strategies": "all",

# config.py — run only these three
"strategies": [
    "SMA Crossover (20d/50d)",
    "RSI Mean Reversion (14/30)",
    "EMA Crossover w/ SPY+VIX Filter",
],

Names must match the name argument in @register_strategy exactly (case-sensitive). Any name that is not found in the registry logs a [WARNING] and is skipped — a typo will not crash the run. Confirm the active count with --dry-run before starting a long backtest.
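The selection behaviour described above amounts to a filtered lookup with a warning for unknown names. A minimal sketch, with a hypothetical function name and logger name:

```python
import logging

logger = logging.getLogger("backtester")   # hypothetical logger name


def select_strategies(registry: dict, selection) -> dict:
    """Return the registry entries chosen by the 'strategies' config value."""
    if selection == "all":
        return dict(registry)
    chosen = {}
    for name in selection:
        if name in registry:   # exact, case-sensitive match
            chosen[name] = registry[name]
        else:
            logger.warning("Strategy %r not found in registry; skipping", name)
    return chosen
```

Skipping unknown names instead of raising keeps a long run alive after a typo, at the cost of silently testing fewer strategies than intended, which is why checking the count with --dry-run first is worthwhile.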

Project Structure

july-backtester/
├── main.py                       # Single entry point — portfolio and single-asset mode
├── config.py                     # All configuration — edit this before running
├── report.py                     # CLI tool to generate PDF/Markdown reports from CSVs
├── requirements.txt              # Python dependencies
├── .env.example                  # Copy to .env and add your Polygon API key
├── strategies.py                 # Legacy file — static strategy definitions (still supported but not required)
│
├── custom_strategies/            # Plugin directory — drop *.py files here to add strategies
│   ├── sma_crossovers.py         # SMA Crossover (20d/50d) and (50d/200d)  [active]
│   ├── rsi_strategies.py         # RSI Mean Reversion, RSI+SMA200, RSI Scalping
│   ├── macd_strategies.py        # MACD Crossover, MACD+RSI, EMA Crossover variants, EMA Scalp
│   └── mean_reversion.py         # Bollinger, Stochastic, ATR, Donchian, Keltner, CMF, OBV,
│                                 #   MA Confluence, calendar/overnight strategies
│
├── helpers/
│   ├── indicators.py             # Shared strategy signal logic (reusable helpers live here)
│   ├── registry.py               # @register_strategy decorator, load_strategies, REGISTRY
│   ├── simulations.py            # Single-asset trade simulation engine
│   ├── portfolio_simulations.py  # Multi-asset portfolio simulation engine
│   ├── monte_carlo.py            # Monte Carlo robustness analysis
│   ├── summary.py                # Report generation, CSV export, S3 upload
│   ├── caching.py                # Local Parquet cache (24h TTL)
│   ├── aws_utils.py              # S3 upload helper; reads API key from env/.env
│   ├── timeframe_utils.py        # Bar period conversion utilities
│   ├── wfa.py                    # Walk-Forward Analysis (single-split)
│   ├── wfa_rolling.py            # Rolling multi-fold WFA
│   ├── ml_export.py              # ML-ready trade feature export
│   └── correlation.py            # Strategy correlation matrix
│
├── services/
│   ├── services.py               # Data provider factory (caching wrapper)
│   ├── polygon_service.py        # Polygon.io API integration
│   ├── norgate_service.py        # Norgate Data integration
│   ├── yahoo_service.py          # Yahoo Finance via yfinance (no API key)
│   └── csv_service.py            # Local CSV files
│
├── trade_analyzer/               # Standalone report generation module
│   ├── analyzer.py               # Main entry point for report generation
│   ├── calculations.py           # Metrics and statistics
│   ├── plotting.py               # Chart generation
│   ├── report_generator.py       # PDF/Markdown output
│   └── ...
│
├── scripts/                      # One-off diagnostic and utility scripts (not part of the main pipeline)
│   └── debug_data.py             # Compare Polygon vs Yahoo SPY data (provider diagnostic)
│
└── tickers_to_scan/              # JSON ticker lists
    ├── nasdaq_100.json
    ├── sp-500.json
    ├── russell_1000.json
    └── ...                       # (and many more)

Contributing

Contributions are welcome. To contribute:

1. Fork the repository.
2. Create a branch: git checkout -b feature/my-new-strategy
3. Add your signal logic to helpers/indicators.py (or inline it in your plugin file).
4. Create a plugin file in custom_strategies/ using the @register_strategy decorator (see Adding Custom Strategies above).
5. Validate with python main.py --dry-run to confirm the strategy count increases.
6. Run a quick backtest on a small portfolio (e.g., {"Test": ["SPY", "QQQ", "AAPL"]}) to confirm it runs without errors.
7. Open a pull request describing the strategy logic, parameters, and any sample results.

Please do not commit API keys, .env files, data_cache/ contents, or the generated output/ folder. These are all covered by .gitignore.

License

MIT License — free to use, modify, and distribute. See LICENSE file for details.
