OSRS Pairs Trading: Statistical Arbitrage with Kalman Filtering

Overview

This repository implements a high-frequency statistical arbitrage strategy tailored for the Old School RuneScape (OSRS) Grand Exchange. It focuses on the cointegrated relationship between Ranarr Weed (raw material) and Prayer Potion(3) (finished product).

The strategy utilizes a Kalman Filter to dynamically estimate hedge ratios and a Regularized Rolling Optimizer to solve for the optimal entry threshold that maximizes expected PnL in the presence of significant transaction costs (1% tax).

(Figure 1: Net Equity Curve demonstrating strategy performance after transaction costs)

The Economic Thesis

In OSRS, "Herblore" is a production skill where players process raw herbs into potions. The price relationship between Ranarr Weed ($P_{weed}$) and Prayer Potion(3) ($P_{pot}$) is fundamental:

Production Chain: Weed is the primary input for Potion.
No Arbitrage Bounds: If the spread diverges significantly (e.g., Potion becomes too expensive relative to Weed), players will process more herbs, increasing supply and forcing mean reversion. Conversely, if Potions are too cheap, production halts, restricting supply.
Cointegration: While individual prices follow random walks, a linear combination of their log prices (the spread) is modeled as stationary, with intercept $\mu_t$ and hedge ratio $\gamma_t$: $$\ln(P_{weed, t}) - \mu_t - \gamma_t \ln(P_{pot, t}) \sim \mathcal{N}(0, \sigma^2)$$

Methodology

1. Dynamic Hedge Ratio (Kalman Filter)

We treat the hedge ratio $\gamma_t$ not as a constant, but as a time-varying state variable. We employ a Kalman Filter with a momentum transition model to track the structural relationship without lookahead bias:

State: $x_t = [\mu_t, \gamma_t, \dot{\gamma}_t]^T$
Observation: $\ln(P_{weed, t}) = \mu_t + \gamma_t \ln(P_{pot, t}) + \epsilon_t$, so the spread is the residual $y_t = \ln(P_{weed, t}) - (\mu_t + \gamma_t \ln(P_{pot, t}))$
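
A minimal sketch of such a filter, assuming a constant-velocity transition for the hedge ratio ($\gamma_{t+1} = \gamma_t + \dot{\gamma}_t$), a random-walk intercept, and $\ln(P_{weed, t})$ as the measurement with design row $[1, \ln(P_{pot, t}), 0]$. The function name `kalman_hedge_ratio`, the noise levels `q` and `r`, and the initial state are illustrative assumptions, not values taken from kalman.py.

```python
import numpy as np

def kalman_hedge_ratio(log_weed, log_pot, q=1e-6, r=1e-3):
    """Filter the state [mu, gamma, gamma_dot] one bar at a time.

    q (process noise) and r (measurement noise) are assumed values.
    """
    # Transition: mu is a random walk, gamma drifts by gamma_dot each step.
    F = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0]])
    Q = q * np.eye(3)
    x = np.array([0.0, 1.0, 0.0])   # assumed initial guess
    P = np.eye(3)                   # assumed initial uncertainty
    states, residuals = [], []
    for y, lp in zip(log_weed, log_pot):
        # Predict using information up to the previous bar only.
        x = F @ x
        P = F @ P @ F.T + Q
        # Measurement row depends on the current potion log price.
        H = np.array([1.0, lp, 0.0])
        resid = y - H @ x            # one-step-ahead spread residual
        S = H @ P @ H + r            # residual variance (scalar)
        K = P @ H / S                # Kalman gain
        # Update with the current observation.
        x = x + K * resid
        P = P - np.outer(K, H) @ P
        states.append(x.copy())
        residuals.append(resid)
    return np.array(states), np.array(residuals)
```

Each residual is computed from the predicted state before the update, so it only uses data available at that bar, which is the property a backtest needs to avoid lookahead.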

2. Cost-Aware Threshold Optimization

A critical challenge in OSRS is the 1% tax on sell orders, resulting in a significant round-trip cost ($C_{tax} \approx 2\%$). Standard fixed-threshold strategies (e.g., enter at $2\sigma$) fail because they ignore the volatility regime: in low volatility, a $2\sigma$ move often fails to cover the fixed tax barrier.
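
The tax barrier can be made concrete with a small worked example. This sketch assumes the spread is measured in log-price units (so a move of 0.01 is roughly 1%) and uses hypothetical helper names `net_edge` and `breakeven_z`; the volatility values are made up for illustration.

```python
def net_edge(z_threshold, sigma_spread, round_trip_cost=0.02):
    """Expected gross move at the threshold minus the round-trip tax."""
    return z_threshold * sigma_spread - round_trip_cost

def breakeven_z(sigma_spread, round_trip_cost=0.02):
    """Smallest Z-score whose move just covers the round-trip tax."""
    return round_trip_cost / sigma_spread

# Low-volatility regime (illustrative sigma = 0.5%): a 2-sigma entry loses money.
low_vol = net_edge(2.0, 0.005)    # 0.010 - 0.020 < 0
# Higher-volatility regime (illustrative sigma = 2%): the same entry clears the tax.
high_vol = net_edge(2.0, 0.02)    # 0.040 - 0.020 > 0
```

In the low-volatility case the breakeven entry sits near $4\sigma$, which is why a fixed $2\sigma$ rule cannot be profitable there.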

This project implements a Regularized Profit Maximization algorithm. At each step, we solve for the threshold $s_0$ that maximizes the expected profit rate.

Step A: Empirical Crossing Rate

We define a grid of candidate Z-score thresholds $\mathbf{s} = [s_1, s_2, \dots, s_J]^T$. For each candidate threshold $s_j$, we calculate the raw empirical frequency of upward crossings $\bar{f}_j$ over a lookback window of $T$ observations:

$$\bar{f}_j = \frac{1}{T} \sum_{t=1}^{T-1} \mathbb{1}_{\{z_t < s_j \,\land\, z_{t+1} \ge s_j\}}$$

This counts how often the spread crosses the threshold $s_j$ from below, representing a potential trade entry.
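
A minimal sketch of this count in plain Python, assuming `z` is the list of Z-scores in the lookback window; the function name `crossing_rates` is hypothetical.

```python
def crossing_rates(z, thresholds):
    """Fraction of the window in which z crosses each threshold from below."""
    T = len(z)
    rates = []
    for s in thresholds:
        # A crossing at t means z[t] < s and z[t+1] >= s.
        count = sum(1 for a, b in zip(z[:-1], z[1:]) if a < s <= b)
        rates.append(count / T)
    return rates

z = [0.0, 1.5, 0.5, 2.5, 1.0, 2.0]
rates = crossing_rates(z, [1.0, 2.0, 3.0])
```

Here the series crosses 1.0 twice and 2.0 twice, and never reaches 3.0, so the rates are 2/6, 2/6, and 0.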

Step B: Tikhonov Regularization

Raw crossing counts $\bar{\mathbf{f}}$ are noisy and non-monotonic due to market microstructure noise. To obtain a robust estimate of trade frequency $\mathbf{f}^*$, we solve a Tikhonov regularization problem that balances fidelity to the data with smoothness:

$$ \underset{\boldsymbol{f}}{\mathrm{minimize}} \quad \Vert \boldsymbol{f} - \overline{\boldsymbol{f}} \Vert_2^2 + \lambda \Vert \boldsymbol{D} \boldsymbol{f} \Vert_2^2 $$

where $\boldsymbol{D}$ is the first-difference operator matrix and $\lambda \ge 0$ is the smoothing parameter. Setting the gradient of this convex objective to zero gives a closed-form solution:

$$\boldsymbol{f}^* = (\boldsymbol{I} + \lambda \boldsymbol{D}^T \boldsymbol{D})^{-1} \overline{\boldsymbol{f}}$$
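
The closed form can be sketched with NumPy as follows. The function name `tikhonov_smooth`, the default `lam`, and the sample frequencies are illustrative assumptions; the linear system is solved directly rather than forming the inverse.

```python
import numpy as np

def tikhonov_smooth(f_bar, lam=1.0):
    """Solve (I + lam * D^T D) f = f_bar for the smoothed frequencies."""
    f_bar = np.asarray(f_bar, dtype=float)
    J = f_bar.size
    # First-difference operator: row i is e_{i+1} - e_i, shape (J-1, J).
    D = np.diff(np.eye(J), axis=0)
    A = np.eye(J) + lam * D.T @ D
    return np.linalg.solve(A, f_bar)

f_bar = [0.10, 0.02, 0.08, 0.01, 0.03]   # made-up noisy crossing rates
f_star = tikhonov_smooth(f_bar, lam=5.0)
```

With `lam=0` the input is returned unchanged. Larger values pull neighbouring thresholds toward each other while preserving the sum of the frequencies, since every column of $\boldsymbol{D}^T \boldsymbol{D}$ sums to zero.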

Step C: Objective Function

We select the optimal threshold $s_{opt}$ by maximizing the expected PnL rate, which is the product of the smoothed trade frequency and the expected profit per trade (adjusted for volatility and tax):

$$ s_{opt} = \underset{s \in \mathbf{s}}{\mathrm{arg\,max}} \left[ f^*(s) \cdot \left( s \cdot \bar{\sigma}_{spread} - C_{tax} \right) \right] $$

If $\max(\mathbb{E}[\text{PnL}]) \le 0$, the system outputs a "No Trade" signal, effectively filtering out low-volatility regimes where the tax barrier cannot be overcome.
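
A minimal sketch of the selection rule, including the "No Trade" case. The function name `select_threshold`, the use of `None` to represent "No Trade", and the numbers in the example are assumptions made for illustration.

```python
def select_threshold(thresholds, f_star, sigma_spread, cost=0.02):
    """Return (best threshold, expected PnL rate), or (None, 0.0) for No Trade."""
    best_s, best_rate = None, 0.0
    for s, f in zip(thresholds, f_star):
        # Smoothed trade frequency times net profit per trade.
        rate = f * (s * sigma_spread - cost)
        if rate > best_rate:
            best_s, best_rate = s, rate
    return best_s, best_rate

grid = [1.0, 2.0, 3.0, 4.0]
freq = [0.10, 0.05, 0.02, 0.005]          # made-up smoothed frequencies
trade = select_threshold(grid, freq, sigma_spread=0.01)
no_trade = select_threshold(grid, freq, sigma_spread=0.002)
```

In the first call only thresholds above $2\sigma$ clear the 2% cost, and $3\sigma$ wins because it trades more often than $4\sigma$. In the second call no threshold on the grid covers the tax, so the result is the "No Trade" signal.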

(Figure 2: Visualizing entry/exit points relative to the dynamic cost-adjusted bands)

Data Pipeline

Data is sourced from the OSRS Wiki Realtime API. The src/data_collection.py module handles:

Rate Limiting: Respects API etiquette.
Alignment: Synchronizes disparate 5-minute buckets for both assets.
Forward Filling: Implements strict forward-filling for prices (last known price) while zero-filling volume to preserve market reality.
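
The alignment and filling rules above can be sketched in plain Python. The input shape (a dict from bucket timestamp in seconds to a `(price, volume)` tuple) and the names `fill_series` and `align` are hypothetical and do not reflect the Wiki API's response format or the repository's module.

```python
def fill_series(raw, timestamps):
    """Forward-fill price and zero-fill volume onto a regular grid."""
    rows, last_price = [], None
    for ts in timestamps:
        if ts in raw:
            price, volume = raw[ts]
            last_price = price
        else:
            # No trades in this bucket: carry the last known price, volume 0.
            price, volume = last_price, 0
        rows.append((ts, price, volume))
    return rows

def align(raw_a, raw_b, step=300):
    """Put both assets on one 5-minute grid spanning their combined range."""
    all_ts = sorted(set(raw_a) | set(raw_b))
    grid = list(range(all_ts[0], all_ts[-1] + step, step))
    a = fill_series(raw_a, grid)
    b = fill_series(raw_b, grid)
    # Drop leading buckets where either asset has no known price yet.
    return [(ts, pa, va, pb, vb)
            for (ts, pa, va), (_, pb, vb) in zip(a, b)
            if pa is not None and pb is not None]

weed = {0: (7000, 10), 300: (7010, 5), 900: (7050, 8)}   # made-up buckets
pot = {300: (9000, 3), 600: (9020, 4), 900: (9030, 1)}
aligned = align(weed, pot)
```

Prices are only ever carried forward, never backward, so a filled bucket cannot contain information from the future.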
