Regression Models for Slippage and Trade Analysis

Overview

This document explains the regression modeling techniques used in the Trade Simulator application for slippage estimation and maker/taker proportion prediction.

Slippage Estimation Models

Linear Regression Model

The Trade Simulator uses linear regression as the primary method for estimating slippage based on orderbook depth and trade size.

Feature Engineering

The model extracts the following features from the orderbook:

  • Bid-ask spread
  • Orderbook imbalance ratio
  • Market depth at various price levels
  • Volume-weighted price levels
  • Trade size relative to available liquidity
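
As a rough illustration of how these features could be computed from a single snapshot, here is a minimal NumPy sketch. The function name `extract_features`, the `(price, quantity)` input layout, and the exact feature definitions are assumptions made for this example, not the Trade Simulator's actual pipeline.

```python
import numpy as np

def extract_features(bids, asks, trade_size, levels=5):
    """Compute illustrative slippage features from one orderbook snapshot.

    bids/asks: sequences of (price, quantity) pairs, with bids sorted by
    descending price and asks by ascending price (assumed layout).
    """
    bids = np.asarray(bids, dtype=float)[:levels]
    asks = np.asarray(asks, dtype=float)[:levels]

    best_bid, best_ask = bids[0, 0], asks[0, 0]
    mid = 0.5 * (best_bid + best_ask)

    # Market depth: total quantity resting on each side within `levels`.
    bid_depth = bids[:, 1].sum()
    ask_depth = asks[:, 1].sum()

    spread = best_ask - best_bid
    imbalance = (bid_depth - ask_depth) / (bid_depth + ask_depth)

    # Volume-weighted average ask price, expressed relative to the mid price.
    vwap_ask = np.average(asks[:, 0], weights=asks[:, 1])
    vwap_ask_rel = (vwap_ask - mid) / mid

    # Trade size relative to ask-side liquidity (i.e. for a buy order).
    size_ratio = trade_size / ask_depth

    return np.array([spread, imbalance, ask_depth, vwap_ask_rel, size_ratio])

bids = [(99.9, 2.0), (99.8, 3.0), (99.7, 5.0)]
asks = [(100.1, 1.0), (100.2, 4.0), (100.3, 5.0)]
features = extract_features(bids, asks, trade_size=2.0)
```

A sell order would use the bid side for the depth and volume-weighted price instead; the sketch covers only the buy case to stay short.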

Training Process

The slippage model is trained through:

  1. Initial training with synthetic or historical data
  2. Online learning to adapt to changing market conditions
  3. Periodic retraining with recent market data
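
The three stages could look roughly like the sketch below. Note that scikit-learn's `LinearRegression` has no incremental update, so this example swaps in `SGDRegressor`, a linear model trained by stochastic gradient descent that supports `partial_fit`. The data generator `make_batch`, the two features, and the batch sizes are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

def make_batch(n):
    """Synthetic data (illustrative only): slippage grows with spread and size ratio."""
    X = rng.uniform(0.0, 1.0, size=(n, 2))  # columns: spread, size_ratio
    y = 0.5 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0.0, 0.01, size=n)
    return X, y

scaler = StandardScaler()
model = SGDRegressor(random_state=0)

# 1. Initial training on synthetic (or historical) data.
X0, y0 = make_batch(2000)
model.fit(scaler.fit_transform(X0), y0)

# 2. Online learning: update the existing weights on each new mini-batch.
for _ in range(20):
    Xb, yb = make_batch(50)
    model.partial_fit(scaler.transform(Xb), yb)

# 3. Periodic retraining: refit from scratch on a recent window of data.
X_recent, y_recent = make_batch(2000)
model = SGDRegressor(random_state=0).fit(scaler.fit_transform(X_recent), y_recent)

# Held-out error after the retrain.
X_test, y_test = make_batch(500)
mse = float(np.mean((model.predict(scaler.transform(X_test)) - y_test) ** 2))
```

Scaling the inputs matters here because SGD-based models are sensitive to feature magnitudes; the scaler is refit together with the model at each periodic retrain.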

Prediction Accuracy

The linear regression approach provides:

  • Low computational overhead
  • Real-time predictions (under 1 ms each)
  • Predicted slippage within 5-10% of realized slippage under normal market conditions

Quantile Regression (Alternative Approach)

For handling extreme market conditions and tail risks, a quantile regression model is also implemented.

Benefits of Quantile Regression

  • Provides prediction intervals rather than point estimates
  • More robust to outliers and extreme market movements
  • Better captures asymmetric slippage distributions

Maker/Taker Proportion Model

Logistic Regression Model

The maker/taker proportion is predicted using a logistic regression model, which estimates the probability of orders executing as maker vs. taker.

Feature Set

Features used for classification include:

  • Bid-ask spread
  • Order imbalance
  • Recent trade volume
  • Volatility
  • Time of day
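
A minimal sketch of such a classifier follows, using scikit-learn's `LogisticRegression` behind a `StandardScaler`. The synthetic feature distributions and the assumed relationship between features and maker fills are invented for the demo and carry no claim about real market behavior.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 2000

# Synthetic features (illustrative only):
# spread, order imbalance, recent volume, volatility, hour of day.
X = np.column_stack([
    rng.uniform(0.01, 0.5, n),
    rng.uniform(-1.0, 1.0, n),
    rng.lognormal(0.0, 0.5, n),
    rng.uniform(0.0, 0.05, n),
    rng.integers(0, 24, n),
])

# Assumed ground truth for the demo: wide spreads and calm markets favor maker fills.
logit = 8.0 * (X[:, 0] - 0.25) - 60.0 * (X[:, 3] - 0.025)
is_maker = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X, is_maker)

# predict_proba returns [P(taker), P(maker)] per row; averaging P(maker)
# over a batch of orders gives an expected maker proportion.
p_maker = clf.predict_proba(X)[:, 1]
maker_share = float(p_maker.mean())
```

Averaging the per-order probabilities, rather than thresholding each one first, keeps the proportion estimate smooth when many orders sit near a 50/50 split.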

Training Challenges

The model addresses several challenges:

  • Class imbalance, handled by generating synthetic data for underrepresented classes
  • Online adaptation to market regime changes
  • Feature normalization for robust predictions
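
One simple way to generate synthetic samples for an underrepresented class is to resample its rows and add a small amount of noise. The sketch below shows that idea for a binary label; the function `oversample_minority` and its `noise_scale` parameter are hypothetical, and the application may well use a different technique.

```python
import numpy as np

def oversample_minority(X, y, noise_scale=0.05, seed=0):
    """Balance a binary dataset by adding jittered copies of minority-class rows."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X, dtype=float), np.asarray(y)

    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    deficit = counts.max() - counts.min()
    if deficit == 0:
        return X, y

    X_min = X[y == minority]
    picks = rng.integers(0, len(X_min), size=deficit)

    # Jitter each resampled row with noise proportional to the per-feature spread.
    jitter = rng.normal(0.0, noise_scale, size=(deficit, X.shape[1])) * X_min.std(axis=0)
    X_new = X_min[picks] + jitter
    y_new = np.full(deficit, minority, dtype=y.dtype)

    return np.vstack([X, X_new]), np.concatenate([y, y_new])

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 3))
y = (rng.uniform(size=1000) < 0.1).astype(int)  # roughly 10% positives

X_bal, y_bal = oversample_minority(X, y)
```

Oversampling should be applied to the training split only, so that evaluation still reflects the real class frequencies.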

Performance Metrics

The maker/taker model achieves:

  • 80% accuracy in typical market conditions
  • <5ms prediction time
  • Correct prediction of dominant execution type in >90% of cases

Model Evaluation and Validation

All regression models are continuously evaluated using:

  • Mean squared error (MSE) for slippage prediction
  • Classification accuracy for maker/taker prediction
  • Computation time performance
  • Memory usage efficiency
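
These four measurements could be collected along the lines of the sketch below, which uses `mean_squared_error` and `accuracy_score` from scikit-learn and `time.perf_counter` for latency. The synthetic data, the 800/200 split, and the use of coefficient size as a memory proxy are assumptions for illustration only.

```python
import time

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import accuracy_score, mean_squared_error

rng = np.random.default_rng(0)

# Synthetic data (illustrative only): one regression and one binary target.
X = rng.normal(size=(1000, 4))
y_slip = X @ np.array([0.5, -0.2, 0.1, 0.3]) + rng.normal(0.0, 0.05, size=1000)
y_maker = (X[:, 0] + rng.normal(0.0, 0.5, size=1000) > 0).astype(int)

X_train, X_test = X[:800], X[800:]

reg = LinearRegression().fit(X_train, y_slip[:800])
clf = LogisticRegression().fit(X_train, y_maker[:800])

# Accuracy metrics on the held-out rows.
mse = mean_squared_error(y_slip[800:], reg.predict(X_test))
acc = accuracy_score(y_maker[800:], clf.predict(X_test))

# Per-prediction latency, averaged over repeated single-row calls.
row = X_test[:1]
n_calls = 200
start = time.perf_counter()
for _ in range(n_calls):
    reg.predict(row)
latency_ms = (time.perf_counter() - start) / n_calls * 1e3

# Memory held by the fitted coefficient arrays, in bytes.
coef_bytes = reg.coef_.nbytes + clf.coef_.nbytes
```

Timing single-row calls rather than one large batch is closer to how a real-time simulator would invoke the model, since per-call overhead dominates at that scale.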

Implementation Details

The models are implemented using:

  • Scikit-learn for core regression algorithms
  • NumPy for efficient numerical operations
  • Custom feature extraction pipelines for orderbook data
  • Online learning mechanisms for continuous improvement