DiffML: Differential Machine Learning Implementation

A professional PyTorch implementation of experiments from the paper "Differential ML with a Difference" by Paul Glasserman and Siddharth Hemant Karmarkar (2025). This repository demonstrates the application of differential machine learning techniques to pricing and hedging financial derivatives, with a focus on improving convergence speed and accuracy of sensitivities (Greeks).

🎯 Overview

Differential Machine Learning (DML) is a technique that leverages automatic differentiation to train neural networks not only on target values but also on their sensitivities (derivatives). This approach significantly improves:

Convergence speed: Faster training with fewer samples
Accuracy: Better approximation of both values and sensitivities
Stability: More robust Greeks calculation for risk management
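
As a rough illustration of the idea, the training objective can be thought of as a value error plus a weighted sensitivity error. The sketch below is plain Python rather than PyTorch, and the function name `differential_loss` and its signature are hypothetical; only the `lambda_delta` weight mirrors a parameter that appears later in this README.

```python
from typing import Sequence


def differential_loss(
    pred_values: Sequence[float],
    pred_deltas: Sequence[float],
    target_values: Sequence[float],
    target_deltas: Sequence[float],
    lambda_delta: float = 1.0,
) -> float:
    """Mean squared value error plus a weighted mean squared delta error."""
    n = len(target_values)
    value_mse = sum((p - t) ** 2 for p, t in zip(pred_values, target_values)) / n
    delta_mse = sum((p - t) ** 2 for p, t in zip(pred_deltas, target_deltas)) / n
    return value_mse + lambda_delta * delta_mse


# Toy target f(x) = x^2 sampled at x = 0, 1, 2, with exact derivatives 2x.
target_values = [0.0, 1.0, 4.0]
target_deltas = [0.0, 2.0, 4.0]

# A model that matches the values but predicts a flat slope everywhere.
pred_values = [0.0, 1.0, 4.0]
pred_deltas = [0.0, 0.0, 0.0]

value_only = differential_loss(pred_values, pred_deltas, target_values, target_deltas, 0.0)
with_deltas = differential_loss(pred_values, pred_deltas, target_values, target_deltas, 1.0)
print(value_only, with_deltas)
```

With `lambda_delta=0.0` the flat-slope model looks perfect; the derivative term is what penalizes its wrong sensitivities.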

This implementation covers five major experiment families from the original paper:

Digital Options: Discontinuous payoffs with likelihood ratio method (LRM)
Barrier Options: Path-dependent derivatives with knock-out features
Basket Options: High-dimensional problems with Bachelier model
Smoothing Techniques: Ramp smoothing for discontinuous payoffs
Gamma Hedging: Second-order sensitivities for portfolio management

✨ Key Features

🚀 Production-Ready Code: Full type hints, comprehensive docstrings, and extensive testing
📊 Complete Experiments: All five experiment families from the paper
🔧 Modular Architecture: Clean separation of concerns with reusable components
⚡ GPU Support: Automatic CUDA detection and optimization
📈 Double Precision: Default float64 for numerical accuracy
🧪 Extensive Testing: Unit tests, integration tests, and smoke tests
📝 Professional Documentation: Detailed docstrings and usage examples
🔄 CI/CD Pipeline: GitHub Actions for automated testing and deployment

📦 Installation

Prerequisites

Python 3.10 or higher
Poetry (for dependency management)
CUDA (optional, for GPU acceleration)

Setup

Clone the repository:

git clone https://github.com/diogoribeiro7/diffml.git
cd diffml

Install dependencies using Poetry:

poetry install

Activate the virtual environment:

poetry shell

Install pre-commit hooks (optional, for development):

pre-commit install

🚀 Quick Start

Configuration-Based Experiments (Recommended)

Run experiments using TOML configuration files:

# List available experiments
poetry run python scripts/run_experiment.py --list

# Run an experiment with configuration
poetry run python scripts/run_experiment.py --config configs/digital_default.toml

# Validate a configuration without running
poetry run python scripts/run_experiment.py --validate configs/digital_default.toml

Each configuration file feeds into diffml_article_replication.config_experiments.ExperimentConfig, and the experiment implementation is looked up via the registry in diffml_article_replication.experiments_registry. This keeps the experiment logic decoupled from the CLI so you can tweak parameters or add new experiments by dropping an additional TOML file and registering a function.
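
The registry pattern described above might look roughly like the following sketch. Every name here (`ToyExperimentConfig`, `EXPERIMENTS`, `register`, `run_from_config`) is a hypothetical stand-in, not the actual API of diffml_article_replication, and a plain dict takes the place of a parsed TOML file.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class ToyExperimentConfig:
    """Hypothetical stand-in for a parsed configuration file."""
    name: str
    params: Dict[str, Any] = field(default_factory=dict)


# Maps an experiment name to the function that runs it.
EXPERIMENTS: Dict[str, Callable[[ToyExperimentConfig], Dict[str, Any]]] = {}


def register(name: str):
    """Decorator that records an experiment function under a name."""
    def decorator(fn):
        EXPERIMENTS[name] = fn
        return fn
    return decorator


@register("digital")
def run_digital(config: ToyExperimentConfig) -> Dict[str, Any]:
    # Placeholder body: a real experiment would build data and train a model.
    return {
        "experiment": config.name,
        "lambda_delta": config.params.get("lambda_delta", 1.0),
    }


def run_from_config(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Look up the experiment named in a parsed config and run it."""
    config = ToyExperimentConfig(name=raw["experiment"], params=raw.get("params", {}))
    if config.name not in EXPERIMENTS:
        raise KeyError(f"Unknown experiment: {config.name!r}")
    return EXPERIMENTS[config.name](config)


result = run_from_config({"experiment": "digital", "params": {"lambda_delta": 0.5}})
print(result)
```

Because the CLI only touches the lookup function, adding an experiment in this style means writing one decorated function and one configuration file.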

Available configurations:

configs/digital_default.toml - Digital option experiment
configs/barrier_default.toml - Barrier option experiment
configs/basket_digital_default.toml - Basket option experiment
configs/asian_default.toml - Asian option experiment
configs/smoothing_default.toml - Smoothing experiment

Run All Experiments

Execute all experiments from the paper:

poetry run python scripts/run_all_experiments.py

Benchmark & Sensitivity Analysis

Generate benchmark tables plus a lambda_delta sweep for the digital experiment:

poetry run python scripts/run_benchmark_digital.py --seeds 0 1 --lambda-delta-values 0.0 0.5 1.0

Path-Dependent Experiments

Run the arithmetic Asian and fixed-strike lookback experiments with a single CLI:

# Run both experiments
poetry run python scripts/run_path_dependent_experiments.py

# Only run the Asian setup
poetry run python scripts/run_path_dependent_experiments.py --experiment asian

Run Individual Experiments (Programmatic)

# Digital options with discontinuous payoffs
from diffml.experiments_digital import run_digital_experiment
run_digital_experiment()

# Barrier options (down-and-out calls)
from diffml.experiments_barrier import run_barrier_experiment
run_barrier_experiment()

# High-dimensional basket options
from diffml.experiments_basket import run_basket_digital_experiment
run_basket_digital_experiment()

# Smoothing technique comparison
from diffml.experiments_smoothing import run_smoothing_experiment
run_smoothing_experiment()

# Portfolio gamma hedging
from diffml.experiments_gamma import run_gamma_experiment
run_gamma_experiment()

Custom Usage Example

import torch
from torch.utils.data import TensorDataset
from diffml.config import BSParams, TrainingConfig
from diffml.networks import PricingNet
from diffml.datasets_digital import make_digital_dataset
from diffml.training import train_model

# Configure Black-Scholes parameters
params = BSParams(r=0.05, sigma=0.2, T=0.25)

# Generate dataset
x_train, price_train, delta_pw, delta_lrm = make_digital_dataset(
    m=1000,          # Number of samples
    K=100.0,         # Strike price
    params=params,
    n_paths_per_x=100
)

# Create neural network
model = PricingNet(input_dim=1, hidden_dim=20, n_hidden=4)

# Configure training
config = TrainingConfig(
    n_epochs=1000,
    batch_size=256,
    lr_initial=1e-3,
    lambda_delta=1.0  # Weight for delta regularization
)

# Train with differential ML
model = train_model(
    model=model,
    dataset=TensorDataset(x_train, price_train, delta_pw, delta_lrm),
    config=config,
    mode="delta_lrm"
)

Unified Simulator API Example

import torch
from diffml.config import BSParams, TrainingConfig
from diffml_article_replication.api import diffml_price
from diffml_article_replication.simulator_api import DigitalCallSimulator

simulator = DigitalCallSimulator(strike=1.0, params=BSParams(r=0.0, sigma=0.2, T=1.0 / 3.0))
x_train = torch.linspace(0.5, 1.5, 64).reshape(-1, 1)
x_test = torch.tensor([[0.9], [1.0], [1.1]])

prices, deltas, _ = diffml_price(
    simulator=simulator,
    x_train=x_train,
    x_test=x_test,
    mode="delta_lrm",
    training_config=TrainingConfig(n_epochs=200, batch_size=64, lr_initial=1e-3),
    lambda_delta=1.0,
    seed=0,
)
print("Prices:", prices.squeeze(-1))
print("Deltas:", deltas.squeeze(-1))

📁 Project Structure

diffml/
├── src/diffml/
│   ├── __init__.py
│   ├── config.py                 # Configuration classes and utilities
│   ├── networks.py               # Neural network architectures
│   ├── bs_analytics.py          # Black-Scholes analytical formulas
│   ├── simulation.py             # Monte Carlo simulation engine
│   ├── losses.py                 # DML loss functions
│   ├── training.py               # Training loops and utilities
│   ├── datasets_digital.py      # Digital option dataset generation
│   ├── datasets_barrier.py      # Barrier option dataset generation
│   ├── datasets_basket.py       # Basket option dataset generation
│   ├── datasets_smoothing.py    # Smoothed payoff dataset generation
│   ├── datasets_gamma_portfolio.py  # Portfolio gamma dataset
│   ├── experiments_digital.py   # Digital option experiments
│   ├── experiments_barrier.py   # Barrier option experiments
│   ├── experiments_basket.py    # Basket option experiments
│   ├── experiments_smoothing.py # Smoothing technique experiments
│   └── experiments_gamma.py     # Gamma hedging experiments
├── tests/                        # Comprehensive test suite
│   ├── test_bs_analytics.py
│   ├── test_simulation_and_datasets.py
│   ├── test_training_loop.py
│   └── test_experiments_smoke.py
├── scripts/
│   └── run_all_experiments.py   # Main experiment runner
├── .github/
│   ├── workflows/
│   │   ├── ci.yml               # Continuous integration
│   │   └── release.yml          # PyPI release automation
│   └── dependabot.yml           # Dependency updates
├── pyproject.toml                # Poetry configuration
├── README.md                     # This file
├── CONTRIBUTING.md               # Contribution guidelines
├── CHANGELOG.md                  # Version history
├── CITATION.cff                 # Citation information
└── LICENSE                       # MIT License

🧪 Experiments

1. Digital Options

Demonstrates DML on options with discontinuous payoffs:

Challenge: Pathwise derivatives are zero almost everywhere
Solution: Likelihood Ratio Method (LRM) for sensitivity estimation
Results: Significant improvement in delta accuracy
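
To make the challenge concrete, here is a minimal stdlib-only sketch of a likelihood ratio delta estimator for a cash-or-nothing digital call under Black-Scholes, checked against the closed-form delta. The function names are illustrative and are not the package's own; the parameter values reuse those from the Custom Usage Example.

```python
import math
import random


def digital_delta_analytic(s0, k, r, sigma, t):
    """Closed-form Black-Scholes delta of a cash-or-nothing digital call."""
    d2 = (math.log(s0 / k) + (r - 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    pdf = math.exp(-0.5 * d2**2) / math.sqrt(2.0 * math.pi)
    return math.exp(-r * t) * pdf / (s0 * sigma * math.sqrt(t))


def digital_delta_lrm(s0, k, r, sigma, t, n_paths, seed=0):
    """LRM estimate: discounted payoff times the score Z / (S0 * sigma * sqrt(T))."""
    rng = random.Random(seed)
    vol = sigma * math.sqrt(t)
    drift = (r - 0.5 * sigma**2) * t
    disc = math.exp(-r * t)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp(drift + vol * z)
        # The payoff is a step function, so its pathwise derivative in s0
        # is zero almost everywhere; the score term carries the sensitivity.
        payoff = 1.0 if s_t > k else 0.0
        total += disc * payoff * z / (s0 * vol)
    return total / n_paths


analytic = digital_delta_analytic(100.0, 100.0, 0.05, 0.2, 0.25)
estimate = digital_delta_lrm(100.0, 100.0, 0.05, 0.2, 0.25, n_paths=200_000)
print(f"analytic={analytic:.5f} lrm={estimate:.5f}")
```

The estimator never differentiates the payoff, which is why it remains usable when the payoff is discontinuous.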

2. Barrier Options

Down-and-out call options with path-dependent features:

Challenge: Path dependency and barrier conditions
Solution: Two-step simulation with pathwise and LRM methods
Results: Better price and delta estimation near barriers

3. Basket Options

High-dimensional basket digital options (20 assets):

Model: Bachelier model for multi-asset dynamics
Challenge: Curse of dimensionality
Results: DML scales well to high dimensions
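
For orientation, the sketch below prices a digital option on the average of 20 assets under Bachelier (arithmetic Brownian) dynamics by Monte Carlo and compares it with the closed form. It assumes independent assets with a common absolute volatility and a zero interest rate; those simplifications, the parameter values, and all function names are assumptions made for illustration and may differ from the experiment in the paper.

```python
import math
import random


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def basket_digital_mc(s0, k, sigma, t, n_paths, seed=0):
    """Monte Carlo price of 1{mean(S_T) > K} with S_T = S_0 + sigma * sqrt(T) * Z."""
    rng = random.Random(seed)
    d = len(s0)
    vol = sigma * math.sqrt(t)
    hits = 0
    for _ in range(n_paths):
        basket = sum(s + vol * rng.gauss(0.0, 1.0) for s in s0) / d
        hits += basket > k
    return hits / n_paths


def basket_digital_exact(s0, k, sigma, t):
    """Closed form: the basket average is normal with std sigma * sqrt(T / d)."""
    d = len(s0)
    mean = sum(s0) / d
    std = sigma * math.sqrt(t / d)
    return normal_cdf((mean - k) / std)


spots = [100.0] * 20
exact = basket_digital_exact(spots, k=101.0, sigma=20.0, t=0.25)
mc = basket_digital_mc(spots, k=101.0, sigma=20.0, t=0.25, n_paths=20_000)
print(f"exact={exact:.4f} mc={mc:.4f}")
```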

4. Smoothing Techniques

Comparison of different smoothing parameters:

Method: Ramp smoothing with varying epsilon
Trade-off: Smoothness vs. accuracy
Results: Optimal epsilon around 0.5-1.0
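
One common form of ramp smoothing replaces the step payoff with a piecewise-linear function over a band of half-width epsilon around the strike. The sketch below assumes a band centered on the strike; the exact ramp used in the paper and in this package may be defined differently, and the function names are illustrative.

```python
def ramp(s_t: float, strike: float, eps: float) -> float:
    """Linear surrogate for 1{s_t > strike}, rising from 0 to 1 on [strike - eps, strike + eps]."""
    return min(1.0, max(0.0, (s_t - strike + eps) / (2.0 * eps)))


def ramp_slope(s_t: float, strike: float, eps: float) -> float:
    """Derivative of the ramp in s_t: 1 / (2 * eps) inside the band, 0 outside."""
    return 1.0 / (2.0 * eps) if abs(s_t - strike) < eps else 0.0


for eps in (0.5, 1.0):
    print(eps, ramp(100.25, 100.0, eps), ramp_slope(100.25, 100.0, eps))
```

A smaller epsilon keeps the smoothed payoff closer to the true digital payoff but makes the slope inside the band larger, so pathwise derivative labels become noisier; this is the smoothness versus accuracy trade-off noted above.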

5. Gamma Portfolio

Second-order sensitivities for portfolio hedging:

Portfolio: Butterfly spread (3 strikes)
Method: Combined pathwise-LRM (PW-LR) for gamma
Results: Improved gamma estimates for dynamic hedging
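
As a reference point for what the gamma labels approximate, the gamma of a butterfly built from vanilla calls is the corresponding combination of Black-Scholes call gammas. The sketch below is a stand-alone re-derivation with illustrative names and strikes (90, 100, 110); it is not the package's `bs_analytics` code, and it checks the closed form against a central finite difference of the portfolio price.

```python
import math


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1(s, k, r, sigma, t):
    return (math.log(s / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))


def call_price(s, k, r, sigma, t):
    """Black-Scholes price of a European call."""
    a = d1(s, k, r, sigma, t)
    b = a - sigma * math.sqrt(t)
    return s * normal_cdf(a) - k * math.exp(-r * t) * normal_cdf(b)


def call_gamma(s, k, r, sigma, t):
    """Black-Scholes gamma of a European call."""
    a = d1(s, k, r, sigma, t)
    pdf = math.exp(-0.5 * a**2) / math.sqrt(2.0 * math.pi)
    return pdf / (s * sigma * math.sqrt(t))


def butterfly(fn, s, strikes, r, sigma, t):
    """Long one low-strike call, short two mid-strike calls, long one high-strike call."""
    k_low, k_mid, k_high = strikes
    return fn(s, k_low, r, sigma, t) - 2.0 * fn(s, k_mid, r, sigma, t) + fn(s, k_high, r, sigma, t)


strikes = (90.0, 100.0, 110.0)
args = (strikes, 0.05, 0.2, 0.25)
gamma_exact = butterfly(call_gamma, 100.0, *args)

# Central second difference of the portfolio price as an independent check.
h = 0.01
gamma_fd = (
    butterfly(call_price, 100.0 + h, *args)
    - 2.0 * butterfly(call_price, 100.0, *args)
    + butterfly(call_price, 100.0 - h, *args)
) / h**2
print(f"closed form={gamma_exact:.6f} finite difference={gamma_fd:.6f}")
```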

📖 API Documentation

Core Modules

`config.py`

Configuration classes for experiments:

BSParams: Black-Scholes parameters (r, σ, T)
TrainingConfig: Training hyperparameters
get_device(): Automatic CPU/CUDA device selection
set_default_dtype(): Set PyTorch to float64

`networks.py`

Neural network architectures:

PricingNet: Feedforward network with Softplus activation
- Configurable depth and width
- Automatic gradient computation support
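
A smooth activation matters here because delta and gamma are first and second derivatives of the network output with respect to its input. The toy below hand-derives both for a one-hidden-layer Softplus network in plain Python and checks them against finite differences. It is only a sketch with made-up weights: the package itself relies on PyTorch automatic differentiation, and `PricingNet` is deeper than this.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return math.log1p(math.exp(-abs(z))) + max(z, 0.0)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def value_delta_gamma(x, w, b, v, c):
    """f(x) = c + sum_j v_j * softplus(w_j * x + b_j) and its first two derivatives in x."""
    value, delta, gamma = c, 0.0, 0.0
    for w_j, b_j, v_j in zip(w, b, v):
        z = w_j * x + b_j
        s = sigmoid(z)                            # derivative of softplus
        value += v_j * softplus(z)
        delta += v_j * w_j * s
        gamma += v_j * w_j**2 * s * (1.0 - s)     # derivative of sigmoid is s * (1 - s)
    return value, delta, gamma


# Arbitrary illustrative weights for three hidden units.
w, b, v, c = [1.0, -2.0, 0.5], [0.1, 0.2, -0.3], [0.7, -0.4, 1.2], 0.05
x, h = 0.3, 1e-4

value, delta, gamma = value_delta_gamma(x, w, b, v, c)
up = value_delta_gamma(x + h, w, b, v, c)[0]
down = value_delta_gamma(x - h, w, b, v, c)[0]
delta_fd = (up - down) / (2.0 * h)
gamma_fd = (up - 2.0 * value + down) / h**2
print(delta, delta_fd, gamma, gamma_fd)
```

With a ReLU activation the second derivative would be zero almost everywhere, so a gamma term in the loss would carry no training signal.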

`bs_analytics.py`

Analytical Black-Scholes formulas:

bs_digital_price(): Digital option pricing
bs_digital_delta(): Digital option delta
bs_call_price(): Vanilla call pricing
bs_call_gamma(): Vanilla call gamma

`simulation.py`

Monte Carlo simulation:

simulate_bs_terminal(): Single-step Black-Scholes simulation
simulate_bs_two_step(): Two-step simulation for barriers

`training.py`

Training utilities:

train_model(): Generic training loop with multiple modes
- "standard": Price only
- "delta_pathwise": Price + pathwise delta
- "delta_lrm": Price + LRM delta
- "gamma_pwlr": Price + delta + gamma
nn_value_delta_gamma(): Compute NN outputs and derivatives
rmse(): Root mean squared error metric

`losses.py`

Loss functions for DML:

dml_loss(): Combined loss with price, delta, and gamma terms
DifferentialLoss: Modular loss class
AdaptiveDifferentialLoss: Dynamic weight adjustment
HuberDifferentialLoss: Robust to outliers

Dataset Generators

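Each dataset function returns a tuple of features, prices, and delta labels, plus gamma labels where applicable (for example, make_digital_dataset() returns both pathwise and LRM deltas):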

make_digital_dataset(): Digital options with LRM deltas
make_barrier_dataset(): Down-and-out calls with two-step simulation
make_basket_digital_dataset(): Multi-dimensional Bachelier model
make_smoothed_digital_dataset(): Ramp-smoothed payoffs
make_portfolio_gamma_dataset(): Portfolio with analytical gammas

🛠️ Development

Testing

Run the test suite:

# All tests
poetry run pytest

# With coverage report
poetry run pytest --cov=diffml

# Quick tests only (skip slow ones)
poetry run pytest -m "not slow"

# Specific test file
poetry run pytest tests/test_bs_analytics.py

Code Quality

# Type checking
poetry run mypy src tests

# Linting
poetry run ruff check .

# Formatting
poetry run black src tests

# Pre-commit hooks
pre-commit run --all-files

Building Documentation

# Generate API documentation
poetry run pdoc --html --output-dir docs src/diffml

📊 Results

Experiments demonstrate that Differential ML consistently:

Reduces training time by 5-10x compared to standard ML
Improves Greek accuracy by an order of magnitude
Generalizes better to out-of-sample data
Scales efficiently to high-dimensional problems

Example results for digital options:

Model	Price RMSE	Delta RMSE
Standard ML	0.0234	0.1823
Pathwise DML	0.0198	0.1654
LRM DML	0.0156	0.0421

📝 Citation

If you use this code in your research, please cite:

@software{ribeiro2024diffml,
  title = {DiffML: Differential Machine Learning Implementation},
  author = {Ribeiro, Diogo},
  year = {2024},
  url = {https://github.com/diogoribeiro7/diffml},
  version = {0.1.0}
}

@article{glasserman2025differential,
  title = {Differential ML with a Difference},
  author = {Glasserman, Paul and Karmarkar, Siddharth Hemant},
  journal = {arXiv preprint},
  year = {2025},
  doi = {10.48550/arXiv.2512.0530},
  url = {https://doi.org/10.48550/arXiv.2512.0530}
}

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Quick Contribution Guide

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Make your changes
Run tests (poetry run pytest)
Commit with descriptive message
Push to your fork
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Paul Glasserman and Siddharth Hemant Karmarkar for the original research
PyTorch team for the excellent deep learning framework
The quantitative finance and machine learning communities

📧 Contact

Diogo Ribeiro
Email: dfr@esmad.ipp.pt
GitHub: @diogoribeiro7
ORCID: 0009-0001-2022-7072

This implementation is for educational and research purposes. Use in production systems should be thoroughly validated.
