Dueling DDPG for American Put Hedging Under Transaction Costs

A PyTorch implementation of a Dueling Deep Deterministic Policy Gradient (Dueling-DDPG) agent for discrete-time hedging of an American put option, benchmarked against a classical delta-hedging baseline.

Project timeline: Oct 2024 – Oct 2025 (research + implementation).
Repository: this repo is a cleaned, publishable snapshot of the final project.

Why this matters

Hedging under real transaction costs is an adaptive control problem:

- Delta hedging can over-trade under frictions, because every rebalance incurs a cost.
- An RL policy can learn to adapt its rebalancing to market dynamics and costs.

This repo focuses on measurable outcomes:

- Expected P&L
- Risk (P&L variance / dispersion)
- Stability under trading frictions

Problem setup

- Pathwise simulation of the underlying
- American put option
- 365 hedging steps (discrete time)
- Continuous action space (bounded delta)
- Proportional trading cost: 0% and 3%
- Baseline: delta hedge from the option pricer
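
The setup above can be sketched in a few lines of standard-library Python. This is only an illustration, not the code in src/: the geometric Brownian motion dynamics, the one-year horizon, the drift and volatility values, the position bounds, and the helper names (simulate_path, clip_position, trading_cost) are all assumptions. Only the 365 steps and the 0% / 3% cost rates come from this README.

```python
import math
import random

N_STEPS = 365              # hedging steps, as stated in the setup
COST_RATES = (0.0, 0.03)   # proportional trading costs, as stated in the setup


def simulate_path(s0, drift, vol, horizon, n_steps, rng):
    """Simulate one price path (assumed geometric Brownian motion)."""
    dt = horizon / n_steps
    prices = [s0]
    for _ in range(n_steps):
        shock = rng.gauss(0.0, 1.0)
        growth = (drift - 0.5 * vol ** 2) * dt + vol * math.sqrt(dt) * shock
        prices.append(prices[-1] * math.exp(growth))
    return prices


def clip_position(raw, low, high):
    """Keep the hedge position inside the bounded action range."""
    return min(high, max(low, raw))


def trading_cost(old_position, new_position, price, cost_rate):
    """Proportional cost: rate times the traded notional."""
    return cost_rate * abs(new_position - old_position) * price


rng = random.Random(42)
# Hypothetical market parameters, chosen only for illustration.
path = simulate_path(s0=100.0, drift=0.05, vol=0.2, horizon=1.0,
                     n_steps=N_STEPS, rng=rng)
position = clip_position(-1.3, low=-1.0, high=0.0)  # out-of-range action is clipped
for rate in COST_RATES:
    print(rate, trading_cost(0.0, position, path[0], rate))
```

Because the cost scales with the size of each position change, a strategy that rebalances by a small amount at every one of the 365 steps pays it 365 times, which is the friction the agent is meant to learn around.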

Training & evaluation protocol

- Training uses exploration noise.
- Evaluation is noise-free (deterministic policy) for apples-to-apples comparison.
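
The difference between the two modes can be illustrated with a single action-selection helper. This is a sketch under assumptions: the README does not say which noise process is used, so Gaussian noise is swapped in here for simplicity, and select_action, toy_policy, and the action bounds are hypothetical names and values rather than the repository's API.

```python
import random


def select_action(policy, state, low, high, noise_std=0.0, rng=None):
    """Return the policy's action, perturbed only when noise_std > 0."""
    action = policy(state)
    if noise_std > 0.0:
        # Training mode: add (assumed Gaussian) exploration noise.
        action += (rng or random).gauss(0.0, noise_std)
    # Both modes respect the bounded action space.
    return min(high, max(low, action))


def toy_policy(state):
    """Hypothetical stand-in for the trained actor network."""
    moneyness, _time_left = state
    return -min(1.0, max(0.0, 1.5 - moneyness))


state = (0.9, 0.5)
rng = random.Random(0)
train_action = select_action(toy_policy, state, -1.0, 0.0, noise_std=0.2, rng=rng)
eval_action = select_action(toy_policy, state, -1.0, 0.0)  # noise-free
print(train_action, eval_action)
```

With noise_std left at zero the helper is a pure function of the state, so the RL policy and the delta baseline can be compared on identical simulated paths.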

We track:

- Rolling training P&L (last 100 episodes)
- Periodic evaluation every 500 episodes
- Final Monte Carlo evaluation (1000 simulations)
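
One way to keep these metrics during training is sketched below. The window of 100 episodes and the 500-episode evaluation interval come from the list above; the function name track_training and the evaluate callback are hypothetical, and the actual bookkeeping in src/ may differ.

```python
from collections import deque
from statistics import fmean

ROLLING_WINDOW = 100   # last 100 training episodes
EVAL_EVERY = 500       # evaluation checkpoint interval


def track_training(episode_pnls, evaluate):
    """Return the rolling mean P&L per episode and the evaluation checkpoints.

    evaluate is a hypothetical callback that runs a noise-free evaluation
    and returns a (mean, std) pair.
    """
    window = deque(maxlen=ROLLING_WINDOW)  # old episodes drop out automatically
    rolling = []
    checkpoints = []
    for episode, pnl in enumerate(episode_pnls, start=1):
        window.append(pnl)
        rolling.append(fmean(window))
        if episode % EVAL_EVERY == 0:
            checkpoints.append((episode, evaluate()))
    return rolling, checkpoints


# Toy usage with made-up P&L values and a dummy evaluation callback.
rolling, checkpoints = track_training([float(i) for i in range(1000)],
                                      evaluate=lambda: (0.0, 1.0))
print(rolling[-1], [episode for episode, _ in checkpoints])
```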

The policy is considered stabilized once the evaluation mean and standard deviation both plateau across successive evaluation checkpoints.
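
A plateau check of this kind could look like the sketch below. The README does not state how many checkpoints or what tolerance were used, so the three-checkpoint window, the 5% relative tolerance, and the function name has_plateaued are assumptions chosen for illustration.

```python
def has_plateaued(checkpoints, window=3, rel_tol=0.05):
    """Check whether evaluation mean and std are both flat over recent checkpoints.

    checkpoints is a list of (mean, std) pairs, oldest first.
    window and rel_tol are assumed values, not taken from the project.
    """
    if len(checkpoints) < window:
        return False
    recent = checkpoints[-window:]
    for index in (0, 1):  # 0 = mean, 1 = standard deviation
        values = [checkpoint[index] for checkpoint in recent]
        spread = max(values) - min(values)
        scale = max(abs(v) for v in values) or 1.0  # avoid a zero scale
        if spread > rel_tol * scale:
            return False
    return True


# Made-up checkpoint history: early improvement, then a flat tail.
history = [(1.0, 2.0), (3.0, 1.8), (3.10, 1.70), (3.12, 1.68), (3.11, 1.69)]
print(has_plateaued(history))
```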

Results (final evaluation)

Trading cost = 0%

- RL Mean: 4.4491 | Delta Mean: 4.4356
- RL Variance: 2.4968 | Delta Variance: 4.4696
- Variance reduction: ~44.1%

Interpretation:

- RL achieves comparable expected P&L (4.4491 vs. 4.4356).
- RL reduces P&L variance by about 44% relative to the delta hedge.

Trading cost = 3%

- RL Mean: 3.1523 | Delta Mean: 0.2610 (+2.8913 improvement)
- RL Variance: 2.8043 | Delta Variance: 5.7982
- Variance reduction: ~51.6%

Interpretation:

- RL outperforms the delta hedge in mean P&L by 2.8913.
- RL reduces P&L variance by about 52% relative to the delta hedge.
- The delta hedge over-trades under friction, while the RL policy adapts its rebalancing frequency.
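
The summary figures for both cost regimes can be re-derived from the reported means and variances. The snippet below is only a sanity check on the numbers printed in this README; the helper name variance_reduction and the dictionary layout are illustrative and do not refer to code in src/.

```python
def variance_reduction(rl_variance, delta_variance):
    """Fraction of the delta-hedge variance removed by the RL policy."""
    return 1.0 - rl_variance / delta_variance


# Figures copied from the final evaluation reported in this README.
results = {
    "0%": {"rl_mean": 4.4491, "delta_mean": 4.4356,
           "rl_var": 2.4968, "delta_var": 4.4696},
    "3%": {"rl_mean": 3.1523, "delta_mean": 0.2610,
           "rl_var": 2.8043, "delta_var": 5.7982},
}

summary = {}
for cost, r in results.items():
    summary[cost] = {
        "mean_gain": round(r["rl_mean"] - r["delta_mean"], 4),
        "var_reduction_pct": round(
            100 * variance_reduction(r["rl_var"], r["delta_var"]), 1),
    }
    print(cost, summary[cost])
# 0%: mean gain 0.0135, variance reduction 44.1%
# 3%: mean gain 2.8913, variance reduction 51.6%
```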

Visual results

P&L distribution (0%)

P&L distribution (3%)

Training loss (0%)

Training loss (3%)

Repository structure

- src/: training + evaluation entry point (main.py) and core implementation
- assets/: plots shown in this README
- requirements.txt: Python dependencies (see Reproducibility notes)

Quickstart

1) Create environment

Windows (PowerShell):

python -m venv .venv
.\.venv\Scripts\activate
pip install -r requirements.txt

macOS/Linux:

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

2) Run

cd src
python main.py

Reproducibility notes

If you’re on CPU-only or a different CUDA version, you may need to install PyTorch separately (per the official PyTorch install selector) and then install the remaining packages.
For stricter reproducibility, consider keeping:
- requirements.txt (minimal runtime deps)
- requirements-lock.txt (fully pinned pip freeze)

License

See LICENSE.

Citation

If you reference this work, please cite this repository and/or link to it.

Author: Neal (Zhiheng) Song
