🎉 Our paper has been accepted at ICLR 2026! 🎉
- **Install uv**: download from https://github.com/astral-sh/uv or install via:

  ```shell
  curl -LsSf https://astral.sh/uv/install.sh | sh
  ```

- **Create a virtual environment**:

  ```shell
  uv venv
  ```

- **Install dependencies**:

  ```shell
  uv sync
  ```

- **Make the experiment scripts executable**:

  ```shell
  chmod +x run_atari_paper.sh run_procgen_paper.sh
  ```
- **Wandb tracking**: replace `your-entity` in all config files with your wandb entity name:

  ```shell
  find config -name "*.yaml" -exec sed -i 's/your-entity/YOUR_ENTITY_NAME/g' {} +
  ```

  Replace `YOUR_ENTITY_NAME` with your actual wandb entity.
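To confirm the replacement worked, a quick check can help; this is a minimal sketch that assumes the `config/` directory layout used above:

```shell
# Count config files still containing the "your-entity" placeholder
# (assumes the config/ directory from this repo; tr strips wc padding).
leftover=$(grep -rl "your-entity" config 2>/dev/null | wc -l | tr -d ' ')
if [ "$leftover" -eq 0 ]; then
  echo "all configs updated"
else
  echo "$leftover file(s) still contain your-entity"
fi
```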
- **Atari experiments**:

  ```shell
  ./run_atari_paper.sh
  ```

- **Procgen experiments**:

  ```shell
  ./run_procgen_paper.sh
  ```
The `config/atari_paper` directory contains configs for the Atari benchmark with DDQN:

- `hyperpp.yaml` - HYPER++ (ours): Hyperboloid model with RMSNorm, learned scaling, and categorical loss
- `hyper_paper.yaml` - Hyper+S-RYM: Poincaré Ball with SpectralNorm and 1/√d scaling
- `euclidean.yaml` - Euclidean baseline: standard Euclidean representations
To run a specific config:

```shell
uv run run_ddqn.py -cd=config/atari_paper -cn=hyperpp experiment.gpu=0 env_id="NameThisGameNoFrameskip-v4"
```

Atari-5 environments: `NameThisGameNoFrameskip-v4`, `PhoenixNoFrameskip-v4`, `BattleZoneNoFrameskip-v4`, `QbertNoFrameskip-v4`, `DoubleDunkNoFrameskip-v4`
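The command above can be looped over all five Atari-5 environments; a minimal sketch (the `echo` prints each command instead of launching it; drop it to actually run, and adjust the GPU id as needed):

```shell
# Queue one DDQN run per Atari-5 environment (GPU id is illustrative).
ATARI5="NameThisGameNoFrameskip-v4 PhoenixNoFrameskip-v4 BattleZoneNoFrameskip-v4 QbertNoFrameskip-v4 DoubleDunkNoFrameskip-v4"
for env in $ATARI5; do
  echo uv run run_ddqn.py -cd=config/atari_paper -cn=hyperpp experiment.gpu=0 env_id="$env"
done
```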
The `config/procgen_paper` directory contains configs for all 16 Procgen environments with PPO:

- `hyperpp.yaml` - HYPER++ (ours)
- `hyper_paper.yaml` - Hyper+S-RYM
- `euclidean_baseline.yaml` - Euclidean baseline
To run a specific config:

```shell
uv run run_ppo.py -cd=config/procgen_paper -cn=hyperpp experiment.gpu=0 env_id=bigfish
```

Available Procgen environments: bigfish, bossfight, caveflyer, chaser, climber, coinrun, dodgeball, fruitbot, heist, jumper, leaper, maze, miner, ninja, plunder, starpilot
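To compare the three configs on a single environment, the same command can be looped; a sketch under the paths above (`echo` prints the commands rather than launching them):

```shell
# One PPO run per config on the same environment, for a direct comparison.
CONFIGS="hyperpp hyper_paper euclidean_baseline"
for cfg in $CONFIGS; do
  echo uv run run_ppo.py -cd=config/procgen_paper -cn="$cfg" experiment.gpu=0 env_id=bigfish
done
```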
The `config/ppo_ablations` directory contains ablation studies for individual HYPER++ components:

- `procgen_hyperpp.yaml` - full HYPER++ (baseline for ablations)
- `procgen_hyperpp_no_rms.yaml` - removing RMSNorm
- `procgen_hyperpp_noscale.yaml` - removing learned scaling
- `procgen_hyperpp_nohlgauss.yaml` - using MSE instead of categorical loss
- `procgen_hyperpp_poincare.yaml` - using Poincaré Ball instead of Hyperboloid
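Running every ablation on one environment can be scripted; a minimal sketch (`echo` prints the commands; drop it to launch, and pick any environment from the list above):

```shell
# Sweep all ablation configs on a single Procgen environment.
ABLATIONS="procgen_hyperpp procgen_hyperpp_no_rms procgen_hyperpp_noscale procgen_hyperpp_nohlgauss procgen_hyperpp_poincare"
for cfg in $ABLATIONS; do
  echo uv run run_ppo.py -cd=config/ppo_ablations -cn="$cfg" experiment.gpu=0 env_id=bigfish
done
```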
```shell
uv run run_ppo.py -cd=config/ppo_ablations -cn=procgen_hyperpp_no_rms experiment.gpu=0 env_id=bigfish
```

Common parameters you might want to adjust:
Environment & GPU:

- `env_id=<name>` - change environment (see lists above)
- `experiment.gpu=<id>` - GPU device ID (e.g., `0`, `1`, `2`)
- `experiment.track=<bool>` - activate/deactivate wandb tracking
Training:

- `num_envs=<int>` - number of parallel environments (default: 64 for Procgen)
- `total_timesteps=<int>` - training duration (default: 25M for Procgen, 10M for Atari)
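The overrides above combine freely on one command line; a sketch of a short debug run (the parameter values here are illustrative, not the paper's settings; `echo` prints the command rather than launching it):

```shell
# Short Procgen debug run: fewer envs, 1M steps, wandb tracking off.
cmd="uv run run_ppo.py -cd=config/procgen_paper -cn=hyperpp experiment.gpu=0 env_id=bigfish num_envs=8 total_timesteps=1000000 experiment.track=false"
echo "$cmd"
```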
If you use our code for your research, please cite our paper:
```bibtex
@article{klein2025hyperrl,
  title={Understanding and Improving Hyperbolic Deep Reinforcement Learning},
  author={Klein, Timo and Lang, Thomas and Shkabrii, Andrii and Sturm, Alexander and Sidak, Kevin and Miklautz, Lukas and Velaj, Yllka and Plant, Claudia and Tschiatschek, Sebastian},
  journal={arXiv preprint arXiv:2512.14202},
  year={2025}
}
```