A Hybrid PPO (HPPO) implementation built on stable-baselines3 and benchmarked on the standard Gymnasium-Hybrid environments.
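
For readers new to hybrid action spaces, the sketch below shows what "hybrid" means in practice: each action pairs a discrete choice with a continuous parameter vector, so a policy needs two heads. It is a minimal illustration in plain Python, not this repo's implementation. The function name `sample_hybrid_action`, the three discrete actions, and the two continuous parameters are all assumptions made for the example.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hybrid_action(logits, means, log_stds, rng):
    """Sample a (discrete id, continuous params) pair from two policy heads.

    Hypothetical helper for illustration only.
    logits: scores from the discrete head, one per discrete action.
    means / log_stds: diagonal Gaussian parameters from the continuous head.
    Returns the action plus the log-probability of each head separately.
    """
    # Discrete head: categorical sample over the action ids.
    probs = softmax(logits)
    action_id = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    disc_logp = math.log(probs[action_id])

    # Continuous head: one Gaussian sample per parameter dimension.
    params = []
    cont_logp = 0.0
    for mu, log_std in zip(means, log_stds):
        std = math.exp(log_std)
        x = rng.gauss(mu, std)
        params.append(x)
        # Log-density of a 1-D Gaussian, summed over dimensions.
        cont_logp += -0.5 * ((x - mu) / std) ** 2 - log_std - 0.5 * math.log(2 * math.pi)
    return action_id, params, disc_logp, cont_logp


rng = random.Random(42)
action_id, params, disc_logp, cont_logp = sample_hybrid_action(
    logits=[0.2, -0.1, 0.5], means=[0.0, 0.0], log_stds=[-0.5, -0.5], rng=rng
)
print(action_id, params, disc_logp, cont_logp)
```

The two log-probabilities are returned separately because a hybrid PPO variant may compute one clipped objective per head or combine them; which of these this repo does is not stated here.
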
Test demo:
| SB3-HPPO in Moving-v0 ⬇ | SB3-HPPO in Sliding-v0 ⬇ |
|---|---|
| ![]() | ![]() |

Installation:

    git clone https://github.com/Jordan-Haidee/sb3-hppo.git
    cd sb3-hppo
    uv sync  # recommended
    # or: pip install -r requirements.txt

Training:

    $ python train.py --help
    usage: train.py [-h] [OPTIONS]

    ╭─ options ───────────────────────────────────────────────────────────────────────────────────────────────────────╮
    │ -h, --help             show this help message and exit                                                          │
    │ --env STR              env id from gymnasium_hybrid (Moving-v0 / Sliding-v0 / HardMove-v0) (default: Moving-v0) │
    │ --n-envs INT           number of parallel environments (default: 8)                                             │
    │ --seed INT             random seed (default: 42)                                                                │
    │ --save-path {None}|PATH                                                                                         │
    │                        path to save model and logs (default: None)                                              │
    │ --total-timesteps INT  total timesteps to train (default: 5000000)                                              │
    ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Example:


    python train.py --env Moving-v0  # or Sliding-v0, HardMove-v0

The trained policy and TensorBoard log are saved to `output/sb3hppo_xxx/model.zip` and `output/sb3hppo_xxx/tb_log`, respectively.

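
Because each run is written to its own `output/sb3hppo_xxx` directory, it can be handy to look up the newest checkpoint before testing. The helper below is a small sketch using only the standard library. The function name `latest_checkpoint` is hypothetical, and the directory pattern `sb3hppo_<env>_<timestamp>` is an assumption inferred from the example checkpoint path in this README, not a documented guarantee.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(output_dir: str = "output", env_id: Optional[str] = None) -> Optional[Path]:
    """Return the most recently modified model.zip under output_dir, or None.

    Hypothetical helper: assumes run directories are named
    sb3hppo_<env>_<timestamp> and each contains a model.zip.
    """
    pattern = f"sb3hppo_{env_id}_*/model.zip" if env_id else "sb3hppo_*/model.zip"
    candidates = list(Path(output_dir).glob(pattern))
    if not candidates:
        return None
    # Pick the checkpoint whose file was written last.
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    ckpt = latest_checkpoint(env_id="Moving-v0")
    print(ckpt if ckpt else "no checkpoint found")
```

The returned path can be passed to `test.py` through its `--ckpt` option.
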
Testing:

    $ python test.py --help
    usage: test.py [-h] [OPTIONS]

    ╭─ options ─────────────────────────────────────────────────────────────────────────────────────────────╮
    │ -h, --help             show this help message and exit                                                │
    │ --env STR              Env id from gymnasium_hybrid (Moving-v0 / Sliding-v0 / HardMove-v0) (required) │
    │ --ckpt PATH            Path to the checkpoint file (*.zip) (required)                                 │
    │ --render, --no-render  Whether to render the environment (default: False)                             │
    │ --save-video {None}|PATH                                                                              │
    │                        Path to save the video (None to disable) (default: None)                       │
    ╰───────────────────────────────────────────────────────────────────────────────────────────────────────╯

Example:

    python test.py --env "Moving-v0" --ckpt output/sb3hppo_Moving-v0_20250515_114301/model.zip --render

Thanks to @wild-firefox and @CAI23sbP! This repo depends heavily on their earlier work.


