A simple PyTorch implementation of the vanilla DQN model from the paper Playing Atari with Deep Reinforcement Learning (2013), arXiv:1312.5602.
Python version: 3.9.15
The default replay memory can take up a lot of memory (~20 GB).
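To get a feel for where the replay memory footprint comes from, here is a rough back-of-the-envelope sketch. The function name, the 84x84 frame size, the 4-frame stack, and the one-byte-per-pixel storage are all illustrative assumptions, not values read from this repository; the actual buffer layout and capacity may differ, so the result will not necessarily match the figure above.

```python
def replay_memory_bytes(capacity, frame_shape=(84, 84), frames_per_state=4,
                        bytes_per_value=1, store_next_state=True):
    """Estimate the bytes used by observation storage in a replay buffer.

    All defaults are assumptions for illustration (hypothetical layout):
    grayscale 84x84 frames, 4 stacked frames per state, uint8 pixels,
    and both state and next_state stored for every transition.
    """
    height, width = frame_shape
    bytes_per_state = height * width * frames_per_state * bytes_per_value
    states_per_transition = 2 if store_next_state else 1
    return capacity * bytes_per_state * states_per_transition


if __name__ == "__main__":
    # Compare a few hypothetical configurations.
    for cap in (100_000, 500_000, 1_000_000):
        gib = replay_memory_bytes(cap) / 2**30
        print(f"capacity={cap:>9,d}: {gib:.1f} GiB")
```

If memory is tight, lowering the buffer capacity is the most direct lever, since the estimate scales linearly with it.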
Dependencies:
| Package | Version | Installation note |
|---|---|---|
| gym | 0.21.0 | N/A |
| torch | 2.1.0.dev20230526 | refer to official installation page |
| tensorboard | 2.11.0 | N/A |
| matplotlib | 3.6.2 | N/A |
Note: N/A in the installation note column means the package can simply be installed via pip.
Model after 2,500,000 training steps (Learning rate=0.00025)
pong_demo.mp4
Model after 24,000,000 training steps (Learning rate=0.000025, without penalizing lost lives)
Note: After losing a life, the model is prone to taking random actions instead of FIRE (which resets the game); this part was cut out of the video by a forced reset.
breakout_nopen_demo.mp4
Model (Learning rate=0.000025, with a penalty for losing lives) (to be updated)
N/A
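As a rough illustration of what penalizing lost lives could look like, the sketch below subtracts a fixed amount from the reward whenever the remaining-lives count drops. The function name `penalize_life_loss` and the default penalty of 1.0 are hypothetical and not taken from this repository; the lives count is assumed to be read from the environment at each step (for example from the `info` dictionary returned by `env.step`).

```python
def penalize_life_loss(reward, prev_lives, lives, penalty=1.0):
    """Return the reward, reduced by `penalty` if a life was lost this step.

    Hypothetical helper for illustration; the penalty size is an assumption.
    """
    if lives < prev_lives:
        return reward - penalty
    return reward


if __name__ == "__main__":
    # Simulated (reward, lives) pairs standing in for environment steps.
    steps = [(0.0, 5), (1.0, 5), (0.0, 4), (0.0, 4), (1.0, 3)]
    prev_lives = 5
    for reward, lives in steps:
        shaped = penalize_life_loss(reward, prev_lives, lives)
        print(f"raw={reward:+.1f} lives={lives} shaped={shaped:+.1f}")
        prev_lives = lives
```

Keeping the shaping in a small pure function makes it easy to toggle between the penalized and unpenalized variants when comparing runs.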