This repository provides clean and robust implementations of common Deep Reinforcement Learning (DRL) algorithms.
If you have any questions about the code, feel free to submit an issue or contact me via my email (available on my homepage).
- ✅ DQN – Implementation complete
- ✅ Double-DQN – Implementation complete
- ✅ Dueling-DQN – Implementation complete
- ✅ Noisy-DQN – Implementation complete
- ✅ DDPG – Implementation complete
- ✅ PPO-Discrete – Implementation complete
- ✅ PPO-Continuous – Implementation complete
- ✅ SAC – Implementation complete
- 🚧 DSAC – In progress
- 🚧 MADDPG – In progress
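To illustrate the kind of distinction the variants above make, here is a minimal PyTorch sketch of how the DQN and Double-DQN bootstrap targets differ (illustrative names and shapes, not this repository's actual code):

```python
import torch
import torch.nn as nn

# Illustrative stand-ins: a real implementation would use deeper networks
# and a replay buffer. All names and shapes here are assumptions.
state_dim, n_actions, batch_size, gamma = 4, 2, 32, 0.99
q_net = nn.Linear(state_dim, n_actions)       # online network
target_net = nn.Linear(state_dim, n_actions)  # target network
next_states = torch.randn(batch_size, state_dim)
rewards = torch.randn(batch_size)
dones = torch.zeros(batch_size)

with torch.no_grad():
    # DQN: the target network both selects and evaluates the greedy next action.
    dqn_target = rewards + gamma * (1 - dones) * target_net(next_states).max(dim=1).values

    # Double-DQN: the online network selects the action and the target network
    # evaluates it, which reduces Q-value overestimation.
    best = q_net(next_states).argmax(dim=1, keepdim=True)
    ddqn_target = rewards + gamma * (1 - dones) * target_net(next_states).gather(1, best).squeeze(1)
```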
Our DRL algorithms are easy to use. If you just want to test the performance of the pre-trained models, see Section 2.3. If you want to train a model yourself, see Section 2.4.
Run the following command in your terminal to download this repository to your local machine:
```
git clone https://github.com/cloudpetticoats/deep-reinforcement-learning.git
```
We use common dependencies, so version conflicts are unlikely; you can use your existing environment and don't need to match my versions exactly.
If you do run into environment issues, here is my setup for reference:
- Python 3.9
- PyTorch 2.6.0
- Gym 0.26.2
- Matplotlib 3.9.1
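If you are building a fresh environment, an install along these lines should be enough (pinning to the reference versions above is optional):

```
pip install torch==2.6.0 gym==0.26.2 matplotlib==3.9.1
```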
If you don't want to train the models yourself, we have already provided trained models in the ./models/ folder of each algorithm; you can run test.py directly for a visualization test.
If you want to train a model yourself, go to the corresponding algorithm folder and run main.py. The trained model will be saved in that folder's ./models/ directory; then run test.py for a visualization test, as shown below.
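For example, the train-then-test workflow looks roughly like this (`dqn` is an illustrative folder name; substitute whichever algorithm directory you want):

```
cd dqn
python main.py   # train; the model is saved to ./models/
python test.py   # visualize the trained policy
```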
We tested our code on standard Gym environments; the environment (algorithm) pairs are summarized below.
| CartPole-v1 (DQN) | CartPole-v1 (Double-DQN) | CartPole-v1 (Dueling-DQN) |
| :---: | :---: | :---: |
| CartPole-v1 (Noisy-DQN) | Pendulum-v1 (DDPG) | CartPole-v0 (PPO-Discrete) |
| Pendulum-v1 (PPO-Continuous) | Pendulum-v1 (SAC) | |
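As a reference for what the visualization test roughly does, here is a minimal Gym evaluation loop (a generic sketch, not this repository's actual test.py; the linear `policy` is a stand-in for a loaded trained model):

```python
import gym
import torch
import torch.nn as nn

# Stand-in policy; in practice you would load a trained model from ./models/.
env = gym.make("CartPole-v1", render_mode="human")
policy = nn.Linear(env.observation_space.shape[0], env.action_space.n)

state, _ = env.reset()
done, episode_reward = False, 0.0
while not done:
    with torch.no_grad():
        # Greedy action from the (stand-in) Q-network.
        action = policy(torch.as_tensor(state, dtype=torch.float32)).argmax().item()
    state, reward, terminated, truncated, _ = env.step(action)
    episode_reward += reward
    done = terminated or truncated
env.close()
print(f"episode reward: {episode_reward}")
```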
- DQN: Mnih V, Kavukcuoglu K, Silver D, et al. Human-level control through deep reinforcement learning. Nature, 518: 529–533, 2015.
- Double DQN: van Hasselt H, Guez A, Silver D. Deep reinforcement learning with double Q-learning. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1), 2016.
- Dueling DQN: Wang Z, Schaul T, Hessel M, et al. Dueling network architectures for deep reinforcement learning. Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016.
- Noisy DQN: Fortunato M, Azar M G, Piot B, et al. Noisy networks for exploration. arXiv preprint arXiv:1706.10295, 2017.
- DDPG: Lillicrap T P, Hunt J J, Pritzel A, et al. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971, 2015.
- PPO: Schulman J, Wolski F, Dhariwal P, et al. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.
- SAC: Haarnoja T, Zhou A, Abbeel P, et al. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. arXiv preprint arXiv:1801.01290, 2018.
- DSAC-T: Duan J, et al. Distributional soft actor-critic with three refinements. IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(5): 3935–3946, 2025. doi: 10.1109/TPAMI.2025.3537087.