Check the enhancements here: https://github.com/hill-a/stable-baselines/issues/821
Additionally, check: https://seohong.me/blog/q-learning-is-not-yet-scalable/