stable-baselines with JAX & Haiku
Updated Jun 20, 2024 - Python
Modelling and training for an AI-driven PCB fault detection project.
Using DAgger with our MPC as the expert, we distill its knowledge into relatively simple networks that still retain a large fraction of its performance (see the paper for a full description).
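For readers unfamiliar with DAgger, the loop described above can be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: the 1-D dynamics, the linear feedback law standing in for the MPC expert (`expert_action`), and the least-squares learner (`fit_policy`) are all hypothetical choices made so the example stays self-contained.

```python
import numpy as np

def expert_action(state):
    # Hypothetical stand-in for the MPC expert: a fixed linear feedback law.
    return -0.8 * state

def rollout(policy_gain, rng, horizon=20):
    # Run the *learner's* current policy and record the states it visits.
    state = rng.uniform(-1.0, 1.0)
    states = []
    for _ in range(horizon):
        states.append(state)
        action = policy_gain * state
        # Assumed toy dynamics: integrator with small process noise.
        state = state + action + 0.01 * rng.standard_normal()
    return np.array(states)

def fit_policy(states, actions):
    # Least-squares fit of a linear policy: action = gain * state.
    return float(np.dot(states, actions) / np.dot(states, states))

def dagger(n_iterations=5, seed=0):
    rng = np.random.default_rng(seed)
    gain = 0.0  # untrained learner
    data_states = np.empty(0)
    data_actions = np.empty(0)
    for _ in range(n_iterations):
        # 1. Collect states under the learner's own policy.
        visited = rollout(gain, rng)
        # 2. Ask the expert what it would have done in those states.
        labels = expert_action(visited)
        # 3. Aggregate into the growing dataset.
        data_states = np.concatenate([data_states, visited])
        data_actions = np.concatenate([data_actions, labels])
        # 4. Retrain the learner on everything gathered so far.
        gain = fit_policy(data_states, data_actions)
    return gain, len(data_states)

if __name__ == "__main__":
    learned_gain, n_samples = dagger()
    print(learned_gain, n_samples)
```

The key difference from plain behavioral cloning is step 1: states come from the learner's rollouts rather than the expert's, so the aggregated dataset covers the states the learner actually reaches. In a real setting the linear fit would be replaced by training a network and `expert_action` by a call to the MPC solver.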
Berkeley CS 294: Deep Reinforcement Learning
Lunar Lander game from OpenAI Gym using behavioral cloning, DAgger, and POMDPs (partially observable Markov decision processes).