- Compare SGD/Adam (SGD should be better for testing)
- Statistics:
  - Keep track of memory usage
  - Save videos of train and test
  - Save curves
- Save/load
  - Save model during training. -> Done (Adrien)
- Run Q-learning OR Q-learning+ by just changing one boolean parameter
- Implement Deep Q-learning+ (Step 3+)
  - Fixed Q-target. -> Done (Lostindark); question: how to find the 'optimal' tau?
  - Double DQN. -> Done (Lost)
  - Dueling DQN. -> Done (Lost)
  - PER. -> Done (Lost); question: optimal a?
- Launch code with parameters (scripts)
- Try new architectures
- Try other games
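On the two open questions above (the target-update rate tau and the PER exponent a): a minimal sketch of how both hyperparameters enter the math, assuming a Polyak soft target update and proportional prioritized replay. Function names and the default values (tau=0.005, a=0.6, the common DQN-paper ballpark) are illustrative, not taken from this codebase.

```python
import numpy as np

def soft_update(target_params, online_params, tau=0.005):
    # Polyak averaging: target <- (1 - tau) * target + tau * online.
    # Small tau means a slowly moving, more stable Q-target.
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]

def per_probabilities(td_errors, a=0.6, eps=1e-6):
    # Proportional PER: P(i) = p_i^a / sum_k p_k^a, with p_i = |TD error| + eps.
    # a = 0 gives uniform sampling; a = 1 gives fully greedy prioritization.
    priorities = (np.abs(td_errors) + eps) ** a
    return priorities / priorities.sum()
```

Both are usually tuned empirically: sweep tau over a log scale (e.g. 1e-3 to 1e-1) and a over [0, 1], comparing the saved learning curves.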
References:
- Tutorial: https://simoninithomas.github.io/Deep_reinforcement_learning_Course/#syllabus
- Reference paper: https://nihit.github.io/resources/spaceinvaders.pdf
- Another DQN implementation: https://github.com/tambetm/simple_dqn
- More advanced RL tutorials: https://www.youtube.com/channel/UCP7jMXSY2xbc3KCAE0MHQ-A/videos
- About Bellman equations: https://joshgreaves.com/reinforcement-learning/understanding-rl-the-bellman-equations/