TODO:
After rebuttal
- License (MIT) and Max Planck Society
Alex
- Read papers GP-SSM
- learn a dynamics model, use the model for exploration strategy, maybe update more than one point
- identify structure, use this structure to make some assumptions on the safety function
- main point of the paper
- general rewrite of GP approximation (3.2)
- redo figures
- 4-dimensional system
- Redo 2-dimensional experiments
- IAV affiliation
- Matthias comments
- Mention noise in discussion
- Read new papers of reviewers
Steve
- proof extension, based on Bellman update equations
- acknowledgments
- rewrite conclusions
- Paper by Kirchner (ETH)
- State-dependent uncertainty
- replacing old samples ("closeness")
- mixing in dynamics
- General improvements based on reviews
prep for rebuttal
- Alex get working example with the 4-D spaceship model: 02.08
- Test convergence
- Test existing 2d examples
- Steve implement spaceship model with 2-D action space: 29.07
- Alex get working example with the 5-D spaceship model: Not needed anymore?
- Alex handle the last comment from Matthias
prep for Sept 7
- Add commentary in conclusions: the deterministic-dynamics assumption is not required in theory, though we have not investigated this and expect practical complications of interest to arise.
- Split into modules
- models and viability Steve
- GP learning Alex (also merge in submission branch)
- label submission version of CoRL. Add in LaTeX files of the paper
- Obtain better graphs
- figures... we are not always converging to a safe subset
- With multiple trajectories on the parameters, and get a nice convergence
- Other types of graphs? In suppl. material? Comparison with Random Search, convergence/iterations, and failure rate. Alex
- Comparison with cost-function not doing this
- Clean up code
- remove viability computations for warm-start in estimate_measure. Q_V, Q_M etc. should be calculated by the user outside, and then passed to the learning class. Classes implemented in measure should not depend on viability
- data going into the sampler class... what does this contain? It shouldn't require any ground-truth data...
- string together trajectories low priority
- test function to run a bunch of trials with uniform random sampling {Steve, Alex}
- 3D example look up
- 5D example look up
- Acrobot example? low
- Rewrite
- Point out notation Steve
- Point out that the examples are in the suppl. code
- Better colormaps Alex: use hatching for ground-truth, color for learned stuff
- Appendix, with descriptions of additional examples
- convergence proof in appendix See rebuttal
Deadline
- Train GP hyperparameters with failures and infeasible points
- Rewrite to be able to include different models
- Arbitrary dynamics 2d
- Q-Feas?
- Arbitrary dynamics more-d
- non-discretized states
- plots
- clean up code for submission
- all examples of figures used in paper
- bonus RL within the safe set
- intro to GPs in 3 sentences
- re-iterate on related work
- do the extra models
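
For the clean-up item about keeping measure independent of viability, one possible shape is sketched here. This is only an illustration under assumptions, not the repository's actual code: the class name `MeasureLearner`, the method `warm_start_prior`, and the reading of Q_V as a boolean viability mask and Q_M as a measure array of the same shape are all hypothetical. The point is only that the caller computes the arrays and hands them in, so the learning class never imports anything from the viability code.

```python
class MeasureLearner:
    """Hypothetical learning class that only consumes precomputed arrays."""

    def __init__(self, Q_V, Q_M):
        # Q_V: nested list of bools (assumed: viable state-action pairs)
        # Q_M: nested list of floats, same shape (assumed: measure values)
        same_rows = len(Q_V) == len(Q_M)
        same_cols = all(len(v) == len(m) for v, m in zip(Q_V, Q_M))
        if not (same_rows and same_cols):
            raise ValueError("Q_V and Q_M must have the same shape")
        self.Q_V = Q_V
        self.Q_M = Q_M

    def warm_start_prior(self):
        # Use the supplied measure inside the viable set and zero elsewhere.
        return [
            [m if v else 0.0 for v, m in zip(row_v, row_m)]
            for row_v, row_m in zip(self.Q_V, self.Q_M)
        ]


# The user computes (or loads) these outside the class; dummy values here.
Q_V = [[True, False], [True, True]]
Q_M = [[0.8, 0.3], [0.5, 1.0]]

learner = MeasureLearner(Q_V, Q_M)
prior = learner.warm_start_prior()
```

With this split, the warm-start can be tested using hand-written arrays, and no ground-truth computation is needed inside the class.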