This is the official research code for the paper "Improving Generalization in Meta Reinforcement Learning using Learned Objectives" (Kirsch et al., 2019).
Install the following dependencies (preferably in a virtualenv):

```bash
pip3 install 'ray[tune]==0.7.7' 'gym[all]' 'mujoco_py>=2' tensorflow-gpu==1.13.2 scipy numpy
```

This code base uses ray. If you would like to use multiple machines, see the ray documentation for details.
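By default everything runs on a single machine. As a rough sketch (assuming the ray 0.7 cluster API; `HEAD_IP` is a placeholder), attaching to a cluster of machines only changes how ray is initialized:

```python
# Sketch for the ray 0.7 cluster API; HEAD_IP is a placeholder.
# Start the cluster first, e.g. with `ray start --head` on the head
# node and `ray start --redis-address=HEAD_IP:6379` on each worker.
import ray

# Single machine (the default): ray.init()
# Multiple machines: point at the head node's redis instance instead.
ray.init(redis_address="HEAD_IP:6379")
```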
We also make use of ray's native tensorflow ops. Please compile them by running

```bash
python3 -c 'import ray; from pyarrow import plasma as plasma; plasma.build_plasma_tensorflow_op()'
```
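To verify that the op compiled, a minimal sanity check could look like the following; it assumes that pyarrow stores the compiled op in `plasma.tf_plasma_op` after building, which may differ across pyarrow versions.

```python
# Sanity-check sketch (assumption: pyarrow keeps the compiled op
# in plasma.tf_plasma_op; it stays None if the build failed).
import ray  # import ray first, matching the compile command above
from pyarrow import plasma

plasma.build_plasma_tensorflow_op()
assert plasma.tf_plasma_op is not None, "plasma tensorflow op was not built"
print("plasma tensorflow op is available")
```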
Adapt the configuration in ray_experiments.py (or use the default configuration) and run

```bash
python3 ray_experiments.py train
```

By default, this requires a local machine with 4 GPUs to run 20 agents in parallel. Alternatively, skip this step and download a pre-trained objective function as described below.
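The 20-agents-on-4-GPUs default corresponds to a 0.2 GPU share per agent. Below is a hypothetical sketch of how such parallelism is expressed with ray tune; the real trainable and configuration keys live in ray_experiments.py and will differ, and `dummy_agent` is a stand-in.

```python
# Hypothetical sketch (ray 0.7 tune API): 20 trials sharing 4 GPUs.
# The actual MetaGenRL trainable is defined in ray_experiments.py.
import ray
from ray import tune

def dummy_agent(config, reporter):
    # Stand-in for one agent; reports a single dummy metric and exits.
    reporter(episode_reward_mean=0.0)

ray.init(num_gpus=4)
tune.run(dummy_agent,
         num_samples=20,                              # one trial per agent
         resources_per_trial={"cpu": 1, "gpu": 0.2})  # 20 * 0.2 = 4 GPUs
```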
After running meta-training (or downloading a pre-trained objective function), you can train a new agent from scratch on an environment of your choice. Optionally configure your training in ray_experiments.py, then run
```bash
python3 ray_experiments.py test --objective TRAINING_DIRECTORY
```

This only requires a single GPU on your machine.
To download a pre-trained objective function, run

```bash
cd ~/ray_results/metagenrl
curl -L https://github.com/timediv/metagenrl/releases/download/pretrained-v1/CheetahLunar.tgz | tar xvz
```

and proceed with meta testing as above. In this case, your TRAINING_DIRECTORY will be pretrained-CheetahLunar.
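For example, to meta-test against the pre-trained objective function downloaded above:

```bash
python3 ray_experiments.py test --objective pretrained-CheetahLunar
```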
Many tf summaries are written during training and testing, and they can be visualized with tensorboard:

```bash
tensorboard --logdir ~/ray_results/metagenrl
```