Yuanlin Duan, Guofeng Cui and He Zhu
Code for "Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning" (NeurIPS 2024), a method for efficient exploration in goal-conditioned reinforcement learning (GCRL).
If you find our paper or code useful, please cite us:
@article{duan2024exploring,
title={Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning},
author={Duan, Yuanlin and Cui, Guofeng and Zhu, He},
journal={arXiv preprint arXiv:2411.01396},
year={2024}
}
CE2 learns a latent space with a temporal distance estimator, so that distances in the latent space reflect the reachability of states. CE2 then clusters the latent states, grouping states that the current policy can easily reach from one another. Before starting exploratory behavior, the agent first travels to a state with significant exploration potential on the boundary of one of these clusters. CE2 selects goal states at the edges of these latent state clusters for two reasons: (1) less explored regions are naturally adjacent to these boundaries; (2) because the training policy can easily move between states within a cluster, the agent is also capable of reaching states at the cluster boundaries.
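The idea of picking goals at cluster edges can be illustrated with a toy sketch. This is not the CE2 implementation: it assumes 2-D latent vectors, uses plain k-means as a stand-in for the clustering step, and uses Euclidean distance as a stand-in for the learned temporal distance. The function names `kmeans` and `edge_goals` are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (illustrative stand-in for the clustering step)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members (unchanged if empty).
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels


def edge_goals(points, centroids, labels, per_cluster=1):
    """For each cluster, return the members farthest from its centroid."""
    goals = []
    for c, centroid in enumerate(centroids):
        members = [p for p, l in zip(points, labels) if l == c]
        members.sort(key=lambda p: math.dist(p, centroid), reverse=True)
        goals.extend(members[:per_cluster])
    return goals


# Toy "latent states": two blobs standing in for two reachable regions.
rng = random.Random(1)
latents = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(50)]
latents += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(50)]

centroids, labels = kmeans(latents, k=2)
goals = edge_goals(latents, centroids, labels, per_cluster=3)
print(goals)  # candidate exploration goals near the cluster boundaries
```

In this sketch, "edge" simply means "far from the cluster centroid"; any other boundary criterion could be swapped in without changing the overall structure.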
CE2 outperforms other exploration approaches across a variety of tasks.
CE2/
|- Config/ # config file for each environment.
|- dreamerv2/ # CE2 implementation
|- dreamerv2/gc_main.py # Main running file
Install all dependencies with pip:
pip install -r library.txt
Then run:
pip install -e .
We evaluate CE2 on six environments: Ant Maze, Point Maze, Walker, 3-Block Stacking, Block Rotation, and Pen Rotation.
MuJoCo installation: MuJoCo 2.0
Ant Maze, Point Maze, 3-Block Stack environments:
The mrl codebase contains Ant Maze, Point Maze, and 3-block Stack environments.
git clone https://github.com/hueds/mrl.git
Before testing these three environments, you should make sure that the mrl path is set in the PYTHONPATH.
export PYTHONPATH=<path to your mrl folder>
Walker environment:
Clone the lexa-benchmark and dm_control repos.
git clone https://github.com/hueds/dm_control
git clone https://github.com/hueds/lexa-benchmark.git
Set up dm_control as a local Python module:
cd dm_control
pip install .
Set LD_PRELOAD to the libGLEW path, and set the MUJOCO_GL and MUJOCO_RENDERER variables.
# if you want to run environments in the lexa-benchmark codebase
MUJOCO_GL=egl MUJOCO_RENDERER=egl LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so:/usr/lib/x86_64-linux-gnu/libGL.so PYTHONPATH=<path to your lexa-benchmark folder like "/home/edward/lexa-benchmark">
Training Scripts:
python dreamerv2_APS/gc_main.py --configs RotatePen --logdir "your logdir path"
Here RotatePen is the environment name in the config file.
Use TensorBoard to check the results.
tensorboard --logdir ~/logdir/your_logdir_name
CE2 builds on many prior works, and we thank the authors for their contributions.

