Code for physically embedded single-player board game environments (used to evaluate the SEADS agent)
The environments in this repository were presented, alongside the hierarchical reinforcement learning agent SEADS, in the publication:
Achterhold, Jan and Krimmel, Markus and Stueckler, Joerg:
Learning Temporally Extended Skills in Continuous Domains as Symbolic Actions for Planning
6th Annual Conference on Robot Learning 2022 (CoRL 2022)
Project page: https://seads.is.tue.mpg.de/
Full paper: https://openreview.net/forum?id=t-IO7wCaNgH
If you use the code, data or models provided in this repository for your research, please cite our paper as:
```bibtex
@inproceedings{achterhold2022learning,
  title={Learning Temporally Extended Skills in Continuous Domains as Symbolic Actions for Planning},
  author={Jan Achterhold and Markus Krimmel and Joerg Stueckler},
  booktitle={6th Annual Conference on Robot Learning (CoRL)},
  year={2022},
  url={https://openreview.net/forum?id=t-IO7wCaNgH}
}
```
Please find the implementation of the SEADS agent at https://github.com/EmbodiedVision/seads-agent.
Environment previews (images omitted): `LightsOutCursorEnv` · `TileSwapCursorEnv` · `LightsOutReacherEnv` · `TileSwapReacherEnv` · `LightsOutJacoEnv` · `TileSwapJacoEnv`
This directory contains physically embedded single-player board game environments. The idea of embedding board games into physical manipulation scenarios was introduced by Mirza et al., 2020 [1]. In contrast to that work, we propose embedded *single-player* board games.
- Install `git-lfs`
  ```bash
  sudo apt install git-lfs
  ```
- Clone repository
  ```bash
  git clone https://github.com/EmbodiedVision/seads-environments
  ```
- Update `wheel`/`pip`/`setuptools`
  ```bash
  pip install -U pip setuptools wheel
  ```
- Install package
  ```bash
  cd seads-environments; pip install .
  ```
- Set environment variables
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<MUJOCO_200_DIR>/bin/
export MJKEY_PATH=<PATH_TO_MJKEY.TXT>
export MJLIB_PATH=<MUJOCO_200_DIR>/bin/libmujoco200.so
export MUJOCO_GL="osmesa"
- Run tests to check the installation
  ```bash
  python -m unittest seads_envs/test_hybrid.py
  ```
We provide two board games (LightsOut, TileSwap) with three different physical
manipulation embeddings (Cursor, Reacher, Jaco).
You can directly instantiate the `LightsOutCursor`, `TileSwapCursor`, `LightsOutReacher`, `TileSwapReacher`, `LightsOutJaco`, and `TileSwapJaco` environments via the respective classes in `seads_envs.hybrid.cursor`, `seads_envs.hybrid.reacher`, and `seads_envs.hybrid.jaco`.
All environments share the following parameters:
| Parameter name | Description |
|---|---|
| `reset_split` | Split of the initial board configurations (`train` or `test`). |
| `max_solution_depth` | Maximum solution depth of the initial board configuration (number of steps required to solve the board). |
| `board_size` | (One-sided) board size, e.g. 5 for LightsOut and 3 for TileSwap. |
| `random_solution_depth` | If `True`, randomly sample the actual solution depth from {1, ..., `max_solution_depth`}. If `False`, always use `max_solution_depth` (default). |
| `done_if_solved` | If `True`, set the `done` flag only when the board is solved; if `False`, set the `done` flag whenever the board state has changed (default). If you want to train a non-hierarchical agent, set `done_if_solved=True`. |
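As an illustration, the sketch below directly instantiates an environment with these shared parameters. The class name (`LightsOutCursorEnv`, as listed above) and the gym-style `reset`/`step` interface are assumptions; consult the docstrings in `seads_envs.hybrid.cursor` for the exact signatures.

```python
# A minimal sketch, assuming a gym-style interface; see the class
# docstrings for the actual constructor signature.
from seads_envs.hybrid.cursor import LightsOutCursorEnv

env = LightsOutCursorEnv(
    reset_split="train",      # draw initial boards from the train split
    max_solution_depth=3,     # initial boards solvable in <= 3 steps
    board_size=5,             # 5x5 LightsOut board
)
obs = env.reset()
action = env.action_space.sample()          # random manipulator action
obs, reward, done, info = env.step(action)  # gym-style transition
```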
Some environments possess additional, environment-specific parameters; the most important ones are explained below. For an exhaustive description, see the docstrings of the respective environments.
| Parameter name (environments) | Description |
|---|---|
| `mixed_action_space` (`*Cursor`, `*Reacher`) | If `True`, use a binary action to indicate that the field below the end effector is pushed. If `False`, use a continuous variable and evaluate whether it exceeds a threshold (default). |
| `toggle_by_halffield` (LightsOutCursor) | If `True`, fields can only be switched on by pushing on their upper half, and only switched off by pushing on their lower half. Default: `False`. |
| `rendering_enabled` (`*Jaco`) | If `False`, environment rendering is disabled. This saves computation, as otherwise a rendering is prepared at every environment step for a potential call of `render`. Default: `False`. |
| `stairs` (LightsOutJaco) | If `True`, elevate the LightsOut fields (`LightsOut3DJaco` environment). |
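Environment-specific options are passed alongside the shared ones; below is a hedged sketch for the Jaco embedding, where the class name `LightsOutJacoEnv` and the keyword usage are assumptions based on the tables above.

```python
# Sketch only: combines shared and Jaco-specific parameters; the class
# name and constructor signature are assumptions, see seads_envs.hybrid.jaco.
from seads_envs.hybrid.jaco import LightsOutJacoEnv

env = LightsOutJacoEnv(
    reset_split="test",
    max_solution_depth=5,
    board_size=5,
    rendering_enabled=True,  # prepare renderings so calls to render() work
)
obs = env.reset()
img = env.render()           # assumed to return the prepared rendering
```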
You may also instantiate the environments via `seads_envs.load_env`. Here, you
need to pass the environment name, following the scheme `<board_game><manipulator>BsX`, e.g.
`LightsOutReacherBs5`, where the last digit indicates the board size. In our
experiments we only use a board size of 5 for all LightsOut environments
and a board size of 3 for all TileSwap environments. Optionally,
you can pass a time limit for the environments (`time_limit`) and
augment the observations by a normalized remaining time (`ext_timelimit_obs`).
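For illustration, here is a hedged sketch of `load_env` usage, assuming it forwards the shared parameters from the tables above to the environment constructor; the concrete value for `time_limit` is arbitrary. See the docstring of `seads_envs.load_env` for the exact signature.

```python
# Sketch: instantiate via the environment name scheme described above.
from seads_envs import load_env

env = load_env(
    "LightsOutReacherBs5",   # <board_game><manipulator>BsX, board size 5
    reset_split="train",
    max_solution_depth=5,
    time_limit=50,           # optional step limit (assumed value)
    ext_timelimit_obs=True,  # append normalized remaining time to obs
)
obs = env.reset()
```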
By *solution depth* we refer to the number of steps
required to bring a particular board into its target configuration (all lights off in LightsOut,
ordered tiles in TileSwap). To compute board configurations of a particular solution depth,
we use a reversed breadth-first search (starting from the target board configuration), see
`seads_envs/board_games/bfs_reverse.py`.
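The sketch below illustrates the idea behind the reversed search (not the repository's exact implementation): because pushes in LightsOut and swaps in TileSwap are their own inverses, a breadth-first search started at the solved board assigns every reachable board its minimal solution depth.

```python
from collections import deque

def boards_by_solution_depth(solved_board, actions, apply_action, max_depth):
    """Illustrative reversed BFS: map boards to their minimal solution depth.

    Assumes hashable board states and involutive moves (each move is its
    own inverse), which holds for LightsOut pushes and TileSwap swaps.
    """
    depth = {solved_board: 0}
    frontier = deque([solved_board])
    while frontier:
        board = frontier.popleft()
        if depth[board] >= max_depth:
            continue
        for action in actions:
            successor = apply_action(board, action)
            if successor not in depth:  # first visit = minimal depth
                depth[successor] = depth[board] + 1
                frontier.append(successor)
    return depth
```

Boards discovered at depth *d* can then serve as initial configurations with solution depth *d*.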
All boards are classified into train and test splits using a hash function.
The actual number of boards of a particular solution depth per split can be displayed with
`seads_envs/board_games/dataset_size_analyze.py`.
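For illustration, a deterministic hash-based split could look like the hypothetical sketch below; the repository's actual split criterion may differ.

```python
import hashlib

def assign_split(board, test_fraction=0.2):
    """Hypothetical sketch: deterministically assign a board to a split.

    `board` is any board representation with a stable repr(); the hash
    digest is mapped to [0, 1] and compared against the test fraction.
    """
    digest = hashlib.sha256(repr(board).encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # map hash to [0, 1]
    return "test" if bucket < test_fraction else "train"
```

Hashing makes the assignment reproducible and independent of the order in which boards are enumerated.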
For a visualization of the proposed environments, see the notebooks in the `visualization` subdirectory. The notebook `visualization/embedded_environments.ipynb` instantiates all embedded environments as examples.
For license information on 3rd-party software we use in this project, see NOTICE. To comply with the Apache License, Version 2.0, any Derivative Works that You distribute must include a readable copy of the attribution notices contained within the NOTICE file (see LICENSE).
[1] M. Mirza, A. Jaegle, J. J. Hunt, A. Guez, S. Tunyasuvunakool, A. Muldal, T. Weber, P. Karkus, S. Racanière, L. Buesing, T. P. Lillicrap, and N. Heess. “Physically Embedded Planning Problems: New Challenges for Reinforcement Learning”. In: CoRR abs/2009.05524 (2020). arXiv: 2009.05524. URL: https://arxiv.org/abs/2009.05524.