Cogment Verse is an SDK helping researchers and developers in the fields of human-in-the-loop learning (HILL) and multi-agent reinforcement learning (MARL) train and validate their agents at scale. Cogment Verse instantiates the open-source Cogment platform for environments following the OpenAI Gym mold, making it easy to get started.
Simply clone the repo and start training.
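For readers unfamiliar with it, the "OpenAI Gym mold" refers to the familiar reset/step environment interface, sketched below with the classic Gym API (newer Gym and Gymnasium releases return slightly different tuples). This is background only, not Cogment Verse code.

```python
import gym

# The classic Gym loop: an environment exposes reset() and step(action),
# and an agent consumes the observations and rewards they return.
env = gym.make("CartPole-v1")
observation = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # random policy, for illustration only
    observation, reward, done, info = env.step(action)
env.close()
```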
- Getting started
- Tutorials
- Develop
- Deploy
- Experimental results 🚧
- Changelog
- Contributors guide
- Community code of conduct
The following will show you how to set up Cogment Verse locally; it is also possible to use a Docker-based setup instead. Instructions for this can be found here
- Clone this repository
- Install Python 3.9
- Depending on your specific machine, you might also need the following dependencies:
  - `swig`, which is required for the Box2d gym environments; it can be installed using `apt-get install swig` on Ubuntu or `brew install swig` on macOS
  - `python3-opencv`, which is required on Ubuntu systems; it can be installed using `apt-get install python3-opencv`
  - `libosmesa6-dev` and `patchelf`, which are required to run the environment libraries using `mujoco`; they can be installed using `apt-get install libosmesa6-dev patchelf`
- Create and activate a virtual environment

  $ python -m venv .venv
  $ source .venv/bin/activate
- Install the Python dependencies.

  $ pip install -r requirements.txt

- Depending on the environment you want to use, you might need to take additional steps.
- In another terminal, launch an MLflow server on port 3000 (see the sketch after this list for one way to query it programmatically)

  $ source .venv/bin/activate
  $ python -m simple_mlflow
- Start the default Cogment Verse run using

  $ python -m main
- Open Chrome (other web browsers might work but haven't been tested) and navigate to http://localhost:8080/
- Play the game!
That's the basic setup for Cogment Verse; you are now ready to train AI agents.
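Once the MLflow server from the setup above is running, its runs can also be inspected programmatically. The following is a minimal sketch using the standard MLflow client API, assuming the tracking server is reachable at http://localhost:3000 and a recent MLflow release; it is illustrative only and not part of Cogment Verse.

```python
import mlflow

# Point the MLflow client at the local tracking server
# started with `python -m simple_mlflow`.
mlflow.set_tracking_uri("http://localhost:3000")

# List experiments and summarize their runs.
# `search_experiments`/`search_runs` assume a recent MLflow version.
for experiment in mlflow.search_experiments():
    print(f"Experiment: {experiment.name} (id={experiment.experiment_id})")
    runs = mlflow.search_runs(experiment_ids=[experiment.experiment_id])
    if not runs.empty:
        print(runs[["run_id", "status", "start_time"]].to_string(index=False))
```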
Cogment Verse relies on Hydra for configuration. This enables easy configuration and composition of configurations directly from YAML files and the command line.
The configuration files are located in the config directory, with defaults defined in config/config.yaml.
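To make the override syntax used in the examples below more concrete, here is a minimal sketch of a Hydra entry point. It only illustrates how a Hydra application composes the configuration under config/ with command-line overrides; it is not the actual Cogment Verse main module.

```python
import hydra
from omegaconf import DictConfig, OmegaConf


# Hydra loads config/config.yaml, then merges command-line overrides such as
# `+experiment=simple_a2c/cartpole` or `services/environment=lunar_lander`.
@hydra.main(version_base=None, config_path="config", config_name="config")
def app(cfg: DictConfig) -> None:
    # Print the fully composed configuration for inspection.
    print(OmegaConf.to_yaml(cfg))


if __name__ == "__main__":
    app()
```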
Here are a few examples:
- Launch a Simple Behavior Cloning run with the Mountain Car Gym environment (which is the default environment)

  $ python -m main +experiment=simple_bc/mountain_car
- Launch a Simple Behavior Cloning run with the Lunar Lander Gym environment

  $ python -m main +experiment=simple_bc/mountain_car services/environment=lunar_lander
- Launch and play a single trial of the Lunar Lander Gym environment with continuous controls

  $ python -m main services/environment=lunar_lander_continuous
- Launch an A2C training run with the Cartpole Gym environment

  $ python -m main +experiment=simple_a2c/cartpole

  This one is completely headless (training doesn't involve interaction with a human player). It will take a little while to run; you can monitor the progress using MLflow at http://localhost:3000
- Launch a DQN self-training run with the Connect Four PettingZoo environment

  $ python -m main +experiment=simple_dqn/connect_four

  The same experiment can be launched with a ratio of human-in-the-loop training trials (that are playable in the web client)

  $ python -m main +experiment=simple_dqn/connect_four +run.hill_training_trials_ratio=0.05
- PettingZoo's Atari Pong Environment

  Example #1: Play against an RL agent

  $ python -m main +experiment=ppo_atari_pz/play_pong_pz

  Example #2: Observe RL agents playing against each other

  $ python -m main +experiment=ppo_atari_pz/observe_play_pong_pz

  Example #3: Training with human demonstrations

  $ python -m main +experiment=ppo_atari_pz/hill_pong_pz

  Example #4: Training with human feedback

  $ python -m main +experiment=ppo_atari_pz/hfb_pong_pz

  Example #5: Self-training

  $ python -m main +experiment=ppo_atari_pz/pong_pz

  NOTE: Examples 2 & 3 require users to open Chrome and navigate to http://localhost:8080 in order to provide either demonstrations or feedback.
- Analyzing and Overcoming Degradation in Warm-Start Off-Policy Reinforcement Learning code
- Multi-Teacher Curriculum Design for Sparse Reward Environments code
(please open a pull request to add missing entries)