MAPPO

Two multi-agent reinforcement learning algorithms, IPPO and MAPPO, implemented in PyTorch and applied to the Knights Archers Zombies (KAZ) PettingZoo environment.

Reference: https://arxiv.org/abs/2103.01955

Environment: https://pettingzoo.farama.org/environments/butterfly/knights_archers_zombies/

Note: pretrained.zip contains the code for an agent trained on the single-agent version of KAZ. The multi-agent algorithms can be initialized either from scratch or from this pretrained agent's weights.
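The two initialization modes can be sketched as follows. This is only an illustration under stated assumptions: `make_actor`, `init_actors`, the agent names, and the layer sizes are hypothetical, and a freshly built network stands in for the pretrained single-agent model, since the layout of pretrained.zip is not described here.

```python
import copy

import torch
import torch.nn as nn

# Placeholder sizes (assumption): the real values depend on the KAZ configuration.
OBS_DIM, N_ACTIONS = 32, 6


def make_actor() -> nn.Sequential:
    """Hypothetical actor architecture, shared by every agent in this sketch."""
    return nn.Sequential(nn.Linear(OBS_DIM, 64), nn.Tanh(), nn.Linear(64, N_ACTIONS))


def init_actors(agent_names, pretrained_state=None):
    """Build one actor per agent, optionally copying pretrained weights into each."""
    actors = {}
    for name in agent_names:
        actor = make_actor()
        if pretrained_state is not None:
            # load_state_dict copies values, so agents do not share parameters.
            actor.load_state_dict(pretrained_state)
        actors[name] = actor
    return actors


# Stand-in for the pretrained single-agent model (assumption).
single_agent_actor = make_actor()
pretrained_state = copy.deepcopy(single_agent_actor.state_dict())

agent_names = ["agent_0", "agent_1"]  # hypothetical agent names
scratch = init_actors(agent_names)                 # random initialization
warm = init_actors(agent_names, pretrained_state)  # pretrained initialization
```

After a warm start every agent begins from the same policy, and the agents diverge only once training updates their separate copies.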

It is possible to show that MAPPO outperforms IPPO when both are initialized from scratch, whereas IPPO converges faster when initialized with the pretrained weights.

This project implements the two algorithms and compares them on the PettingZoo KAZ environment, which is a cooperative problem.

Usage:

Use train_ippo.py or train_mappo.py to train the corresponding agents, and visualize.py to watch them play KAZ.

Independent Proximal Policy Optimization (IPPO)

Each agent maintains its own actor and critic networks and does not directly take the other agent's behavior into account when making its own decisions.
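A minimal sketch of this independent setup is given below. It is not the repository's implementation: the class `IndependentAgent`, the agent names, and the layer and observation sizes are assumptions chosen for illustration. The point is that each agent's critic sees only that agent's own observation.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical

# Placeholder sizes (assumption): the real values depend on the KAZ configuration.
OBS_DIM, N_ACTIONS = 32, 6


class IndependentAgent(nn.Module):
    """Hypothetical IPPO-style agent with a private actor and a private critic."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.actor = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(), nn.Linear(hidden, n_actions)
        )
        self.critic = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1)
        )

    def act(self, obs: torch.Tensor):
        """Sample an action from the policy and return it with its log-probability."""
        dist = Categorical(logits=self.actor(obs))
        action = dist.sample()
        return action, dist.log_prob(action)

    def value(self, obs: torch.Tensor) -> torch.Tensor:
        """State value estimated from this agent's own observation only."""
        return self.critic(obs).squeeze(-1)


# One fully separate network pair per agent (agent names are hypothetical).
agents = {name: IndependentAgent(OBS_DIM, N_ACTIONS) for name in ("agent_0", "agent_1")}
obs = {name: torch.randn(4, OBS_DIM) for name in agents}  # dummy batch of 4 observations

actions, values = {}, {}
for name, agent in agents.items():
    actions[name], _ = agent.act(obs[name])
    values[name] = agent.value(obs[name])
```

Because nothing is shared between the two agents, each one can be trained with an ordinary single-agent PPO update on its own trajectories.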

Multi-Agent Proximal Policy Optimization (MAPPO)

Each agent still learns its own actor network, but a centralized critic evaluates each agent's action while taking the other agent's action into account.
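One way such a critic could be wired up is sketched below. The input layout is an assumption made for illustration: the critic here receives an agent's observation concatenated with a one-hot encoding of the other agent's action. The class name `CentralizedCritic`, the layer sizes, and the observation size are hypothetical, and the repository's actual inputs may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder sizes (assumption): the real values depend on the KAZ configuration.
OBS_DIM, N_ACTIONS = 32, 6


class CentralizedCritic(nn.Module):
    """Hypothetical critic shared by both agents.

    Assumed input: the agent's own observation plus the other agent's action.
    """

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.n_actions = n_actions
        self.net = nn.Sequential(
            nn.Linear(obs_dim + n_actions, hidden), nn.Tanh(), nn.Linear(hidden, 1)
        )

    def forward(self, own_obs: torch.Tensor, other_action: torch.Tensor) -> torch.Tensor:
        # Encode the other agent's discrete action as a one-hot vector.
        other_onehot = F.one_hot(other_action, num_classes=self.n_actions).float()
        return self.net(torch.cat([own_obs, other_onehot], dim=-1)).squeeze(-1)


critic = CentralizedCritic(OBS_DIM, N_ACTIONS)

# Dummy batch of 4 transitions for two agents.
obs_0, obs_1 = torch.randn(4, OBS_DIM), torch.randn(4, OBS_DIM)
act_0 = torch.randint(0, N_ACTIONS, (4,))
act_1 = torch.randint(0, N_ACTIONS, (4,))

value_0 = critic(obs_0, act_1)  # agent 0 is evaluated given agent 1's action
value_1 = critic(obs_1, act_0)  # agent 1 is evaluated given agent 0's action
```

The actors stay decentralized in this sketch, so the extra information is needed only during training and each agent can still act from its own observation at play time.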

The architecture of this centralized critic is shown in the diagram centralized_critic.png.
