In this environment, two agents play table tennis. Each agent controls a racket to bounce a ball over the net. If an agent hits the ball over the net, it receives a reward of +0.1; if it lets the ball hit the ground or hits the ball out of bounds, it receives a reward of -0.01. The goal of each agent is therefore to keep the ball in play.
The observation space consists of 8 variables corresponding to the position and velocity of the ball and racket. Each agent receives its own local observation of the environment and has two actions available, corresponding to movement toward (or away from) the net and jumping.
The task is episodic. To solve the environment, the agents must achieve an average score of at least +0.5 over 100 consecutive episodes, where each episode's score is the maximum over the two agents.
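For concreteness, here is a minimal sketch of how the per-episode score and the 100-episode average could be computed; the helper names below are illustrative and not taken from the repository.

```python
from collections import deque
import numpy as np

def episode_score(agent_returns):
    """Per-episode score: the maximum of the agents' undiscounted returns."""
    return max(agent_returns)

def solved(episode_scores, window=100, target=0.5):
    """The environment counts as solved once the average score over the
    last `window` consecutive episodes reaches `target`."""
    recent = deque(episode_scores, maxlen=window)
    return len(recent) == window and np.mean(recent) >= target

# Example: 100 episodes in which the agents scored 0.6 and 0.4 respectively
scores = [episode_score((0.6, 0.4)) for _ in range(100)]
print(solved(scores))  # True, since the 100-episode average is 0.6 >= +0.5
```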
Download the MultiAgent repository using the top-right button. Alternatively, you can clone the repository from a terminal in your workspace directory using the following command:
git clone https://github.com/OlaAhmad/MultiAgent.git
To train and test the agents, go to the MultiAgent folder and open the Tennis.ipynb notebook:
cd MultiAgent
jupyter notebook Tennis.ipynb
When running the notebook, the agents start training for up to 3000 episodes. The actor and critic networks update their parameters at every time step, and the updated parameters of the trained architectures are saved in checkpoint files. The trained agents achieved a score of more than +0.5 over 100 consecutive episodes.
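The outer loop in the notebook roughly follows the pattern sketched below; note that the `MADDPG` constructor, the `env` object, and the checkpoint file names here are assumptions for illustration, not the exact interface used in the repository.

```python
import numpy as np
import torch

from MADDPG import MADDPG   # assumed import; see MADDPG.py in this repository

# `env` is assumed to be the Unity Tennis environment wrapper set up in the
# notebook; the constructor arguments below are illustrative.
agents = MADDPG(state_size=8, action_size=2, num_agents=2, seed=0)

scores = []
for episode in range(1, 3001):                 # up to 3000 episodes
    states = env.reset()                       # one local observation per agent
    episode_rewards = np.zeros(2)
    while True:
        actions = agents.act(states)           # each agent acts on its own observation
        next_states, rewards, dones = env.step(actions)
        # Learning step: the actor-critic networks update their parameters every time step
        agents.step(states, actions, rewards, next_states, dones)
        states = next_states
        episode_rewards += rewards
        if np.any(dones):
            break
    scores.append(np.max(episode_rewards))     # episode score = max over both agents

    # Save the updated network parameters to checkpoint files (names assumed)
    torch.save(agents.actor_local.state_dict(), 'checkpoint_actor.pth')
    torch.save(agents.critic_local.state_dict(), 'checkpoint_critic.pth')
```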
The repository contains the following files:
- model.py: defines the actor and critic neural network architectures.
- MADDPG.py: implements the multi-agent class based on the DDPG algorithm.
- Tennis.ipynb: notebook used to train the DDPG multi-agents over 3000 episodes.
- Trained actor and critic networks, saved in .pth format (a loading sketch follows this list).
- udacity/deep-reinforcement-learning
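For reference, here is a hedged sketch of how the saved .pth checkpoints could be reloaded for evaluation; the network below is a stand-in rather than the architecture from model.py, and the checkpoint file name is an assumption.

```python
import torch
import torch.nn as nn

# Stand-in actor network purely for illustration; the real architecture is
# defined in model.py and must match the saved checkpoint exactly.
actor = nn.Sequential(
    nn.Linear(8, 128), nn.ReLU(),
    nn.Linear(128, 2), nn.Tanh(),   # two actions bounded to [-1, 1]
)

# Load the trained weights from a checkpoint file (file name assumed)
state_dict = torch.load('checkpoint_actor.pth', map_location='cpu')
actor.load_state_dict(state_dict)
actor.eval()   # switch to evaluation mode before using the policy
```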

