This project trains a DQN agent to navigate a large square environment, collecting yellow bananas while avoiding blue ones.
The environment contains yellow and blue bananas; during training the agent receives a reward of +1 for successfully collecting a yellow banana and a reward of -1 for collecting a blue banana.
The state space has 37 dimensions and contains the agent's velocity, along with ray-based perception of objects in the agent's forward direction. The task is episodic, and given this information the agent must learn to select the best action in each state.
Four discrete actions are available:
- 0: move forward
- 1: move backward
- 2: turn left
- 3: turn right
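As an illustration of how an agent chooses among these four actions, here is a minimal epsilon-greedy sketch. The function name, the hard-coded sizes, and the example Q-values are illustrative, not the repository's actual API:

```python
import random

ACTION_SIZE = 4   # 0: forward, 1: backward, 2: turn left, 3: turn right
STATE_SIZE = 37   # velocity + ray-based perception (for reference)

def epsilon_greedy(q_values, eps):
    """With probability eps explore (random action); otherwise act greedily."""
    if random.random() < eps:
        return random.randrange(ACTION_SIZE)
    return max(range(ACTION_SIZE), key=lambda a: q_values[a])

# Greedy choice (eps = 0) always picks the action with the highest Q-value.
print(epsilon_greedy([0.1, 0.5, -0.2, 0.0], eps=0.0))  # → 1
```

During training, eps typically starts near 1 and decays toward a small floor so the agent explores early and exploits later.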
Download the Navigation repository using the top-right button. You can also clone the repository from a terminal in your workspace directory using the following command:
git clone https://github.com/OlaAhmad/Navigation.git
Go to the Navigation folder and open the Navigation notebook to train the DQN agent as follows:
cd Navigation
jupyter notebook Navigation.ipynb
When running the notebook, the agent trains over a number of episodes; training stops when all episodes have finished or when the average score exceeds 16. Once that target is reached, the trained model weights are saved to "model.pth".
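The stopping rule can be sketched as follows. This is a simplified stand-in, not the notebook's code: the window size of 100 episodes is an assumption, the scores are passed in as a plain list, and the actual weight saving (done with torch.save in the notebook) is shown only as a comment:

```python
from collections import deque

TARGET = 16.0  # average score at which training stops

def solved_episode(scores, window_size=100, target=TARGET):
    """Return the first episode whose rolling average exceeds target, else None."""
    window = deque(maxlen=window_size)
    for episode, score in enumerate(scores, start=1):
        window.append(score)
        if len(window) == window_size and sum(window) / window_size > target:
            # At this point the notebook would save the weights, e.g.:
            # torch.save(agent.qnetwork_local.state_dict(), 'model.pth')
            return episode
    return None  # target never reached within the given episodes

print(solved_episode([20.0] * 150))  # → 100 (first full window already above 16)
```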
This repository includes two Python modules that are used when running the notebook to train the DQN agent:
- model.py: builds the Q-Network that models the action policy.
- dqn_agent.py: defines the agent that interacts with the Banana environment and learns from it.
Both files were adapted from udacity/deep-reinforcement-learning.
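To give a sense of the kind of Q-Network model.py builds: a small fully connected network mapping the 37-dimensional state to 4 action values. The repository uses PyTorch; this NumPy version, with two hidden layers of 64 units chosen purely for illustration, shows only the forward pass with untrained random weights:

```python
import numpy as np

STATE_SIZE, HIDDEN, ACTION_SIZE = 37, 64, 4

rng = np.random.default_rng(0)
# Randomly initialised weights; DQN training would update these.
W1 = rng.normal(0, 0.1, (STATE_SIZE, HIDDEN)); b1 = np.zeros(HIDDEN)
W2 = rng.normal(0, 0.1, (HIDDEN, HIDDEN));     b2 = np.zeros(HIDDEN)
W3 = rng.normal(0, 0.1, (HIDDEN, ACTION_SIZE)); b3 = np.zeros(ACTION_SIZE)

def q_network(state):
    """Forward pass: 37-dim state -> one Q-value per action (ReLU hiddens)."""
    h1 = np.maximum(0, state @ W1 + b1)
    h2 = np.maximum(0, h1 @ W2 + b2)
    return h2 @ W3 + b3

q = q_network(np.zeros(STATE_SIZE))
print(q.shape)  # → (4,)
```

The agent then acts on these four Q-values, e.g. greedily or epsilon-greedily.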
