This project tackles the challenge of an ego vehicle navigating an uncontrolled four-way intersection where other road users have latent driving styles. SHIELD explores solutions using both Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) to enhance the safety and efficiency of autonomous vehicles in these complex scenarios.
In the SUMO simulation environment, these latent driving styles are configured through an "impatience" parameter on each road user: it is set to 1.0 for aggressive drivers and to "off" (or 0) for patient drivers. This parameter governs their behavior:
- Aggressive (Impatient) Vehicles: These drivers accelerate more aggressively and make more rapid decisions at the intersection.
- Patient (Cautious) Vehicles: These drivers behave more conservatively, with slower acceleration and more cautious decision-making.
This behavioral difference introduces dynamic uncertainty, requiring the ego vehicle to adapt its strategy based on the inferred driving styles of surrounding vehicles.
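In SUMO, such styles can be encoded as distinct vehicle types via the `impatience` attribute of a `vType` definition. A minimal sketch (the type IDs and file name are illustrative, not the project's actual configuration):

```xml
<!-- routes.rou.xml (illustrative): two vehicle types differing only in impatience -->
<routes>
    <!-- aggressive drivers: maximum impatience -->
    <vType id="aggressive" impatience="1.0"/>
    <!-- patient drivers: impatience disabled -->
    <vType id="patient" impatience="off"/>
</routes>
```

Vehicles in the route file then reference one of these types, so the ego vehicle cannot read the style directly and must infer it from observed behavior.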
To run this project, you will need to install the SUMO (Simulation of Urban MObility) software suite. On macOS, it can be installed via Homebrew:
- Update Homebrew: `brew update`
- Install XQuartz: `brew install --cask xquartz`
- Tap the SUMO repository: `brew tap dlr-ts/sumo`
- Install SUMO: `brew install sumo`
After installation, you may need to log out and back in to allow X11 to start automatically when running SUMO with a GUI. Alternatively, you can start XQuartz manually by pressing Cmd+Space and typing "XQuartz".
Finally, make sure to set the SUMO_HOME environment variable by adding the following line to your `.zshrc` or `.bashrc` file:

```bash
export SUMO_HOME="/opt/homebrew/opt/sumo/share/sumo"
```

This repository contains several Python scripts for running different simulations and training various models. Below are the commands to execute each of them.
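SUMO ships its Python bindings (`traci`, `sumolib`) under `$SUMO_HOME/tools`, so scripts that talk to the simulator typically append that directory to `sys.path` before importing them. A common preamble along these lines (a sketch; the helper name is ours and the repository's scripts may do this differently):

```python
import os
import sys

def add_sumo_tools_to_path(environ=os.environ):
    """Make SUMO's Python bindings importable.

    SUMO places traci and sumolib under $SUMO_HOME/tools, so we append
    that directory to sys.path before importing them.
    """
    home = environ.get("SUMO_HOME")
    if home is None:
        raise RuntimeError("Please set the SUMO_HOME environment variable.")
    tools = os.path.join(home, "tools")
    if tools not in sys.path:
        sys.path.append(tools)
    return tools
```

If `SUMO_HOME` is unset, failing early with a clear message is preferable to a confusing `ImportError` later.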
To run the simulation with a random policy for the ego vehicle, use the following command:
```bash
cd combined_mdp
python random_policy.py
```

To train the Deep Q-Network (DQN) model that considers the ego vehicle and up to 20 other vehicles in a single, high-dimensional MDP, run the following command:
```bash
cd combined_mdp
python dqn.py
```

To train the Deep Recurrent Q-Network (DRQN) model, which is designed to handle partially observable environments where other vehicles' driving styles are unknown, use the following command:
```bash
cd combined_pomdp
python drqn.py
```

To evaluate a trained model, you can use the corresponding evaluation script. For example, to evaluate the Combined DQN model, run:
```bash
cd combined_mdp
python dqn_eval.py
```

Similarly, to evaluate the Combined DRQN model, you would run:
```bash
cd combined_pomdp
python drqn_eval.py
```
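As context for the "up to 20 other vehicles in a single, high-dimensional MDP" above: a Q-network needs a fixed-size input, so a variable number of surrounding vehicles is commonly flattened into one zero-padded vector. A hypothetical sketch (the feature set and sizes are illustrative, not the repository's actual encoding):

```python
MAX_OTHERS = 20   # the combined DQN considers up to 20 other vehicles
FEATURES = 4      # e.g. x, y, speed, heading (illustrative choice)

def build_state(ego, others):
    """Flatten ego features plus up to MAX_OTHERS vehicles into one vector.

    Absent vehicle slots are zero-padded, so the state dimension is
    constant: FEATURES * (1 + MAX_OTHERS) = 84 with these sizes.
    """
    state = list(ego)
    padded = list(others[:MAX_OTHERS])
    padded += [(0.0,) * FEATURES] * (MAX_OTHERS - len(padded))
    for veh in padded:
        state.extend(veh)
    return state
```

The DRQN variant relaxes this picture: because driving styles are not observable, it processes *sequences* of such vectors with a recurrent network, letting the hidden state accumulate behavioral evidence over time.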