Contraction Actor-Critic (Contraction Metric Guided Reinforcement Learning for Robust Path-Tracking)
Contraction Actor-Critic (CAC) integrates a control contraction metric (CCM) into reinforcement learning (RL): the CCM serves as a critic for learning control policies that minimize cumulative tracking error under unknown dynamics. Given a pre-trained dynamics model, CAC simultaneously learns a contraction metric generator (CMG), which finds a Riemannian metric under which the system contracts, and uses an actor-critic algorithm to learn an optimal tracking policy guided by that contracting geometry. The code supports multiple benchmark environments (Car, PVTOL, NeuralLander, and Quadrotor) for extensive numerical simulations.
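As a rough illustration of this idea, the toy sketch below defines a positive-definite metric network M(x) and uses the resulting Riemannian tracking energy as a critic-like cost for a tracking policy. The networks, dimensions, and dynamics are all made up for illustration and are not the implementation in this repository.

```python
# Toy sketch only: all networks, dimensions, and dynamics below are illustrative.
# A metric network outputs a positive-definite M(x); the Riemannian tracking energy
# e^T M(x) e of the next state is used as a critic-like cost for the policy.
# (The actual method also enforces contraction conditions on M, omitted here.)
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 4, 2

class MetricNet(nn.Module):
    """M(x) = L(x) L(x)^T + eps * I, positive definite by construction."""
    def __init__(self, eps=1e-2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.Tanh(),
                                 nn.Linear(64, STATE_DIM * STATE_DIM))
        self.eps = eps

    def forward(self, x):
        L = self.net(x).view(-1, STATE_DIM, STATE_DIM)
        return L @ L.transpose(1, 2) + self.eps * torch.eye(STATE_DIM)

policy = nn.Sequential(nn.Linear(2 * STATE_DIM, 64), nn.Tanh(),
                       nn.Linear(64, ACTION_DIM))            # u = pi(x, x_ref)
metric = MetricNet()
optimizer = torch.optim.Adam([*policy.parameters(), *metric.parameters()], lr=3e-4)

A = 0.9 * torch.eye(STATE_DIM)                               # toy known discrete-time dynamics
B = 0.1 * torch.randn(STATE_DIM, ACTION_DIM)

x = torch.randn(32, STATE_DIM)                               # batch of states
x_ref = torch.zeros(32, STATE_DIM)                           # reference to track
u = policy(torch.cat([x, x_ref], dim=-1))
x_next = x @ A.T + u @ B.T                                   # one-step rollout
e = (x_next - x_ref).unsqueeze(-1)                           # tracking error, shape (B, n, 1)
energy = (e.transpose(1, 2) @ metric(x_next) @ e).mean()     # Riemannian tracking energy
optimizer.zero_grad()
energy.backward()
optimizer.step()
```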
- Minjae Cho - The Grainger College of Engineering, University of Illinois Urbana-Champaign
- Hiroyasu Tsukamoto - The Grainger College of Engineering, University of Illinois Urbana-Champaign
- Huy T. Tran - The Grainger College of Engineering, University of Illinois Urbana-Champaign
If you have any questions about this project, please reach out to Minjae.
In any local folder, open a terminal and run the following commands to download our package into the current directory:
git clone https://github.com/Mgineer117/CAC/
cd CAC
We assume that you have Conda installed. If not, please refer to the Anaconda installation guide. Our code is compatible with Python version > 3.10. We recommend creating a dedicated virtual environment as follows:
conda create -n CAC python=3.12
conda activate CAC
Then, install the required Python packages using:
pip install -r requirements.txt
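After installation, a quick sanity check like the following (assuming PyTorch and NumPy are among the pinned requirements) confirms that the environment resolves correctly:

```python
# Quick sanity check of the freshly created environment.
# PyTorch and NumPy are assumed to be in requirements.txt; adjust if they are not.
import numpy as np
import torch

print("NumPy:", np.__version__)
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```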
We describe our code structure for those who want to develop it further for their research.
├── config **[Contains environment-specific parameters for each algorithm]**
├── envs **[Contains the simulator for each environment]**
├── log **[Contains loggers for both TensorBoard and WandB]**
├── model **[Contains the NeuralLander dynamics parameters; also used to load other model parameters]**
├── policy **[Contains the implementations of the algorithms]**
│ ├── layers **[Contains the backbone networks of the algorithms]**
│ ├── base.py
│ ├── c3m.py
│ ├── cac.py
│ ├── lqr.py
│ ├── ppo.py
│ └── sd_lqr.py
├── trainer **[Contains the online trainer for training algorithms]**
├── utils **[Contains utility functions and the sampler for collecting online data]**
├── README.md
├── requirements.txt
├── evaluate.py **[Contains code to evaluate a given *.pth model]**
└── main.py **[Main file to run experiments]**
Our codebase uses commands of the following form to set the run parameters,
python3 main.py --task car --algo-name cac-approx
or
python3 main.py --task neurallander --algo-name cac
where all arguments are written in lowercase, and the -approx suffix after the algorithm name indicates that the algorithm is run with approximated dynamics (i.e., under the unknown-dynamics assumption).
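For batch experiments, a small launcher script such as the one below can loop over the documented --task and --algo-name flags. It is not part of the repository, and the task names are lowercased guesses from the environment list above, so verify them against the config folder.

```python
# Illustrative launcher (not part of the repository): runs main.py for several
# task/algorithm combinations using only the documented --task and --algo-name flags.
import itertools
import subprocess

tasks = ["car", "pvtol", "neurallander", "quadrotor"]  # assumed task names; verify against config/
algos = ["cac", "cac-approx"]

for task, algo in itertools.product(tasks, algos):
    subprocess.run(
        ["python3", "main.py", "--task", task, "--algo-name", algo],
        check=True,
    )
```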
After training, the *.pth file for that algorithm is generated in the logging folder. One can evaluate the model by moving it to the model/ directory and running evaluate.py.
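Before running evaluate.py, one can confirm that a checkpoint loads at all. The snippet below is a generic PyTorch check with a placeholder filename, not the repository's evaluation code.

```python
# Generic checkpoint inspection; the filename is a placeholder for your trained model.
import torch

checkpoint = torch.load("model/your_trained_model.pth", map_location="cpu")
if isinstance(checkpoint, dict):
    # A state_dict-style checkpoint: list parameter names and shapes.
    for name, value in checkpoint.items():
        print(name, getattr(value, "shape", type(value)))
else:
    # A whole module saved with torch.save(model, path).
    print(type(checkpoint))
```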
One can refer to the robot experiment result here.
We support three logging options—Weights & Biases (WandB), TensorBoard, and CSV—to accommodate different user preferences. Specifically, when WandB is properly configured on your local machine, all algorithmic and parameter settings, along with real-time training metrics, are automatically logged to your WandB dashboard. Simultaneously, training results are saved locally in TensorBoard format for visualization, and evaluation metrics are exported as CSV files for easy analysis. In addition, model parameters are saved throughout training, along with the best-performing checkpoint.
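The CSV export also makes post-processing straightforward; for example (assuming pandas is available, and with a placeholder path, since the exact file name depends on your run):

```python
# Illustrative post-processing of the exported CSV metrics; adjust the path to your run's output.
import pandas as pd

df = pd.read_csv("path/to/your_run/eval_metrics.csv")  # placeholder path
print(df.columns.tolist())   # inspect which metrics were logged
print(df.describe())         # quick summary statistics
```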
This project is licensed under the MIT License - see the LICENSE.md file for details
