DrAC

Official repository for the NeurIPS 2025 publication "Learning Intractable Multimodal Policies with Reparameterization and Diversity Regularization".

This repository implements our proposed algorithm, diversity regularized actor critic (DrAC), which learns multimodal actors with diversity regularization.

Specifically, this repository contains implementations of an amortized actor and a diffusion actor. Such actors can express complex multimodal decision distributions, which is critical in some domains. Our method overcomes the intractability of such actors and elicits diverse multimodal behaviors efficiently and effectively.

Subfigures 1 and 3 visualize the behavior of the amortized actor and the diffusion actor trained by our algorithm.

DrAmort (DrAC with an amortized actor) also exhibits competitive performance on standard MuJoCo benchmarks.

Installation

Run pip install -r requirements.txt (recommended Python version: 3.12).

Run

Commands to run our algorithm follow this format:

python train.py {algo} --task {task} --options

{algo} and {task} must be specified.

{algo} should be one of "SAC", "DrAC", or "DACER".
{task} should be one of "MultiGoalPointMaze", "MarioLevelGen", or any MuJoCo environment.
Any --options you specify override the default configuration or hyperparameter of the same name defined in rl/config.yaml.
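
As a minimal sketch of the override behavior described above (an illustration, not the repository's actual code), the following Python snippet merges command-line values into a dictionary of defaults. The function name apply_overrides and the dictionary layout are hypothetical; the two default values are the MultiGoalPointMaze defaults this README cites from rl/config.yaml.

```python
# Illustrative sketch only: this is NOT the repository's implementation.
def apply_overrides(defaults, overrides):
    """Return a copy of `defaults` with same-named keys replaced by `overrides`.

    Entries whose value is None (option not given on the command line) are
    skipped; keys that do not exist in `defaults` raise a KeyError.
    """
    merged = dict(defaults)  # copy so the defaults themselves stay untouched
    for key, value in overrides.items():
        if value is None:
            continue
        if key not in merged:
            raise KeyError(f"unknown option: {key}")
        merged[key] = value
    return merged


# Hypothetical stand-in for the MultiGoalPointMaze defaults in rl/config.yaml.
maze_defaults = {"maze_map": "simple", "beta": 0.8}

# Values as they might arrive from --beta 0.6 --maze_map hard.
config = apply_overrides(maze_defaults, {"beta": 0.6, "maze_map": "hard"})
print(config)
```

Copying the defaults before merging keeps the original values available, so a later run without options still sees maze_map=simple and beta=0.8.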

For example, suppose you want to run our algorithm with a customized temperature ($\beta$ in the paper) of 0.6 on the hard map of MultiGoalPointMaze. The default maze map of the MultiGoalPointMaze environment is maze_map=simple, defined on line 38 of rl/config.yaml, and the default temperature for MultiGoalPointMaze is beta=0.8, defined on line 41 of rl/config.yaml. The complete command to run is then:

python train.py DrAC --task MultiGoalPointMaze --beta 0.6 --maze_map hard
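
To try several temperatures, commands in the same format can be generated programmatically. The sketch below only builds command strings from the flags shown in this section and does not launch training. The helper build_command and the particular beta values are hypothetical and chosen purely for illustration.

```python
import shlex


def build_command(algo, task, **options):
    """Build a `python train.py {algo} --task {task} --options` command string."""
    args = ["python", "train.py", algo, "--task", task]
    for name, value in options.items():
        # Each keyword argument becomes a `--name value` pair.
        args += [f"--{name}", str(value)]
    return shlex.join(args)


# Illustrative sweep over the temperature on the hard maze map.
commands = [
    build_command("DrAC", "MultiGoalPointMaze", beta=beta, maze_map="hard")
    for beta in (0.4, 0.6, 0.8)
]
for command in commands:
    print(command)
```

Each printed line can be pasted into a shell or passed to a job scheduler of your choice.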

Citation

bibtex:

@inproceedings{wanglearning,
  title={Learning Intractable Multimodal Policies with Reparameterization and Diversity Regularization},
  author={Wang, Ziqi and Liu, Jiashun and Pan, Ling},
  booktitle={Annual Conference on Neural Information Processing Systems},
  year={2025}
}

Text:

Wang, Ziqi, Jiashun Liu, and Ling Pan. "Learning Intractable Multimodal Policies with Reparameterization and Diversity Regularization." Annual Conference on Neural Information Processing Systems. 2025.
