Code and Video of EASI: Evolutionary Adversarial Simulator Identification for Sim-to-Real Transfer

This repository contains the code and video demonstrations for the NeurIPS 2024 conference submission "EASI: Evolutionary Adversarial Simulator Identification for Sim-to-Real Transfer".

For the sim-to-real experiments, we used the Unitree Go2 as the experimental platform.

The introduction video for the sim-to-real experimental results.

Our page: OUR PAGE.

Dependencies

isaacgym Preview 4 Release, torch 1.13.1, numpy 1.19.0, tensorboard, argparse.
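A quick way to confirm the pinned versions is to compare them with what is installed. The snippet below is a minimal sketch using only the standard library; the helper name `find_mismatches` is hypothetical and not part of this repository, and isaacgym is left out because the package name and version it registers are not stated here.

```python
from importlib import metadata

# Pinned versions copied from the dependency list above.
# isaacgym is omitted: it is installed from the Preview 4 release, and the
# distribution name and version it registers are not stated in this README.
REQUIRED = {"torch": "1.13.1", "numpy": "1.19.0"}


def find_mismatches(required):
    """Return {package: (wanted, found)} for every missing or mismatched package."""
    mismatches = {}
    for name, wanted in required.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        # Local build tags such as "1.13.1+cu117" still count as a match.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(REQUIRED).items():
        print(f"{name}: wanted {wanted}, found {found or 'not installed'}")
```

If nothing is printed, the installed torch and numpy versions match the ones listed above.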

Test

Run python isaac_gym_env.py. If everything is set up correctly, several robot tasks are shown on the screen.

To switch between robotic tasks, change env_name in the if __name__ == "__main__": block of isaac_gym_env.py.

Run

All steps are listed in scripts.sh. Run these commands from the ~/your_location/EASI directory.

1. Train a base policy using Domain Randomization.

train_policy/train_DR_Uniform.py trains the DR policy and writes it, together with the training information, to logs/your_env/SAC_DR/seed0-time

args:

env_id: Task to run. One of 'Ant', 'Cartpole', 'Ballbalance'.
num_steps: Total number of environment steps during training.
eval_interval: Interval at which training performance is recorded.
log_mark: A string that labels this experiment.
seed: Random seed.
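As an illustration of what uniform Domain Randomization means in this step, the sketch below draws each simulator parameter uniformly from a fixed range at the start of every episode. The parameter names and ranges are hypothetical placeholders, not the ones used in train_policy/train_DR_Uniform.py.

```python
import random

# Hypothetical parameter ranges, for illustration only; the real ranges
# are defined in the repository's environment code.
PARAM_RANGES = {
    "mass_scale": (0.5, 1.5),
    "friction": (0.2, 1.2),
}


def sample_uniform_params(ranges, rng):
    """Draw one value per simulator parameter, uniformly within its range."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


rng = random.Random(0)  # fixed seed, mirroring the `seed` argument
for episode in range(3):
    params = sample_uniform_params(PARAM_RANGES, rng)
    # A training loop would reset the simulator with `params` here.
    print(episode, params)
```

Because every episode sees a different draw, the resulting policy is trained against the whole range rather than a single parameter setting.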

2. Collect state-action transition demonstrations in the 'real' environment.

train_policy/collect_demo.py saves the demonstrations to logs/your_env/demonstration/WD/sizexxx_traj_lengthxxx_real_domain_cpu_seed_x.pth

args:

env_id: Task to run. One of 'Ant', 'Cartpole', 'Ballbalance'.
trajectory_length: Length of each task trajectory.
collect_steps: Total number of demonstration steps to collect.
expert_weight: Policy used to sample state transitions in the environment.
seed: Random seed.
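The relationship between collect_steps and trajectory_length can be sketched as follows. This is a toy illustration, not the logic of train_policy/collect_demo.py: the 1-D dynamics, the policy, and the function names are all stand-ins, and the result is kept in memory instead of being saved to a .pth file.

```python
import random


def toy_step(state, action):
    """Stand-in dynamics: a 1-D point pushed by the action."""
    return state + 0.1 * action


def toy_policy(state):
    """Stand-in for the trained policy: push the point toward zero."""
    return -state


def collect_transitions(collect_steps, trajectory_length, seed):
    """Gather `collect_steps` transitions, resetting every `trajectory_length` steps."""
    rng = random.Random(seed)
    transitions = []
    state = None
    for step in range(collect_steps):
        if step % trajectory_length == 0:
            state = rng.uniform(-1.0, 1.0)  # start a new trajectory
        action = toy_policy(state)
        next_state = toy_step(state, action)
        transitions.append((state, action, next_state))
        state = next_state
    return transitions


demo = collect_transitions(collect_steps=10, trajectory_length=5, seed=0)
print(len(demo))
```

Each stored tuple is one (state, action, next state) transition, which is the kind of data a discriminator can compare against simulated transitions.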

3. Use EASI to identify the simulator parameters.

Search_gail_Gaussian.py writes the mean and variance of the environment parameters to logs/your_env/search_gaussian/WDWD/seed_x

args:

env_id: Task to run. One of 'Ant', 'Cartpole', 'Ballbalance'.
tag: A string that labels this experiment.
expert_data: Demonstrations used to train the discriminator.
expert_weight: Policy used to sample state transitions in the environment.
trajectory_length: Length of each task trajectory.
seed: Random seed.
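To give a feel for how an evolutionary search can end with a parameter mean and variance, here is a minimal cross-entropy-style sketch over a diagonal Gaussian. It is an illustration under stated assumptions, not the update rule in Search_gail_Gaussian.py: the fitness is a stand-in (distance to a made-up target), whereas EASI scores candidates with a discriminator trained on the demonstrations, and all names and hyperparameters below are hypothetical.

```python
import random
import statistics

# Hypothetical "real" parameters; in EASI these are unknown and the score
# would come from a discriminator trained on the demonstration data.
TARGET = [0.5, -0.3]


def stand_in_fitness(params):
    """Higher is better: negative squared distance to TARGET."""
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


def gaussian_search(dim, iterations=40, population=64, elite=16, seed=0):
    """Refit a diagonal Gaussian to the best-scoring samples each generation."""
    rng = random.Random(seed)
    mean, std = [0.0] * dim, [1.0] * dim
    for _ in range(iterations):
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(population)]
        samples.sort(key=stand_in_fitness, reverse=True)
        best = samples[:elite]
        for i in range(dim):
            column = [sample[i] for sample in best]
            mean[i] = statistics.fmean(column)
            # The floor keeps the distribution from collapsing to a point.
            std[i] = max(statistics.pstdev(column), 1e-3)
    var = [s * s for s in std]
    return mean, var


mean, var = gaussian_search(dim=2)
print(mean, var)
```

The returned mean drifts toward the best-scoring region while the variance shrinks, which is the general shape of result that a later training step can sample from.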

4. Train a new policy with the EASI parameters.

train_policy/train_DR_Search.py trains the EASI policy and writes it, together with the training information, to logs/your_env/SAC_Search/seed0-time

args:

env_id: Task to run. One of 'Ant', 'Cartpole', 'Ballbalance'.
num_steps: Total number of environment steps during training.
eval_interval: Interval at which training performance is recorded.
log_mark: A string that labels this experiment.
seed: Random seed.
search_params_dir: Directory containing the EASI parameter information.
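Compared with step 1, the difference in this step is where the simulator parameters come from: instead of a uniform range, they can be drawn from the Gaussian found by the search. The sketch below assumes a hypothetical in-memory layout for the search output and clips samples to bounds; neither the layout nor the clipping is confirmed by this README.

```python
import math
import random

# Hypothetical search output; the real values are read from search_params_dir.
SEARCHED = {
    "mass_scale": {"mean": 1.1, "var": 0.01, "bounds": (0.5, 1.5)},
    "friction": {"mean": 0.6, "var": 0.04, "bounds": (0.2, 1.2)},
}


def sample_searched_params(searched, rng):
    """Draw each parameter from N(mean, var), clipped to its bounds."""
    params = {}
    for name, spec in searched.items():
        low, high = spec["bounds"]
        value = rng.gauss(spec["mean"], math.sqrt(spec["var"]))
        params[name] = min(max(value, low), high)  # clipping is an assumption
    return params


rng = random.Random(0)  # fixed seed, mirroring the `seed` argument
print(sample_searched_params(SEARCHED, rng))
```

Note that `random.gauss` takes a standard deviation, so the stored variance is converted with a square root before sampling.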

5. Evaluate

Use train_policy/evaluate_target_domain.py to evaluate policies in the target domain.

Detailed parameter descriptions are provided in the code.

The code is based on gail-airl-ppo.pytorch.
