Yangcen Liu* · Woo Chul Shin* · Yunhai Han · Zhenyang Chen ·
Harish Ravichandar · Danfei Xu
Georgia Institute of Technology · CoRL 2025 (Oral)
(* equal contribution)
Paper · Project Page · Code · Data · Video
ImMimic is an embodiment-agnostic co-training framework that leverages abundant human videos together with a small set of teleoperated robot demonstrations. It bridges the human–robot domain gap via (1) retargeted human hand trajectories as action supervision, (2) DTW mapping between human and robot trajectories (action- or visual-based), and (3) MixUp interpolation in latent and action space, which creates intermediate domains for adaptation.
Pipeline:
- Collect robot demonstrations (teleoperation).
- Extract human actions from videos and retarget to the robot action space.
- Map human ↔ robot trajectories with DTW (action-based or visual-based).
- Apply MixUp interpolation to paired trajectories in latent and action space.
- Co-train diffusion policy on robot demos + interpolated human data.
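The MixUp step above can be sketched as follows. This is a minimal illustration, assuming DTW has already paired the frames; `mixup_pair` and the Beta(α, α) mixing coefficient follow the standard MixUp recipe, not the repo's exact code:

```python
import numpy as np

def mixup_pair(robot_traj, human_traj, alpha=0.2, rng=None):
    """Convexly blend a DTW-paired robot/human trajectory pair.

    Both arrays are (T, A) after DTW alignment and retargeting; the same
    recipe applies to latent features. A sketch, not the repo's code.
    """
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)          # MixUp coefficient in [0, 1]
    return lam * robot_traj + (1.0 - lam) * human_traj

# Toy example: two aligned action trajectories of shape (T, A)
robot = np.zeros((5, 7))
human = np.ones((5, 7))
mixed = mixup_pair(robot, human)
```

Each mixed trajectory lies between the human and robot domains, giving the policy a smooth path for domain adaptation.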
- Create the conda env
```bash
conda create -n immimic python=3.10
conda activate immimic
```
- Install MuJoCo
```bash
pip install "mujoco==3.3.0"
```
- Install PyTorch
```bash
pip install torch==2.6.0 torchvision==0.21.0
```
- Install robosuite v1.5.1
```bash
git clone https://github.com/ARISE-Initiative/robosuite.git
cd robosuite
git checkout v1.5.1
pip install -e . --no-deps
cd ..
```
- Install robomimic
```bash
cd robomimic
pip install -e .
cd ..
```
- Install the remaining requirements
```bash
pip install -r requirements.txt
pip install diffusers==0.36.0
pip install huggingface-hub==0.30.2
```
```
Group: data
└── Group: demo_t
    ├── Dataset: action_absolute (T, A)
    │     # [0:3] eef_position (x, y, z)
    │     # [3:6] eef_orientation_axis_angle (rx, ry, rz)
    │     # [6:A] hand DOF (gripper-dependent)
    │
    └── Group: obs
        ├── Dataset: agentview_image (T, H=180, W=320, 3)
        ├── Dataset: eef_pose_w_gripper (T, P)
        │     # [0:3] eef_position (x, y, z)
        │     # [3:7] eef_orientation_quaternion (w, x, y, z)
        │     # [7:P] hand DOF
        └── Dataset: wrist_image (T, H=180, W=320, 3)

Group: mask
├── Dataset: train (N_train,)  # demo keys
└── Dataset: valid (N_valid,)  # demo keys

# Hand DOF convention:
#   Ability Hand : 6 DOF
#   Allegro Hand : 16 DOF
#   Fin Ray      : 1 DOF
#   Robotiq      : 1 DOF
#
# Therefore:
#   A = 6 + hand_dof
#   P = 7 + hand_dof
```
- Each demo_t is a variable-length trajectory of length T.
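Given the shape convention above, the per-component slices of `action_absolute` can be recovered like this (a numpy sketch; `hand_dof` depends on the gripper, e.g. 1 for Robotiq, 6 for the Ability Hand):

```python
import numpy as np

hand_dof = 6                 # Ability Hand (see Hand DOF convention above)
A = 6 + hand_dof             # action dimension
T = 10                       # trajectory length (varies per demo)
actions = np.zeros((T, A))   # stand-in for data/demo_t/action_absolute

eef_pos = actions[:, 0:3]    # eef_position (x, y, z)
eef_axa = actions[:, 3:6]    # eef_orientation_axis_angle (rx, ry, rz)
hand    = actions[:, 6:A]    # hand DOF
```

The same pattern applies to `eef_pose_w_gripper` with `P = 7 + hand_dof`, since the orientation there is a 4-element quaternion instead of a 3-element axis-angle.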
- A sample dataset can be found here. Place it under
```
robomimic/robomimic/datasets
```
- Compute action stats
```bash
python robomimic/robomimic/datasets/utils/compute_action_stats.py \
    --hdf5_path robomimic/robomimic/datasets/umi_pick_place_0_5.hdf5 \
    --output_path robomimic/robomimic/datasets/umi_pick_place_action_stats.json
```
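The stats file holds per-dimension statistics used to normalize actions before training. A minimal sketch of what such a computation looks like (the exact keys written by `compute_action_stats.py` are assumptions here, not the script's actual output):

```python
import numpy as np

def compute_action_stats(actions):
    """actions: (N, A) array of concatenated absolute actions from all demos.

    Returns per-dimension statistics; a sketch, not the repo's script.
    """
    return {
        "min": actions.min(axis=0).tolist(),
        "max": actions.max(axis=0).tolist(),
        "mean": actions.mean(axis=0).tolist(),
        "std": actions.std(axis=0).tolist(),
    }

stats = compute_action_stats(np.array([[0.0, 1.0], [2.0, 3.0]]))
```

The same stats JSON is passed to the rollout script below via `--norm_path`, so train-time and test-time normalization match.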
- Map human-robot data using DTW (to be updated)
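Action-based DTW can be sketched as follows: it finds a monotone alignment between a retargeted human trajectory and a robot trajectory so that paired frames can later be MixUp-interpolated. A minimal textbook implementation, not the repo's (to-be-released) script:

```python
import numpy as np

def dtw_path(a, b):
    """a: (Ta, D), b: (Tb, D). Returns the DTW alignment as (i, j) pairs."""
    Ta, Tb = len(a), len(b)
    cost = np.full((Ta + 1, Tb + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # frame-wise distance
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack from the end to recover the optimal warping path
    i, j, path = Ta, Tb, []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

human = np.linspace(0, 1, 8)[:, None]   # toy retargeted human trajectory (8, 1)
robot = np.linspace(0, 1, 5)[:, None]   # toy robot trajectory (5, 1)
path = dtw_path(human, robot)
```

Visual-based DTW follows the same dynamic program with image-feature distances in place of action-space distances.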
- Co-train the policy
```bash
python -u robomimic/robomimic/scripts/train.py --config robomimic/robomimic/exps/configs/umi_pick_place_100_5.json
```
- Install gello_software to run the rollout code, then roll out a trained policy:
```bash
python robomimic/robomimic/scripts/run_trained_policy.py \
    --ckpt_path robomimic/robomimic/exps/policy_trained_models/umi_pick_place/test/20260104222851/models/model_epoch_300.pth \
    --norm_path robomimic/robomimic/datasets/umi_pick_place_action_stats.json
```
