Unitree RL GYM

Reinforcement learning for Unitree robots (Go2, H1, H1_2, G1) with Isaac Gym and Mujoco support.

Now with Google Colab training support - train without a local GPU!

(Demo videos: Isaac Gym | Mujoco | Physical robot)

🆕 What's New

🚀 NEW: GPU-Accelerated Training with JAX/MJX

  • 8-10x Faster Training - GPU-accelerated physics with MuJoCo MJX on Colab T4 (FREE!)
  • 🎮 1024+ Parallel Environments - Train with 1024-8192 envs on a single GPU
  • 🔥 JAX-Powered - JIT compilation and vectorization for maximum performance
  • 📊 G1 Robot Ready - Fully ported with phase-based gait control
  • ⏱️ Train in 1.5 hours - Instead of 13+ hours with CPU
  • 🆓 Free on Colab T4 - No local GPU required
  • 📓 Ready-to-use Colab notebook - Start training in one click!

Existing Features: Mujoco-based Training

This fork also includes PyTorch/Mujoco training for Google Colab:

  • ✅ Train on Google Colab free tier (T4 GPU with PyTorch)
  • ✅ No Isaac Gym installation required for training
  • ✅ Visualize trained policies locally with Mujoco
  • ✅ Resume training across multiple Colab sessions
  • ✅ PyTorch Colab notebook also available
  • ✅ Compatible with existing deployment tools

📖 Background

Original Framework: Unitree RL Gym

This repository is based on Unitree RL Gym, an excellent framework for training locomotion policies on Unitree robots using Isaac Gym.

Original Features:

  • Isaac Gym-based training (4096 parallel environments on GPU)
  • Supports Go2, H1, H1_2, G1 robots
  • Sim2Real deployment
  • Pre-trained models

Limitations:

  • Requires local GPU with Isaac Gym
  • Isaac Gym doesn't work on Google Colab
  • Isaac Gym is legacy (superseded by Isaac Lab)

Our Modification: Unitree RL MuGym

We've extended the framework with Mujoco-based training environments that:

  • Run on Google Colab (no local GPU needed)
  • Use Mujoco physics instead of Isaac Gym
  • Support CPU training (slower but accessible)
  • Maintain compatibility with the original deployment pipeline

What's New:

  • 🚀 JAX/MJX GPU Training:
    • MJXLeggedRobot - GPU-accelerated JAX/MJX base environment (8-10x faster than the PyTorch/CPU path)
    • MJXG1Robot - G1 with phase-based gait, fully vectorized for GPU
    • train_jax_ppo.py - Brax PPO training with 2048-8192 parallel environments
    • Environment registry for easy robot instantiation
    • Complete documentation and test suite
  • PyTorch/CPU Training:
    • MujocoLeggedRobot - Base Mujoco environment class with rsl_rl compatibility
    • MujocoG1Robot - G1-specific Mujoco implementation with phase-based gait
    • train_mujoco.py - Colab-compatible training script with robust config handling
    • train_g1_mujoco_colab.ipynb - Complete Colab notebook with step-by-step instructions
    • ObservationDict - Custom dict class supporting .to(device) for rsl_rl
    • Optional Isaac Gym imports - Framework works without Isaac Gym installed
    • Fallback math functions - Pure PyTorch implementations of Isaac Gym utilities

Technical Improvements:

  • ✅ XML model loading (URDF → Mujoco XML with proper actuators)
  • ✅ Dictionary-based observations (policy/critic groups for rsl_rl)
  • ✅ Hybrid config structure (both nested and flat for rsl_rl compatibility)
  • ✅ Robust DOF detection (actuators → joints → config validation)
  • ✅ PD control mapping (automatic gain assignment from config)
  • ✅ Phase-based rewards (encouraging natural bipedal gait)
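The PD control mapping above amounts to the standard joint-space PD law, with per-joint gains read from the config. A minimal numpy sketch (function and variable names are illustrative, not the repo's exact API):

```python
import numpy as np

def pd_torques(q_des, q, qd, kp, kd):
    """Standard joint-space PD law: tau = kp * (q_des - q) - kd * qd.
    All arguments are arrays of shape (num_dofs,); kp/kd come from config."""
    return kp * (q_des - q) - kd * qd

# Example: 12-DOF robot with uniform gains (values are made up)
kp = np.full(12, 40.0)   # per-joint stiffness
kd = np.full(12, 1.0)    # per-joint damping
tau = pd_torques(np.zeros(12), np.full(12, 0.1), np.zeros(12), kp, kd)
```

Here the policy outputs the desired joint positions `q_des`, and the simulator applies the resulting torques each control step.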

What's Preserved:

  • Original Isaac Gym environments (still work if you have local GPU)
  • Deployment scripts (Sim2Sim, Sim2Real)
  • Configuration system and hyperparameters
  • Reward functions and observation spaces
  • Pre-trained models and checkpoint formats

🚀 Quick Start

⚡ Option 1: GPU-Accelerated Training with JAX/MJX on Colab (Fastest - NEW!)

8-10x faster than CPU training - FREE on Google Colab T4!

Open In Colab

  1. Open Notebook: Click badge above or upload notebooks/train_g1_jax_colab.ipynb to Colab
  2. Enable T4 GPU: Runtime > Change runtime type > T4 GPU
  3. Run All Cells: Training completes in ~1.5 hours (vs 13+ hours with CPU!)
  4. Download Model: Checkpoints saved every 50 iterations

Benefits:

  • Train in 1.5 hours instead of 13 hours (on free T4 GPU)
  • 1024 parallel environments (vs 256 with PyTorch/CPU)
  • Full G1 humanoid support with phase-based gait
  • No local GPU required - completely cloud-based
  • Compatible with existing deployment tools

Performance on Colab T4:

  • 1000 iterations: ~1.5 hours (vs ~13 hours PyTorch/CPU)
  • Speedup: 8.7x faster
  • Memory usage: ~12GB (fits comfortably on T4)

🐢 Option 2: Train on Google Colab (CPU/GPU, No JAX required)

  1. Open Colab Notebook

    • Upload notebooks/train_g1_mujoco_colab.ipynb to Google Colab
    • Or open directly: Open In Colab
  2. Select GPU Runtime

    • Go to Runtime > Change runtime type > Hardware accelerator > GPU (T4)
    • Free tier provides ~13 hours of training time
  3. Run All Cells

    • Training takes ~13 hours for 10,000 iterations (can stop earlier)
    • Checkpoints saved every 500 iterations
    • TensorBoard available for real-time monitoring
    • Models automatically downloadable
  4. Visualize Locally

    # Install on local machine
    pip install mujoco==3.2.3 torch pyyaml
    
    # Run visualization
    python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml \
        --policy /path/to/downloaded/model.pt

💻 Option 3: Train Locally with PyTorch (CPU or GPU)

# Clone repository
git clone https://github.com/julienokumu/unitree_rl_mugym.git
cd unitree_rl_mugym

# Install dependencies (Mujoco-only, no Isaac Gym)
pip install mujoco==3.2.3 scipy pyyaml tensorboard rsl-rl-lib torch
pip install -e .

# Train policy
python legged_gym/scripts/train_mujoco.py \
    --task g1_mujoco \
    --num_envs 256 \
    --max_iterations 10000 \
    --device cpu  # or 'cuda' if you have a GPU

# Visualize
python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml

📚 Documentation


🤖 Supported Robots

| Robot | Mujoco Support | Isaac Gym Support | DOF | Type      |
|-------|----------------|-------------------|-----|-----------|
| G1    | ✅ Yes          | ✅ Yes             | 12  | Humanoid  |
| H1    | 🚧 Coming Soon  | ✅ Yes             | 12  | Humanoid  |
| H1_2  | 🚧 Coming Soon  | ✅ Yes             | 12  | Humanoid  |
| Go2   | 🚧 Coming Soon  | ✅ Yes             | 12  | Quadruped |

Currently, only G1 has JAX/MJX support. Other robots can be added by following the implementation pattern.
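The "implementation pattern" relies on the environment registry mentioned under What's New: each robot class is registered under a task name, and the training scripts instantiate it by name. A minimal sketch of the pattern (names are illustrative, not the fork's actual registry API):

```python
_ENV_REGISTRY = {}

def register_env(name):
    """Decorator that records an environment class under a task name."""
    def wrap(cls):
        _ENV_REGISTRY[name] = cls
        return cls
    return wrap

def make_env(name, **kwargs):
    """Instantiate a registered environment by task name."""
    if name not in _ENV_REGISTRY:
        raise KeyError(f"Unknown task '{name}'; known: {sorted(_ENV_REGISTRY)}")
    return _ENV_REGISTRY[name](**kwargs)

@register_env("g1_mujoco")
class MujocoG1Robot:
    def __init__(self, num_envs=256):
        self.num_envs = num_envs
```

Adding a new robot then means writing the environment class and registering it under a new task name, e.g. `@register_env("go2_mujoco")`.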


📦 Installation

Option 1: Google Colab Training (No Local GPU Required)

  1. Open the Colab Notebook: train_g1_mujoco_colab.ipynb
  2. Enable GPU: Runtime > Change runtime type > GPU (T4)
  3. Run all cells - training takes ~2 hours per 50 iterations
  4. Download trained policy and visualize locally

Local visualization only (no training):

pip install mujoco==3.2.3 torch pyyaml numpy
python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml --policy /path/to/downloaded/policy_1.pt

Option 2: Local Training (Isaac Gym)

For full Isaac Gym setup, see setup.md.


🔁 Workflow

Train → Play → Sim2Sim → Sim2Real

  • Train: Train policy in simulation (Isaac Gym or Mujoco)
  • Play: Visualize and verify trained policy
  • Sim2Sim: Test policy in different simulators
  • Sim2Real: Deploy to physical robot

🛠️ Usage

1. Training

Mujoco Training (Colab/Local)

python legged_gym/scripts/train_mujoco.py --task=g1_mujoco

Parameters:

  • --device: cuda or cpu
  • --num_envs: Number of parallel environments (default: 512)
  • --max_iterations: Training iterations (default: 1000)
  • --resume: Resume from latest checkpoint

Training tips:

  • 50-100 iterations: Basic coordination emerges (~2-4 hours on T4)
  • 500 iterations: Stable walking policy (~20 hours, use resume)
  • 1000+ iterations: Robust, efficient walking
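The phase-based gait control used throughout boils down to a periodic clock per leg, with the two legs half a period out of phase; the reward then favors ground contact during the stance half of the cycle. A minimal illustration (the period and the stance threshold are made up, not the repo's values):

```python
def gait_phase(t, period=0.8):
    """Phase in [0, 1) for the left leg; the right leg is offset by half a period."""
    left = (t / period) % 1.0
    right = (left + 0.5) % 1.0
    return left, right

def stance_reward(phase, foot_in_contact):
    """Reward contact during stance (phase < 0.5) and no contact during swing,
    encouraging an alternating bipedal gait."""
    in_stance = phase < 0.5
    return 1.0 if foot_in_contact == in_stance else 0.0
```

Feeding sin/cos of these phases into the observation gives the policy a clock to synchronize its gait against.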

Isaac Gym Training (Local Only)

python legged_gym/scripts/train.py --task=g1

Parameters:

  • --task: Robot type (go2, g1, h1, h1_2)
  • --headless: Run without GUI (faster)
  • --resume: Resume training
  • --experiment_name, --run_name: Organize experiments
  • --num_envs: Parallel environments
  • --max_iterations: Training iterations

Models saved to: logs/<experiment_name>/<date_time>_<run_name>/model_<iteration>.pt


2. Play (Visualization in Gym)

Visualize training results:

python legged_gym/scripts/play.py --task=g1

Exports policy to logs/{experiment_name}/exported/policies/policy_1.pt for deployment.


3. Sim2Sim (Mujoco)

Test policy in Mujoco simulator:

python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml

Custom policy:

python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml --policy /path/to/policy_1.pt

Configuration: Edit deploy/deploy_mujoco/configs/g1.yaml to customize policy path, control parameters, etc.

Mujoco Simulation Results

(Demo clips: G1, H1, H1_2 in Mujoco)

4. Sim2Real (Physical Robot)

Deploy trained policy to physical robot (requires robot in debug mode):

python deploy/deploy_real/deploy_real.py {net_interface} {config_name}

Parameters:

  • net_interface: Network interface name (e.g., eth0, enp3s0)
  • config_name: Config file (e.g., g1.yaml, h1.yaml)

See Physical Deployment Guide for details.

Deployment Results

(Demo clips: G1, H1, H1_2 on the physical robots)

🎉 Acknowledgments

Built upon these excellent open-source projects: Unitree RL Gym, Isaac Gym, rsl_rl, MuJoCo (MJX), and Brax.


🔖 License

This project is licensed under the BSD 3-Clause License.

For details, please read the full LICENSE file.
