Unitree RL GYM

Reinforcement learning for Unitree robots (Go2, H1, H1_2, G1) with Isaac Gym and Mujoco support.

Now with Google Colab training support - train without a local GPU!

(Demo videos: Isaac Gym | Mujoco | Physical robot)

🆕 What's New

🚀 NEW: GPU-Accelerated Training with JAX/MJX

  • 8-10x Faster Training - GPU-accelerated physics with MuJoCo MJX on Colab T4 (FREE!)
  • 🎮 1024+ Parallel Environments - Train with 1024-8192 envs on a single GPU
  • 🔥 JAX-Powered - JIT compilation and vectorization for maximum performance
  • 📊 G1 Robot Ready - Fully ported with phase-based gait control
  • ⏱️ Train in 1.5 hours - Instead of 13+ hours with CPU
  • 🆓 Free on Colab T4 - No local GPU required
  • 📓 Ready-to-use Colab notebook - Start training in one click!

Existing Features: Mujoco-based Training

This fork also includes PyTorch/Mujoco training for Google Colab:

  • ✅ Train on Google Colab free tier (T4 GPU with PyTorch)
  • ✅ No Isaac Gym installation required for training
  • ✅ Visualize trained policies locally with Mujoco
  • ✅ Resume training across multiple Colab sessions
  • ✅ PyTorch Colab notebook also available
  • ✅ Compatible with existing deployment tools

📖 Background

Original Framework: Unitree RL Gym

This repository is based on Unitree RL Gym, an excellent framework for training locomotion policies on Unitree robots using Isaac Gym.

Original Features:

  • Isaac Gym-based training (4096 parallel environments on GPU)
  • Supports Go2, H1, H1_2, G1 robots
  • Sim2Real deployment
  • Pre-trained models

Limitations:

  • Requires local GPU with Isaac Gym
  • Isaac Gym doesn't work on Google Colab
  • Isaac Gym is legacy (superseded by Isaac Lab)

Our Modification: Unitree RL MuGym

We've extended the framework with Mujoco-based training environments that:

  • Run on Google Colab (no local GPU needed)
  • Use Mujoco physics instead of Isaac Gym
  • Support CPU training (slower but accessible)
  • Maintain compatibility with the original deployment pipeline

What's New:

  • 🚀 JAX/MJX GPU Training:
    • MJXLeggedRobot - GPU-accelerated JAX/MJX base environment (8-10x faster than the PyTorch/CPU path)
    • MJXG1Robot - G1 with phase-based gait, fully vectorized for GPU
    • train_jax_ppo.py - Brax PPO training with 2048-8192 parallel environments
    • Environment registry for easy robot instantiation
    • Complete documentation and test suite
  • PyTorch/CPU Training:
    • MujocoLeggedRobot - Base Mujoco environment class with rsl_rl compatibility
    • MujocoG1Robot - G1-specific Mujoco implementation with phase-based gait
    • train_mujoco.py - Colab-compatible training script with robust config handling
    • train_g1_mujoco_colab.ipynb - Complete Colab notebook with step-by-step instructions
    • ObservationDict - Custom dict class supporting .to(device) for rsl_rl
    • Optional Isaac Gym imports - Framework works without Isaac Gym installed
    • Fallback math functions - Pure PyTorch implementations of Isaac Gym utilities

Technical Improvements:

  • ✅ XML model loading (URDF → Mujoco XML with proper actuators)
  • ✅ Dictionary-based observations (policy/critic groups for rsl_rl)
  • ✅ Hybrid config structure (both nested and flat for rsl_rl compatibility)
  • ✅ Robust DOF detection (actuators → joints → config validation)
  • ✅ PD control mapping (automatic gain assignment from config)
  • ✅ Phase-based rewards (encouraging natural bipedal gait)
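The PD control mapping above amounts to the standard joint-space PD law, with per-joint gains read from the config. A minimal numpy sketch (function and variable names are illustrative, not the repo's exact API):

```python
import numpy as np

def pd_torques(q_des, q, qd, kp, kd):
    """Standard joint-space PD law: tau = kp * (q_des - q) - kd * qd.
    All arguments are arrays of shape (num_dofs,); kp/kd come from config."""
    return kp * (q_des - q) - kd * qd

# Example: 12-DOF robot with uniform gains (values are made up)
kp = np.full(12, 40.0)   # per-joint stiffness
kd = np.full(12, 1.0)    # per-joint damping
tau = pd_torques(np.zeros(12), np.full(12, 0.1), np.zeros(12), kp, kd)
```

Here the policy outputs the desired joint positions `q_des`, and the simulator applies the resulting torques each control step.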

What's Preserved:

  • Original Isaac Gym environments (still work if you have local GPU)
  • Deployment scripts (Sim2Sim, Sim2Real)
  • Configuration system and hyperparameters
  • Reward functions and observation spaces
  • Pre-trained models and checkpoint formats

🚀 Quick Start

⚡ Option 1: GPU-Accelerated Training with JAX/MJX on Colab (Fastest - NEW!)

8-10x faster than CPU training - FREE on Google Colab T4!

Open In Colab

  1. Open Notebook: Click badge above or upload notebooks/train_g1_jax_colab.ipynb to Colab
  2. Enable T4 GPU: Runtime > Change runtime type > T4 GPU
  3. Run All Cells: Training completes in ~1.5 hours (vs 13+ hours with CPU!)
  4. Download Model: Checkpoints saved every 50 iterations

Benefits:

  • Train in 1.5 hours instead of 13 hours (on free T4 GPU)
  • 1024 parallel environments (vs 256 with PyTorch/CPU)
  • Full G1 humanoid support with phase-based gait
  • No local GPU required - completely cloud-based
  • Compatible with existing deployment tools

Performance on Colab T4:

  • 1000 iterations: ~1.5 hours (vs ~13 hours PyTorch/CPU)
  • Speedup: 8.7x faster
  • Memory usage: ~12GB (fits comfortably on T4)

🐢 Option 2: Train on Google Colab (CPU/GPU, No JAX required)

  1. Open Colab Notebook

    • Upload notebooks/train_g1_mujoco_colab.ipynb to Google Colab
    • Or open directly: Open In Colab
  2. Select GPU Runtime

    • Go to Runtime > Change runtime type > Hardware accelerator > GPU (T4)
    • Free tier provides ~13 hours of training time
  3. Run All Cells

    • Training takes ~13 hours for 10,000 iterations (can stop earlier)
    • Checkpoints saved every 500 iterations
    • TensorBoard available for real-time monitoring
    • Models automatically downloadable
  4. Visualize Locally

    # Install on local machine
    pip install mujoco==3.2.3 torch pyyaml
    
    # Run visualization
    python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml \
        --policy /path/to/downloaded/model.pt

💻 Option 3: Train Locally with PyTorch (CPU or GPU)

# Clone repository
git clone https://github.com/julienokumu/unitree_rl_mugym.git
cd unitree_rl_mugym

# Install dependencies (Mujoco-only, no Isaac Gym)
pip install mujoco==3.2.3 scipy pyyaml tensorboard rsl-rl-lib torch
pip install -e .

# Train policy
python legged_gym/scripts/train_mujoco.py \
    --task g1_mujoco \
    --num_envs 256 \
    --max_iterations 10000 \
    --device cpu  # or 'cuda' if you have a GPU

# Visualize
python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml

📚 Documentation


🤖 Supported Robots

| Robot | Mujoco Support | Isaac Gym Support | DOF | Type      |
|-------|----------------|-------------------|-----|-----------|
| G1    | ✅ Yes          | ✅ Yes             | 12  | Humanoid  |
| H1    | 🚧 Coming Soon  | ✅ Yes             | 12  | Humanoid  |
| H1_2  | 🚧 Coming Soon  | ✅ Yes             | 12  | Humanoid  |
| Go2   | 🚧 Coming Soon  | ✅ Yes             | 12  | Quadruped |

Currently, only G1 has JAX/MJX support. Other robots can be added by following the implementation pattern.
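The "implementation pattern" relies on the environment registry mentioned under What's New: each robot class is registered under a task name, and the training scripts instantiate it by name. A minimal sketch of the pattern (names are illustrative, not the fork's actual registry API):

```python
_ENV_REGISTRY = {}

def register_env(name):
    """Decorator that records an environment class under a task name."""
    def wrap(cls):
        _ENV_REGISTRY[name] = cls
        return cls
    return wrap

def make_env(name, **kwargs):
    """Instantiate a registered environment by task name."""
    if name not in _ENV_REGISTRY:
        raise KeyError(f"Unknown task '{name}'; known: {sorted(_ENV_REGISTRY)}")
    return _ENV_REGISTRY[name](**kwargs)

@register_env("g1_mujoco")
class MujocoG1Robot:
    def __init__(self, num_envs=256):
        self.num_envs = num_envs
```

Adding a new robot then means writing the environment class and registering it under a new task name, e.g. `@register_env("go2_mujoco")`.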


📦 Installation

Option 1: Google Colab Training (No Local GPU Required)

  1. Open the Colab Notebook: train_g1_mujoco_colab.ipynb
  2. Enable GPU: Runtime > Change runtime type > GPU (T4)
  3. Run all cells - training takes ~2 hours per 50 iterations
  4. Download trained policy and visualize locally

Local visualization only (no training):

pip install mujoco==3.2.3 torch pyyaml numpy
python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml --policy /path/to/downloaded/policy_1.pt

Option 2: Local Training (Isaac Gym)

For full Isaac Gym setup, see setup.md.


🔁 Workflow

Train → Play → Sim2Sim → Sim2Real

  • Train: Train policy in simulation (Isaac Gym or Mujoco)
  • Play: Visualize and verify trained policy
  • Sim2Sim: Test policy in different simulators
  • Sim2Real: Deploy to physical robot

🛠️ Usage

1. Training

Mujoco Training (Colab/Local)

python legged_gym/scripts/train_mujoco.py --task=g1_mujoco

Parameters:

  • --device: cuda or cpu
  • --num_envs: Number of parallel environments (default: 512)
  • --max_iterations: Training iterations (default: 1000)
  • --resume: Resume from latest checkpoint

Training tips:

  • 50-100 iterations: Basic coordination emerges (~2-4 hours on T4)
  • 500 iterations: Stable walking policy (~20 hours, use resume)
  • 1000+ iterations: Robust, efficient walking
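The phase-based gait control used throughout boils down to a periodic clock per leg, with the two legs half a period out of phase; the reward then favors ground contact during the stance half of the cycle. A minimal illustration (the period and the stance threshold are made up, not the repo's values):

```python
def gait_phase(t, period=0.8):
    """Phase in [0, 1) for the left leg; the right leg is offset by half a period."""
    left = (t / period) % 1.0
    right = (left + 0.5) % 1.0
    return left, right

def stance_reward(phase, foot_in_contact):
    """Reward contact during stance (phase < 0.5) and no contact during swing,
    encouraging an alternating bipedal gait."""
    in_stance = phase < 0.5
    return 1.0 if foot_in_contact == in_stance else 0.0
```

Feeding sin/cos of these phases into the observation gives the policy a clock to synchronize its gait against.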

Isaac Gym Training (Local Only)

python legged_gym/scripts/train.py --task=g1

Parameters:

  • --task: Robot type (go2, g1, h1, h1_2)
  • --headless: Run without GUI (faster)
  • --resume: Resume training
  • --experiment_name, --run_name: Organize experiments
  • --num_envs: Parallel environments
  • --max_iterations: Training iterations

Models saved to: logs/<experiment_name>/<date_time>_<run_name>/model_<iteration>.pt


2. Play (Visualization in Gym)

Visualize training results:

python legged_gym/scripts/play.py --task=g1

Exports policy to logs/{experiment_name}/exported/policies/policy_1.pt for deployment.


3. Sim2Sim (Mujoco)

Test policy in Mujoco simulator:

python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml

Custom policy:

python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml --policy /path/to/policy_1.pt

Configuration: Edit deploy/deploy_mujoco/configs/g1.yaml to customize policy path, control parameters, etc.

Mujoco Simulation Results

(Demo clips: G1, H1, H1_2 in Mujoco)

4. Sim2Real (Physical Robot)

Deploy trained policy to physical robot (requires robot in debug mode):

python deploy/deploy_real/deploy_real.py {net_interface} {config_name}

Parameters:

  • net_interface: Network interface name (e.g., eth0, enp3s0)
  • config_name: Config file (e.g., g1.yaml, h1.yaml)

See Physical Deployment Guide for details.

Deployment Results

(Demo clips: G1, H1, H1_2 on the physical robots)

🎉 Acknowledgments

Built upon these excellent open-source projects: Unitree RL Gym, Isaac Gym, rsl_rl, MuJoCo (MJX), and Brax.


🔖 License

This project is licensed under the BSD 3-Clause License.

For details, please read the full LICENSE file.
