🌎English | 🇨🇳中文
Reinforcement learning for Unitree robots (Go2, H1, H1_2, G1) with Isaac Gym and Mujoco support.
✨ Now with Google Colab training support - train without a local GPU! ✨
- ⚡ 8-10x Faster Training - GPU-accelerated physics with MuJoCo MJX on Colab T4 (FREE!)
- 🎮 1024+ Parallel Environments - Train with 1024-8192 envs on a single GPU
- 🔥 JAX-Powered - JIT compilation and vectorization for maximum performance
- 📊 G1 Robot Ready - Fully ported with phase-based gait control
- ⏱️ Train in 1.5 hours - Instead of 13+ hours with CPU
- 🆓 Free on Colab T4 - No local GPU required
- 📓 Ready-to-use Colab notebook - Start training in one click!
This fork also includes PyTorch/Mujoco training for Google Colab:
- ✅ Train on Google Colab free tier (T4 GPU with PyTorch)
- ✅ No Isaac Gym installation required for training
- ✅ Visualize trained policies locally with Mujoco
- ✅ Resume training across multiple Colab sessions
- ✅ PyTorch Colab notebook also available
- ✅ Compatible with existing deployment tools
This repository is based on Unitree RL Gym, an excellent framework for training locomotion policies on Unitree robots using Isaac Gym.
Original Features:
- Isaac Gym-based training (4096 parallel environments on GPU)
- Supports Go2, H1, H1_2, G1 robots
- Sim2Real deployment
- Pre-trained models
Limitations:
- Requires local GPU with Isaac Gym
- Isaac Gym doesn't work on Google Colab
- Isaac Gym is legacy (superseded by Isaac Lab)
We've extended the framework with Mujoco-based training environments that:
- Run on Google Colab (no local GPU needed)
- Use Mujoco physics instead of Isaac Gym
- Support CPU training (slower but accessible)
- Maintain compatibility with the original deployment pipeline
What's New:
- 🚀 JAX/MJX GPU Training:
  - `MJXLeggedRobot` - GPU-accelerated JAX/MJX base environment (10-100x faster)
  - `MJXG1Robot` - G1 with phase-based gait, fully vectorized for GPU
  - `train_jax_ppo.py` - Brax PPO training with 2048-8192 parallel environments
  - Environment registry for easy robot instantiation
- Complete documentation and test suite
- PyTorch/CPU Training:
  - `MujocoLeggedRobot` - Base Mujoco environment class with rsl_rl compatibility
  - `MujocoG1Robot` - G1-specific Mujoco implementation with phase-based gait
  - `train_mujoco.py` - Colab-compatible training script with robust config handling
  - `train_g1_mujoco_colab.ipynb` - Complete Colab notebook with step-by-step instructions
  - `ObservationDict` - Custom dict class supporting `.to(device)` for rsl_rl
  - Optional Isaac Gym imports - framework works without Isaac Gym installed
  - Fallback math functions - Pure PyTorch implementations of Isaac Gym utilities
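As an illustration of what such a fallback can look like, here is a minimal pure-Python sketch of Isaac Gym's `quat_rotate_inverse` (the repo's actual versions presumably operate on batched torch tensors; the `(x, y, z, w)` quaternion layout follows Isaac Gym's convention):

```python
def quat_rotate_inverse(q, v):
    """Rotate vector v by the inverse of unit quaternion q (x, y, z, w layout).

    Scalar sketch of the batched Isaac Gym utility, using the identity
    v' = v*(2w^2 - 1) - 2w*(q_vec x v) + 2*q_vec*(q_vec . v).
    """
    qx, qy, qz, qw = q
    # cross product q_vec x v
    cx = qy * v[2] - qz * v[1]
    cy = qz * v[0] - qx * v[2]
    cz = qx * v[1] - qy * v[0]
    dot = qx * v[0] + qy * v[1] + qz * v[2]
    s = 2.0 * qw * qw - 1.0
    return [
        v[0] * s - 2.0 * qw * cx + 2.0 * qx * dot,
        v[1] * s - 2.0 * qw * cy + 2.0 * qy * dot,
        v[2] * s - 2.0 * qw * cz + 2.0 * qz * dot,
    ]

# Identity quaternion leaves the vector unchanged.
print(quat_rotate_inverse([0.0, 0.0, 0.0, 1.0], [1.0, 2.0, 3.0]))  # [1.0, 2.0, 3.0]
```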
Technical Improvements:
- ✅ XML model loading (URDF → Mujoco XML with proper actuators)
- ✅ Dictionary-based observations (policy/critic groups for rsl_rl)
- ✅ Hybrid config structure (both nested and flat for rsl_rl compatibility)
- ✅ Robust DOF detection (actuators → joints → config validation)
- ✅ PD control mapping (automatic gain assignment from config)
- ✅ Phase-based rewards (encouraging natural bipedal gait)
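To make the last point concrete, here is a hedged sketch of phase-based contact rewarding (the period, stance fraction, and reward shape are illustrative, not the repo's exact values): a gait phase advances cyclically, the two legs are offset by half a cycle, and foot contact is rewarded only when it agrees with that leg's stance window.

```python
def gait_phase(t, period=0.8):
    """Cyclic phase in [0, 1) from episode time; the 0.8 s period is illustrative."""
    return (t % period) / period

def phase_contact_reward(t, left_contact, right_contact, period=0.8):
    """Reward foot contacts that agree with each leg's stance window."""
    phi_left = gait_phase(t, period)
    phi_right = (phi_left + 0.5) % 1.0   # legs half a cycle apart
    reward = 0.0
    for phi, contact in ((phi_left, left_contact), (phi_right, right_contact)):
        in_stance = phi < 0.5            # first half of the cycle = stance
        reward += 1.0 if in_stance == contact else 0.0
    return reward

# At t = 0 the left leg should be in stance and the right leg in swing:
print(phase_contact_reward(0.0, left_contact=True, right_contact=False))  # 2.0
```

Rewarding agreement with an alternating phase like this is what nudges the policy toward a natural left-right bipedal gait instead of hopping or shuffling.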
What's Preserved:
- Original Isaac Gym environments (still work if you have local GPU)
- Deployment scripts (Sim2Sim, Sim2Real)
- Configuration system and hyperparameters
- Reward functions and observation spaces
- Pre-trained models and checkpoint formats
8-10x faster than CPU training - FREE on Google Colab T4!
- Open Notebook: Click the badge above or upload `notebooks/train_g1_jax_colab.ipynb` to Colab
- Enable T4 GPU: Runtime > Change runtime type > T4 GPU
- Run All Cells: Training completes in ~1.5 hours (vs 13+ hours with CPU!)
- Download Model: Checkpoints saved every 50 iterations
Quick Links:
- 📓 Colab Notebook - Start here!
- 📖 Full JAX/MJX Documentation
- 🚀 Quickstart Guide
Benefits:
- Train in 1.5 hours instead of 13 hours (on free T4 GPU)
- 1024 parallel environments (vs 256 with PyTorch/CPU)
- Full G1 humanoid support with phase-based gait
- No local GPU required - completely cloud-based
- Compatible with existing deployment tools
Performance on Colab T4:
- 1000 iterations: ~1.5 hours (vs ~13 hours PyTorch/CPU)
- Speedup: 8.7x faster
- Memory usage: ~12GB (fits comfortably on T4)
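The quoted speedup follows directly from those wall-clock numbers:

```python
# 1000 iterations: ~13 h on PyTorch/CPU vs ~1.5 h on the Colab T4 with JAX/MJX.
cpu_hours, t4_hours = 13.0, 1.5
speedup = cpu_hours / t4_hours
print(f"{speedup:.1f}x")  # 8.7x
```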
1. Open Colab Notebook
   - Upload `notebooks/train_g1_mujoco_colab.ipynb` to Google Colab, or open it directly
2. Select GPU Runtime
   - Go to Runtime > Change runtime type > Hardware accelerator > GPU (T4)
   - Free tier provides ~13 hours of training time
3. Run All Cells
   - Training takes ~13 hours for 10,000 iterations (can stop earlier)
   - Checkpoints saved every 500 iterations
   - TensorBoard available for real-time monitoring
   - Models automatically downloadable
4. Visualize Locally
```bash
# Install on local machine
pip install mujoco==3.2.3 torch pyyaml

# Run visualization
python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml \
    --policy /path/to/downloaded/model.pt
```
```bash
# Clone repository
git clone https://github.com/julienokumu/unitree_rl_mugym.git
cd unitree_rl_mugym

# Install dependencies (Mujoco-only, no Isaac Gym)
pip install mujoco==3.2.3 scipy pyyaml tensorboard rsl-rl-lib torch
pip install -e .

# Train policy
python legged_gym/scripts/train_mujoco.py \
    --task g1_mujoco \
    --num_envs 256 \
    --max_iterations 10000 \
    --device cpu  # or 'cuda' if you have a GPU

# Visualize
python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml
```
- COLAB_TRAINING.md - Complete guide for Colab training workflow
- README_ORIGINAL.md - Original Unitree RL Gym documentation
- Notebooks - Jupyter notebooks for training
| Robot | Mujoco Support | Isaac Gym Support | DOF | Type |
|---|---|---|---|---|
| G1 | ✅ Yes | ✅ Yes | 12 | Humanoid |
| H1 | 🚧 Coming Soon | ✅ Yes | 12 | Humanoid |
| H1_2 | 🚧 Coming Soon | ✅ Yes | 12 | Humanoid |
| Go2 | 🚧 Coming Soon | ✅ Yes | 12 | Quadruped |
Currently, only G1 has JAX/MJX support. Other robots can be added by following the implementation pattern.
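The registry pattern referred to above can be sketched roughly as follows (names here are hypothetical, not the repo's actual API): a decorator maps a task name to an environment class, and the training script instantiates by name.

```python
# Hypothetical mirror of the environment-registry pattern: task name -> class.
_ENV_REGISTRY = {}

def register_env(name):
    """Class decorator that records an environment under a task name."""
    def decorator(cls):
        _ENV_REGISTRY[name] = cls
        return cls
    return decorator

def make_env(name, **kwargs):
    """Instantiate a registered environment; fail loudly on unknown tasks."""
    if name not in _ENV_REGISTRY:
        raise KeyError(f"Unknown task '{name}'. Registered: {sorted(_ENV_REGISTRY)}")
    return _ENV_REGISTRY[name](**kwargs)

@register_env("g1_mujoco")
class MujocoG1Robot:          # stand-in for the real environment class
    def __init__(self, num_envs=256):
        self.num_envs = num_envs

env = make_env("g1_mujoco", num_envs=8)
print(type(env).__name__, env.num_envs)  # MujocoG1Robot 8
```

With this shape, porting a new robot is mostly a matter of subclassing the base environment and adding one `@register_env("h1_mujoco")`-style decoration.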
- Open the Colab Notebook: train_g1_mujoco_colab.ipynb
- Enable GPU:
Runtime > Change runtime type > GPU (T4) - Run all cells - training takes ~2 hours per 50 iterations
- Download trained policy and visualize locally
Local visualization only (no training):
```bash
pip install mujoco==3.2.3 torch pyyaml numpy
python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml --policy /path/to/downloaded/policy_1.pt
```
For full Isaac Gym setup, see setup.md.
Train → Play → Sim2Sim → Sim2Real
- Train: Train policy in simulation (Isaac Gym or Mujoco)
- Play: Visualize and verify trained policy
- Sim2Sim: Test policy in different simulators
- Sim2Real: Deploy to physical robot
```bash
python legged_gym/scripts/train_mujoco.py --task=g1_mujoco
```
Parameters:
- `--device`: `cuda` or `cpu`
- `--num_envs`: Number of parallel environments (default: 512)
- `--max_iterations`: Training iterations (default: 1000)
- `--resume`: Resume from latest checkpoint
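A minimal `argparse` setup mirroring the documented flags (a sketch of the interface, not the script's actual parser):

```python
import argparse

def build_parser():
    # Flags mirror the documented train_mujoco.py options and defaults.
    p = argparse.ArgumentParser(description="Mujoco PPO training (sketch)")
    p.add_argument("--task", default="g1_mujoco")
    p.add_argument("--device", choices=["cpu", "cuda"], default="cpu")
    p.add_argument("--num_envs", type=int, default=512)
    p.add_argument("--max_iterations", type=int, default=1000)
    p.add_argument("--resume", action="store_true",
                   help="resume from the latest checkpoint")
    return p

args = build_parser().parse_args(["--device", "cpu", "--num_envs", "128", "--resume"])
print(args.num_envs, args.resume)  # 128 True
```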
Training tips:
- 50-100 iterations: Basic coordination emerges (~2-4 hours on T4)
- 500 iterations: Stable walking policy (~20 hours, use resume)
- 1000+ iterations: Robust, efficient walking
```bash
python legged_gym/scripts/train.py --task=g1
```
Parameters:
- `--task`: Robot type (go2, g1, h1, h1_2)
- `--headless`: Run without GUI (faster)
- `--resume`: Resume training
- `--experiment_name`, `--run_name`: Organize experiments
- `--num_envs`: Parallel environments
- `--max_iterations`: Training iterations
Models saved to: logs/<experiment_name>/<date_time>_<run_name>/model_<iteration>.pt
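That naming scheme can be reproduced with a small helper (the timestamp format here is illustrative, not necessarily the one the trainer uses):

```python
from datetime import datetime

def checkpoint_path(experiment_name, run_name, iteration, stamp=None):
    # Mirrors the documented logs/<experiment_name>/<date_time>_<run_name>/ layout.
    stamp = stamp or datetime.now().strftime("%b%d_%H-%M-%S")
    return f"logs/{experiment_name}/{stamp}_{run_name}/model_{iteration}.pt"

print(checkpoint_path("g1", "baseline", 500, stamp="Jan01_12-00-00"))
# logs/g1/Jan01_12-00-00_baseline/model_500.pt
```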
Visualize training results:
```bash
python legged_gym/scripts/play.py --task=g1
```
Exports policy to `logs/{experiment_name}/exported/policies/policy_1.pt` for deployment.
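The exported `policy_1.pt` is a standalone TorchScript file, so deployment scripts can reload it without any of the training code. A self-contained sketch of that round trip (the 47-dim observation / 12-dim action sizes and the two-layer network are illustrative stand-ins, not the repo's exact architecture):

```python
import torch

# Stand-in for a trained actor network; the export step presumably
# serializes the real one with torch.jit.
policy = torch.nn.Sequential(
    torch.nn.Linear(47, 256), torch.nn.ELU(), torch.nn.Linear(256, 12)
)
torch.jit.script(policy).save("policy_1.pt")

# This is all a deployment script needs in order to run inference.
loaded = torch.jit.load("policy_1.pt")
with torch.no_grad():
    actions = loaded(torch.zeros(1, 47))
print(tuple(actions.shape))  # (1, 12)
```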
Test policy in Mujoco simulator:
```bash
python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml
```
Custom policy:
```bash
python deploy/deploy_mujoco/deploy_mujoco.py g1.yaml --policy /path/to/policy_1.pt
```
Configuration: Edit `deploy/deploy_mujoco/configs/g1.yaml` to customize policy path, control parameters, etc.
(Sim2Sim demo clips for G1, H1, and H1_2 not reproduced here.)
Deploy trained policy to physical robot (requires robot in debug mode):
```bash
python deploy/deploy_real/deploy_real.py {net_interface} {config_name}
```
Parameters:
- `net_interface`: Network interface name (e.g., `eth0`, `enp3s0`)
- `config_name`: Config file (e.g., `g1.yaml`, `h1.yaml`)
See Physical Deployment Guide for details.
(Sim2Real demo clips for G1, H1, and H1_2 not reproduced here.)
Built upon these excellent open-source projects:
- legged_gym: Training framework
- rsl_rl: RL algorithms
- mujoco: Physics simulation
- unitree_sdk2_python: Hardware interface
- unitree_rl_gym: Original repository
This project is licensed under the BSD 3-Clause License.
For details, please read the full LICENSE file.