AndrePatri/AugMPC


[figure: AugMPC logo]

Reinforcement Learning-Augmented Model Predictive Control at scale for legged and hybrid robots. Part of the IBRIDO project.

Architecture

[figure: approach overview]

  • Hierarchical RL-MPC coupling – The RL agent chooses contact schedules and twist commands for the underlying MPC controllers. A new flight phase is injected for each limb whenever the corresponding action instantaneously exceeds a given threshold (see the sketch below this list).

  • Sample-efficient learning at scale – AugMPC achieves data-efficient training through high-throughput experience generation, enabled by aggressive MPC parallelization and fully vectorized simulation. On a workstation equipped with an AMD Ryzen Threadripper 7970, 128 GiB of RAM, and an NVIDIA RTX 4090, the system sustains a 50+× real-time factor while running 800 parallel environments, each paired with a full rigid-body MPC instance, at 20 Hz with a ~1 s MPC horizon, even for high-DoF robots such as Centauro (nv = 43). Training with Soft Actor–Critic (SAC) and MPCs in the loop typically converges within 1–10 × 10⁶ environment steps (≈ 6 h of wall-clock time, corresponding to 9–29 simulated days). This contrasts with the > 100 × 10⁶ steps commonly required by blind end-to-end RL locomotion policies.

[figure: rewards]

  • Domain adaptability – Thanks to the MPC layer's robustness, AugMPC achieves zero-shot sim-to-sim and sim-to-real transfer without any domain randomization (no contact-property, inertial, or timing randomization).

  • Robot adaptability – Validated on robots with different morphologies and weight distributions (30–120 kg), on standard legged and hybrid locomotion tasks.

[figure: sim-to-sim, sim-to-real]

  • Non-gaited contact scheduling – The architecture can generate fully acyclic gaits and timing adaptations, as shown in the demos below:

[demos: sim-to-sim, sim-to-real]
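The flight-phase trigger in the first bullet reduces to a simple per-limb thresholding of the agent's actions. Below is a minimal sketch of that rule; the array shapes, threshold value, and function name are illustrative assumptions, not the repository's API.

# Minimal, illustrative sketch of the flight-phase trigger described above.
# Shapes, the threshold value, and all names are assumptions, not AugMPC's API.
import numpy as np

N_ENVS = 800            # parallel environments, as quoted in the text
N_LIMBS = 4             # e.g. a quadruped; Centauro would differ
FLIGHT_THRESHOLD = 0.5  # hypothetical per-limb trigger level

def flight_phase_mask(actions: np.ndarray,
                      threshold: float = FLIGHT_THRESHOLD) -> np.ndarray:
    """Return a boolean mask of shape (n_envs, n_limbs): True where a limb's
    action instantaneously exceeds the threshold, i.e. where a new flight
    phase should be injected into that limb's contact schedule."""
    return actions > threshold

actions = np.random.uniform(-1.0, 1.0, size=(N_ENVS, N_LIMBS))
mask = flight_phase_mask(actions)
# The mask would then be translated into contact-timing updates over the
# receding horizon of each environment's MPC instance.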

Software

[figure: software overview]

Shared-memory-first design: AugMPC relies on a shared-memory layer built on top of EigenIPC for deterministic, real-time-safe communication between simulators, controllers, and learning processes.
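To illustrate the pattern (not EigenIPC's actual API), the sketch below uses Python's standard multiprocessing.shared_memory as a stand-in: producer and consumer map the same named buffer, so state and command exchange avoids copies and serialization. The buffer name and layout are made up.

# Stand-in illustration of the shared-memory pattern, using the stdlib's
# multiprocessing.shared_memory rather than EigenIPC's real API.
import numpy as np
from multiprocessing import shared_memory

STATE_DIM = 43  # e.g. nv for Centauro, as quoted above

# Writer side (e.g. the world interface publishing robot state)
shm = shared_memory.SharedMemory(create=True, size=STATE_DIM * 8,
                                 name="robot_state")
state = np.ndarray((STATE_DIM,), dtype=np.float64, buffer=shm.buf)
state[:] = 0.0  # publish a state by writing in place

# Reader side (e.g. an MPC controller) attaches to the same buffer by name
shm_reader = shared_memory.SharedMemory(name="robot_state")
view = np.ndarray((STATE_DIM,), dtype=np.float64, buffer=shm_reader.buf)
print(view[:3])  # controllers read the latest state without copies

del state, view          # drop array views before releasing the buffers
shm_reader.close()
shm.close()
shm.unlink()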

AugMPC essentially consists of three main components:

  1. World interface – Implements AugMPCWorldInterfaceBase. It connects to Isaac Sim, xbot2, or hardware, publishes robot states, and triggers MPCHive controllers via shared memory. Optional remote stepping lets the training loop decide when the simulator should advance.
  2. MPC cluster – Uses MPCHive's ControlClusterServer/Client to spawn multiple receding-horizon controllers (see aug_mpc.controllers). Each controller reads robot states, solves its MPC problem and writes predictions and commands back to shared memory.
  3. Training environment + RL algorithm – An AugMPCTrainingEnvBase derivative defines the MDP at hand (observations, actions, rewards, terminations, truncations), which is then used by the training executable (SAC is the default; PPO is also supported).

Specific implementations of world interfaces and training environments are available at AugMPCEnvs.
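As a rough illustration of component 3, the sketch below defines a toy MDP assuming an AugMPCTrainingEnvBase-like interface; the method names, state layout, and reward terms are hypothetical, not the actual base-class API.

# Hypothetical environment sketch; method names and the state layout are
# assumptions, not the real AugMPCTrainingEnvBase interface.
import numpy as np

class WalkForwardEnv:  # would derive from AugMPCTrainingEnvBase in practice
    """Sketch of the MDP definition: observations, rewards, terminations,
    and truncations for a simple forward-walking task."""

    def __init__(self, n_envs: int = 800, episode_len: int = 1000):
        self.n_envs = n_envs
        self.episode_len = episode_len
        self.t = np.zeros(n_envs, dtype=np.int64)  # per-env step counters

    def get_observations(self, robot_state: np.ndarray) -> np.ndarray:
        # e.g. base twist, contact states, and MPC predictions read from
        # shared memory; here the raw state is passed through
        return robot_state

    def compute_rewards(self, robot_state: np.ndarray,
                        actions: np.ndarray) -> np.ndarray:
        forward_vel = robot_state[:, 0]          # hypothetical state layout
        effort = np.square(actions).sum(axis=1)  # action-magnitude penalty
        return forward_vel - 0.01 * effort

    def terminations(self, robot_state: np.ndarray) -> np.ndarray:
        return robot_state[:, 2] < 0.2           # e.g. base height too low

    def truncations(self) -> np.ndarray:
        self.t += 1                              # advance per-env counters
        return self.t >= self.episode_len        # truncate at the time limit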

Repository layout

aug_mpc/
├── training_envs/     # Base classes and wrappers for AugMPCEnvs environments
├── world_interfaces/  # Base world interface that AugMPCEnvs extends
├── controllers/       # LRHC/MPCHive clients and Horizon-based MPC implementations
├── agents/            # Neural network policies (PPO, SAC, dummy)
├── training_algs/     # PPO/SAC trainers, rollout logic, persistence
├── scripts/           # Launchers for clusters, world interfaces, and training loops
└── utils/             # Shared-memory helpers, visualization bridges, teleop, math/utils

Installation

The preferred way to install AugMPC is through ibrido-containers, which ships with all necessary dependencies.

Extending AugMPC

  1. New controller
  2. New world interface (see the skeleton below)
  3. New training environment
  4. New agent/algorithm
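For extension point 2, a hedged skeleton of a new world interface is sketched below. The hook names are guesses based on the component description above, not AugMPCWorldInterfaceBase's real API.

# Hedged skeleton for a new world interface; hook names are assumptions.
class MySimWorldInterface:  # would derive from AugMPCWorldInterfaceBase
    """Bridges a new simulator (or hardware) to AugMPC's shared-memory layer."""

    def connect(self) -> None:
        # open the simulator session or hardware drivers
        ...

    def publish_robot_state(self) -> None:
        # write joint states, base pose/twist, and contacts to shared memory
        ...

    def trigger_controllers(self) -> None:
        # signal the MPCHive cluster that fresh state is available
        ...

    def step(self) -> None:
        # advance the world; with remote stepping, the training loop decides
        # when this is called
        ...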
