
Eigenbot Reinforcement Learning Repo

This repository documents the Eigenbot team's reinforcement learning (RL) work using legged_gym, a legged-robot training framework built on NVIDIA's Isaac Gym.

It includes environments, robot models, and configurations tailored for training and testing legged robots.

Learning Resources

Repository Structure

This repo, referred to internally as bio_eigen, consists of four main folders:

  • ${\color{green}eigenbot/}$

  • ${\color{green}isaacgym/}$

  • ${\color{green}legged\_gym/}$

    Core training and inference framework.

    📍Most work happens here: ${\color{green}legged\_gym/legged\_gym/envs/base/legged\_robot.py}$

  • ${\color{green}rsl\_rl/}$ RL algorithm implementations, including PPO and other on-policy/off-policy methods.
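
For orientation, a simplified view of the layout, showing only the folders and files discussed in this README (other contents omitted):

```
bio_eigen/
├── eigenbot/
├── isaacgym/
├── legged_gym/
│   └── legged_gym/
│       ├── envs/
│       │   └── base/
│       │       ├── legged_robot.py
│       │       └── legged_robot_config.py
│       └── utils/
│           └── terrain.py
└── rsl_rl/
```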

${\color{green}legged\_gym/legged\_gym/envs/base/legged\_robot.py}$

  • Defines all core environment functions:

    • ${\color{green}step()}$ -> advances simulation by one step
    • ${\color{green}reset()}$ -> resets environment/robot state
  • Contains all reward functions at the bottom of the file, following the naming pattern:

    def _reward_<reward_name>(self)

    Rewards must be defined in this format. After defining a new reward, its corresponding scale must be added to ${\color{green}legged\_robot\_config.py}$ so it contributes to the total reward (see the sketch below).

    • Example: ${\color{green}\_reward\_tracking\_lin\_vel()}$
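
A minimal sketch of this workflow, using a hypothetical ${\color{green}\_reward\_joint\_vel\_l1()}$ reward and an illustrative scale value (neither is part of the repo):

```python
# Hypothetical reward: penalize the L1 norm of joint velocities.
# It would be added at the bottom of legged_robot.py alongside the other
# _reward_* methods (torch is already imported in that file).
def _reward_joint_vel_l1(self):
    # self.dof_vel: (num_envs, num_dof) tensor of joint velocities
    return torch.sum(torch.abs(self.dof_vel), dim=1)

# Matching entry in legged_robot_config.py under the reward scales
# (a negative weight makes it a penalty; the value is illustrative):
#
#     class rewards:
#         class scales:
#             joint_vel_l1 = -0.001
```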

${\color{green}legged\_gym/legged\_gym/envs/base/legged\_robot\_config.py}$

  • Centralized configuration file for:
    • Environment setup
    • Terrain generation
    • Reward weights
    • Network architecture
    • Command sampling
    • Initial states
    • Control & assets
    • Domain randomization
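
A hypothetical excerpt showing how these groups are typically laid out as nested classes; the class and field names follow the legged_gym convention, but the specific fields and values are illustrative, not the repo's defaults:

```python
# Illustrative excerpt of a legged_robot_config.py-style config; values are examples.
class LeggedRobotCfg:
    class env:
        num_envs = 4096            # number of parallel simulated environments
        num_observations = 48      # size of the policy observation vector
        num_actions = 12           # number of actuated joints

    class terrain:
        mesh_type = 'trimesh'      # 'plane', 'heightfield', or 'trimesh'
        curriculum = True          # increase terrain difficulty over training

    class commands:
        class ranges:
            lin_vel_x = [-1.0, 1.0]   # sampled forward-velocity commands [m/s]

    class rewards:
        class scales:
            tracking_lin_vel = 1.0    # weight on the velocity-tracking reward
            torques = -0.00001        # small penalty on actuator torques

    class domain_rand:
        randomize_friction = True
        friction_range = [0.5, 1.25]
```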

Terrain Configuration

This module defines how terrains are generated, selected, and managed for the legged robot environments.

${\color{green}legged\_gym/legged\_gym/utils/terrain.py}$

Key Parts

  • ${\color{green}Terrain}$ class
    • Initializes the terrain grid for multiple robots ${\color{green}(num\_rows \times num\_cols)}$
    • Supports different terrain generation modes:
      • ${\color{green}randomized\_terrain()}$ -> randomly generates terrain pieces
      • ${\color{green}curriculum()}$ -> terrains increase in difficulty row by row
      • ${\color{green}selected\_terrain()}$ -> uses a manually chosen terrain type
    • Stores height maps (${\color{green}height\_field\_raw}$) and origins for each sub-terrain.
  • ${\color{green}make\_terrain(choice,\ difficulty)}$:
    • Builds different terrain types (slopes, stairs, discrete obstacles, stepping stones, gaps, pits, ...) based on proportions + difficulty
  • ${\color{green}add\_terrain\_to\_map()}$:
    • Places generated sub-terrains into the global map
    • Sets each environment's origin (x, y, z)
  • Helper Functions:
    • ${\color{green}gap\_terrain()}$ -> creates a gap in the map
    • ${\color{green}pit\_terrain()}$ -> creates a pit with depth
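
A simplified, hypothetical sketch in the spirit of these helpers; the repo's ${\color{green}pit\_terrain()}$ works on a sub-terrain's ${\color{green}height\_field\_raw}$, and the function and parameter names below are illustrative:

```python
import numpy as np

# Hypothetical, simplified pit helper operating directly on a height-field array.
def make_pit(height_field: np.ndarray, depth: int, size: int) -> np.ndarray:
    """Carve a square pit `depth` height units deep into the center of a height field."""
    rows, cols = height_field.shape
    r0, c0 = rows // 2 - size // 2, cols // 2 - size // 2
    height_field[r0:r0 + size, c0:c0 + size] -= depth
    return height_field

# Example: an 80x80 flat height field with a 20-cell-wide pit, 50 height units deep.
field = make_pit(np.zeros((80, 80), dtype=np.int16), depth=50, size=20)
```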

Running

| Argument | Type | Default | Description |
|---|---|---|---|
| `--task` | str | `"anymal_c_flat"` | Name of the task/environment. Overrides config file if provided. |
| `--resume` | flag | False | Resume training from a checkpoint. |
| `--experiment_name` | str | None | Name of the experiment to run or load. Overrides config file. |
| `--run_name` | str | `"new"` | Name of the run (to distinguish runs within the same experiment). Overrides config file. |
| `--expt_id` | str | `"00-001"` | Experiment ID tag (useful for structured naming). Overrides config file. |
| `--load_run` | str | -1 | Run directory to load when `--resume` is set. If -1, loads the last run. |
| `--checkpoint` | int | -1 | Model checkpoint to load. If -1, loads the latest checkpoint. |
| `--headless` | flag | False | Run simulation without GUI (offscreen/headless mode). |
| `--horovod` | flag | False | Enable Horovod for distributed (multi-GPU) training. |
| `--rl_device` | str | `"cuda:0"` | Device used by the RL algorithm (`cpu`, `cuda:0`, etc.). |
| `--num_envs` | int | Config default | Number of environments to create. Overrides config file. |
| `--seed` | int | Config default | Random seed for reproducibility. Overrides config file. |
| `--max_iterations` | int | Config default | Maximum number of training iterations. Overrides config file. |
| `--show_heading` | flag | False | Visualize the robot's heading direction in the viewer. |
| `--rough_terrain` | flag | False | Enable rough terrain (instead of flat ground). |
| `--debug` | flag | False | Disable Weights & Biases (wandb) logging (debug mode). |
| `--no_wandb` | flag | False | Run without wandb logging entirely. |

Example usage:

  • Train eigenbot on flat terrain with 4096 envs:
python train.py --task eigenbot_flat --num_envs 4096 --experiment_name locomotion_flat
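
  • Resume the latest checkpoint of a rough-terrain run, headless and without wandb logging (an illustrative flag combination from the table above, not a prescribed workflow):
python train.py --task eigenbot_flat --rough_terrain --resume --headless --no_wandb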

Branch Overview

⚠️ This main branch is now deprecated; see the encoder_branch and depth_encoder_branch instead.

Encoder Branch

  • Introduces encoder modules to process observation states.

  • Includes variants such as:

    1. History Encoder → encodes past states for temporal context.

    2. Privileged Encoder → leverages extra simulation-only information during training.

  • How to follow the flow:

    1. Observation states are defined in ${\color{green}legged\_gym/envs/base/legged\_robot.py}$.

    2. Passed into encoder modules.

    3. Integrated into training via ${\color{green}rsl\_rl}$ (${\color{green}on\_policy\_runner.py}$, ${\color{green}ppo.py}$, ${\color{green}vec\_env.py}$, and ${\color{green}actor\_critic.py}$).
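
A minimal sketch of a history encoder of the kind described above, assuming a hypothetical observation size, history length, and layer widths (none of these are taken from the branch):

```python
import torch
import torch.nn as nn

class HistoryEncoder(nn.Module):
    """Encodes a flattened stack of past observations into a compact latent vector."""
    def __init__(self, obs_dim: int = 48, history_len: int = 10, latent_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim * history_len, 128), nn.ELU(),
            nn.Linear(128, 64), nn.ELU(),
            nn.Linear(64, latent_dim),
        )

    def forward(self, obs_history: torch.Tensor) -> torch.Tensor:
        # obs_history: (num_envs, history_len * obs_dim) flattened past observations
        return self.net(obs_history)

# The latent would be appended to the current observation before the
# actor-critic forward pass inside rsl_rl (e.g. in actor_critic.py):
encoder = HistoryEncoder()
latent = encoder(torch.zeros(4096, 10 * 48))   # -> (4096, 16)
```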

Depth-Encoder Branch

  • Extends the encoder framework by adding a Depth Encoder Module.

  • Depth Encoder:

    • Processes simulated depth maps (from sensors or rendered views).

    • Outputs latent features concatenated with standard encoders.

  • Integrated seamlessly into the PPO pipeline in ${\color{green}rsl\_rl}$.
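
A minimal sketch of a depth-encoder module in that spirit; the 64x64 depth-map resolution, layer sizes, and latent dimension are assumptions, not the branch's actual architecture:

```python
import torch
import torch.nn as nn

class DepthEncoder(nn.Module):
    """Compresses a simulated depth map into a latent feature vector."""
    def __init__(self, latent_dim: int = 32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ELU(),   # 64x64 -> 30x30
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ELU(),  # 30x30 -> 14x14
            nn.Flatten(),
        )
        self.fc = nn.Linear(32 * 14 * 14, latent_dim)

    def forward(self, depth: torch.Tensor) -> torch.Tensor:
        # depth: (num_envs, 1, 64, 64) simulated depth maps
        return self.fc(self.conv(depth))

# The depth latent is concatenated with the proprioceptive observations
# (and any history/privileged latents) before the actor-critic forward pass:
encoder = DepthEncoder()
depth_latent = encoder(torch.zeros(4, 1, 64, 64))
proprio_obs = torch.zeros(4, 48)
policy_input = torch.cat([proprio_obs, depth_latent], dim=-1)  # -> (4, 80)
```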

👉 These two branches build upon the legged_robot observation space and connect into rsl_rl training pipelines, but add new ways of representing or enriching observations.
