@Basiljamal1
Real-Time Action Chunking (RTC) — Refactor

This refactor makes Real-Time Action Chunking fully model-agnostic by replacing inheritance with composition + dependency injection.

It splits responsibilities cleanly:

  • RTCPolicy — policy-level wrapper (scheduling, action queue, swapping, delay buffer)
  • RTCPolicyFlowModel — flow-level wrapper (background thread, ΠGDM guided inpainting, queues)
  • Base models: unchanged (e.g., SmolVLAPolicy as the inference model, VLAFlowMatching as the flow model)

Queues & threading now live only in RTCPolicyFlowModel; neither the base policy nor the base flow model knows anything about RTC.


Why this refactor?

  • Model-agnostic: works with SmolVLA today, and with π0/π0.5 (or others) by swapping the flow model instance.
  • Cleaner layering: policy concerns vs. flow-generation concerns are separated.
  • No monkey-patching: base models remain untouched; RTC composes around them.
  • Backwards-friendly surface: policy.model still exposes the tokenizer and friends via attribute proxying, so code like
    policy.model.vlm_with_expert.processor.tokenizer continues to work.
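
The attribute proxying in the last bullet can be sketched with `__getattr__`. This is a minimal illustration under stated assumptions, not the code in this PR: `FlowModelProxy` and the `SimpleNamespace` stand-in for the base flow model are hypothetical names used only for the example.

```python
from types import SimpleNamespace


class FlowModelProxy:
    """Hypothetical wrapper: composes around a flow model and forwards unknown attributes."""

    def __init__(self, flow_model):
        # Dependency injection: the wrapped model is passed in, not subclassed.
        self.flow_model = flow_model

    def __getattr__(self, name):
        # Python calls this only when normal lookup fails, so attributes
        # defined on the wrapper itself always take precedence.
        if name == "flow_model":
            raise AttributeError(name)  # guard against recursion before __init__ runs
        return getattr(self.flow_model, name)


# Stand-in for a base flow model that exposes vlm_with_expert.processor.tokenizer.
base = SimpleNamespace(
    vlm_with_expert=SimpleNamespace(processor=SimpleNamespace(tokenizer="tok"))
)
wrapped = FlowModelProxy(base)
tokenizer = wrapped.vlm_with_expert.processor.tokenizer
```

Because lookup falls through to the wrapped object, existing call sites keep working while the base model class itself stays untouched.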

TL;DR of the architecture

Before (inheritance, cross-coupled)

RTCSmolVLAPolicy (inherits) → SmolVLAPolicy (inference model)
    └── has .model = RTCVLAFlowMatching (inherits) → VLAFlowMatching (flow model)
         └── background thread & queues lived here

After (composition, DI)

RTCSmolVLAPolicy                 # thin shim the user instantiates
    └── owns RTCPolicy                         # policy wrapper (scheduling, queues for actions)
           ├── inference_model: SmolVLAPolicy  # UNCHANGED base policy (preprocessing / postprocessing)
           └── model: RTCPolicyFlowModel       # flow wrapper (ΠGDM + queues + thread)
                    └── flow_model: VLAFlowMatching  # UNCHANGED base flow model (no queues)

Legend:

  • owns = composition (has-a)
  • inherits = subclassing (is-a)

What each piece does

RTCPolicy (policy wrapper)

  • Owns inference model (e.g., SmolVLAPolicy) and RTCPolicyFlowModel.

  • Implements Algorithm 1 policy duties:

    • delay buffer $Q$, with the conservative estimate $d = \max(Q)$
    • compute $s = \max(d, s_{\min})$; start the next inference exactly when $t = s$
    • swap as soon as the new chunk is ready; re-index by $\delta = t - s$
    • owns the action queue exposed to the environment
  • Builds the prepared inputs (images/state/lang) using the inference model’s preprocessors.

  • Post-processes chunks using the inference model (unnormalize, optional π-ALOHA).
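
The scheduling arithmetic in the list above can be sketched as follows. This is an illustration only, not the PR's `RTCPolicy`: the class name `ChunkScheduler`, its parameters, and the buffer size are hypothetical. The sketch also starts inference when `t >= s` rather than strictly `t == s`, an assumption made here so that a start is not missed if `t` is already past `s` right after a swap.

```python
from collections import deque


class ChunkScheduler:
    """Hypothetical sketch of the delay-buffer scheduling described above."""

    def __init__(self, s_min, buffer_size=10, initial_delay=0):
        self.s_min = s_min
        # Delay buffer Q: recently observed inference delays, in control steps.
        self.delays = deque([initial_delay], maxlen=buffer_size)
        self.t = 0  # steps executed from the current chunk
        self.start_step = None  # value of t when the pending inference started

    def conservative_delay(self):
        return max(self.delays)  # d = max(Q)

    def start_threshold(self):
        return max(self.conservative_delay(), self.s_min)  # s = max(d, s_min)

    def step(self):
        """Advance one control step; return True if inference should start now."""
        self.t += 1
        if self.start_step is None and self.t >= self.start_threshold():
            self.start_step = self.t
            return True
        return False

    def swap(self):
        """Call when the new chunk is ready; return the index to resume from."""
        delta = self.t - self.start_step  # delta = t - s
        self.delays.append(delta)  # record the observed delay in Q
        self.t = delta  # re-index into the new chunk
        self.start_step = None
        return delta


sched = ChunkScheduler(s_min=2, initial_delay=1)
started = [sched.step() for _ in range(2)]  # inference starts on the second step
sched.step()
sched.step()  # two more steps elapse while inference runs
resume_index = sched.swap()  # new chunk is consumed from this index onward
```

Taking the maximum over the buffer makes the delay estimate pessimistic, which matches the "conservative" wording above: a single slow inference keeps `d` high until it ages out of the buffer.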

RTCPolicyFlowModel (flow wrapper)

  • Owns input/output queues and a background thread.

  • Runs ΠGDM guided inpainting (Eqs. 1–5) against the flow model (e.g., VLAFlowMatching):

    • prefix embedding + KV cache
    • velocity(A, τ) calls into the flow model
    • exact soft mask $W$ (Eq. 5)
    • guidance via VJP of $f(A) = A + (1-\tau)\,v_\pi$
  • Proxies attributes to the underlying flow model so user code still sees the same public surface (tokenizer, etc.).
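
The "input/output queues plus a background thread" arrangement can be sketched with the standard library alone. Everything here is an assumption for illustration: `BackgroundFlowWorker` is a hypothetical name, and the injected `generate_chunk` callable merely stands in for guided inpainting against the flow model.

```python
import queue
import threading


class BackgroundFlowWorker:
    """Hypothetical sketch: a worker thread fed by an input queue, emitting to an output queue."""

    _STOP = object()  # sentinel that tells the worker to exit

    def __init__(self, generate_chunk):
        # generate_chunk is injected; here it stands in for guided inpainting.
        self._generate_chunk = generate_chunk
        self.inputs = queue.Queue()
        self.outputs = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            request = self.inputs.get()  # blocks until a request arrives
            if request is self._STOP:
                break
            self.outputs.put(self._generate_chunk(request))

    def submit(self, request):
        self.inputs.put(request)

    def poll(self):
        """Return a finished chunk, or None if nothing is ready yet."""
        try:
            return self.outputs.get_nowait()
        except queue.Empty:
            return None

    def close(self):
        self.inputs.put(self._STOP)
        self._thread.join()


worker = BackgroundFlowWorker(lambda prefix: [x + 1 for x in prefix])
worker.submit([1, 2, 3])
chunk = worker.outputs.get(timeout=5)  # blocking wait, for the demo only
worker.close()
```

A non-blocking `poll` is what lets the policy side keep emitting actions from the current chunk and swap only once a new one is available.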

Base models (unchanged)

  • Inference model (e.g., SmolVLAPolicy): provides preprocessing, postprocessing, and holds the flow model in .model.
  • Flow model (e.g., VLAFlowMatching): provides embed_prefix, embed_suffix, vlm_with_expert.forward, action_out_proj, sample_actions, sample_noise.
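
Since the flow wrapper relies only on the attributes listed above, swapping in another flow model amounts to satisfying that surface. A small duck-typing check makes this concrete; the attribute names come from the list, while `REQUIRED_FLOW_ATTRS` and `missing_flow_attrs` are hypothetical helpers that are not part of this PR.

```python
from types import SimpleNamespace

# Names taken from the list above; the check itself is a hypothetical helper.
REQUIRED_FLOW_ATTRS = (
    "embed_prefix",
    "embed_suffix",
    "vlm_with_expert",
    "action_out_proj",
    "sample_actions",
    "sample_noise",
)


def missing_flow_attrs(flow_model):
    """Return the required attribute names that flow_model does not provide."""
    return [name for name in REQUIRED_FLOW_ATTRS if not hasattr(flow_model, name)]


# Stand-ins: one object exposing the full surface, one exposing only part of it.
complete = SimpleNamespace(**{name: object() for name in REQUIRED_FLOW_ATTRS})
partial = SimpleNamespace(embed_prefix=object(), sample_noise=object())
```

Running such a check at construction time would surface an incompatible flow model immediately instead of failing later inside the background thread.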

Important: No queues live on the base models anymore.


Directory layout

lerobot/src/lerobot/policies/
├── rtc/
│   ├── __init__.py                 # exports RTCPolicy, RTCPolicyFlowModel
│   ├── model_wrapper.py            # RTCPolicyFlowModel (flow wrapper + thread + queues)
│   ├── policy.py                   # RTCPolicy (policy wrapper + scheduling + action queue)
│   ├── configuration_rtc_smolvla.py  # adds RTCSmolVLAConfig (config)
│   ├── rtc_smolvla.py              # RTCSmolVLAPolicy (thin shim; relays to RTCPolicy)
│   └── configuration_pi0.py        # adds PI0ConfigRTC (future)
├── smolvla/
│   ├── modeling_smolvla.py         # UNCHANGED (base flow + smol policy)
│   └── configuration_smolvla.py    # UNCHANGED
├── pi0/
│   ├── modeling_pi0.py             # UNCHANGED (future)
│   └── configuration_pi0.py        # UNCHANGED (future)
└── factory.py                      # registers "rtc_smolvla"; pi0 to be registered in the future
