This repository contains the code for the paper Realtime-VLA V2: Learning to Run VLAs Fast, Smooth, and Accurate, and provides a deployment stack for real-world dual-arm manipulation with fast, smooth, and accurate execution.
When deploying VLA models to real-world robotic tasks, execution speed matters. Beyond fast GPU inference, this project focuses on the remaining bottlenecks in the full deployment stack, including calibration, action execution, control, and learning-based speed selection. The end-to-end result is that on real-world tasks requiring both dexterity and accuracy, the robot can execute about 3x faster than a standard baseline, reaching casual human speed while staying close to the robot hardware limit.
The repository contains:
- `server/`: remote inference service with Pi05 JAX and Pi05 Triton backends, together with time-axis action planning
- `client/`: local runtime stack including robot and camera I/O, observer / actuator bindings, executor implementations, aligned logging, asynchronous video recording, and YAML-based task switching
- modular builder entrypoints in `server/builders.py` and `client/builders.py`, which make it easy to extend the codebase with custom model backends, robots, observers, actuators, executors, and task configurations
The Triton backend is built on top of dexmal/realtime-vla and extends it with realtime chunking / action prefill style usage from Training-Time Action Conditioning for Efficient Real-Time Chunking.
The table below lists task demos and runtime logs.
| Task | Demo Video | RRD Log |
|---|---|---|
| Cloth Folding | Demo | RRD |
| Chip Placement | Demo | RRD |
| Box Placement | Demo | RRD |
The commands below assume you are in the repository root.
```bash
conda create -n realtime-vla-v2 python=3.10 -y
conda activate realtime-vla-v2
python -m pip install --upgrade pip
pip install -r requirements.txt
```

Notes:
- The server is intended to run on an NVIDIA GPU machine compatible with your `torch` and `triton` installation.
- The repository provides a `mock` configuration for running through the end-to-end code path without real robot hardware.
- For real robot deployment, `airbot_real` corresponds to the AIRBOT W1 SDK.
- If you use a different robot stack, you can add your own robot configuration by writing new implementations and registering them in `client/builders.py`.
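As a rough illustration of the extension point described above, the sketch below shows one common way such builder registration is structured: a name-to-class registry populated by a decorator. The actual API in `client/builders.py` may differ; `ROBOT_BUILDERS`, `register_robot`, `build_robot`, and `MyRobot` are hypothetical names used only for this example.

```python
# Hypothetical sketch of a builder registry, NOT the repository's actual API.
# All names here (ROBOT_BUILDERS, register_robot, build_robot, MyRobot) are
# illustrative assumptions.
ROBOT_BUILDERS = {}

def register_robot(name):
    """Decorator that registers a robot class under a YAML-visible name."""
    def wrap(cls):
        ROBOT_BUILDERS[name] = cls
        return cls
    return wrap

@register_robot("my_robot")
class MyRobot:
    def __init__(self, config):
        self.config = config

    def connect(self):
        # Open the connection to your robot's SDK here.
        pass

def build_robot(config):
    """Instantiate the robot class named in the config dict."""
    return ROBOT_BUILDERS[config["robot"]["type"]](config)

robot = build_robot({"robot": {"type": "my_robot"}})
```

With this pattern, switching robots only requires changing the `type` string in the task YAML, which matches the YAML-driven task switching the client already uses.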
All runtime parameters are configured in YAML.
Choose one matching server config and one matching client config for the same task.
Cloth Folding:

Server:

```bash
python server/infer_server.py --config server/config_cloth.yaml
```

Client:

```bash
python client/local_client.py --config client/config_cloth.yaml
```

Chip Placement:

Server:

```bash
python server/infer_server.py --config server/config_chip.yaml
```

Client:

```bash
python client/local_client.py --config client/config_chip.yaml
```

Box Placement:

Server:

```bash
python server/infer_server.py --config server/config_box.yaml
```

Client:

```bash
python client/local_client.py --config client/config_box.yaml
```

Mock (no real hardware):

Client:

```bash
python client/local_client.py --config client/config_mock.yaml
```

The client saves runtime outputs to the directory specified by `visualization.output_dir` in the selected YAML.
Recording includes:
- aligned trajectory logs in `jsonl`
- asynchronous multi-camera video writing

In the `rrd` logs:

- `actual_action` denotes the delay-aligned measured robot state
- for MPC tasks, `raw_pre_mpc_action` denotes the direct model output, `pre_mpc_action` denotes the time-parameterized trajectory before local MPC, and `post_mpc_action` denotes the locally optimized command actually sent to the robot
- for smooth / raw-action tasks, `raw_pre_smooth_action` denotes the direct model output, `pre_smooth_action` denotes the time-parameterized trajectory before local smoothing, and `post_smooth_action` denotes the locally smoothed / tracked command actually sent to the robot
- inference-complete markers are overlaid on `pre_mpc_action` or `pre_smooth_action` to show inference timing along the trajectory
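A `jsonl` trajectory log can be consumed line by line with the standard library. The snippet below is a minimal sketch of that workflow; the field names (`t`, `actual_action`) are assumptions for illustration, so inspect the files your own run produces for the real schema.

```python
import io
import json

# Minimal sketch of parsing an aligned jsonl trajectory log.
# The schema below ("t", "actual_action") is an assumption, not the
# repository's documented format; check your own log files.
log_text = "\n".join([
    json.dumps({"t": 0.00, "actual_action": [0.1, 0.2]}),
    json.dumps({"t": 0.05, "actual_action": [0.12, 0.21]}),
])

# Each line is one JSON record; io.StringIO stands in for an open file.
records = [json.loads(line) for line in io.StringIO(log_text)]

timestamps = [r["t"] for r in records]
# Interval between consecutive log entries, useful for checking that the
# control loop ran at the expected rate.
dts = [b - a for a, b in zip(timestamps, timestamps[1:])]
```

The same loop works for the multi-stream case (e.g. comparing `pre_mpc_action` against `post_mpc_action` per timestamp) once you know the actual field names.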
If you find this work useful, please cite:
```bibtex
@article{yang2026realtimevlav2,
  title={Realtime-VLA V2: Learning to Run VLAs Fast, Smooth, and Accurate},
  author={Yang, Chen and Hu, Yucheng and Ma, Yunchao and Yang, Yunhuan and Tan, Jing and Fan, Haoqiang},
  journal={arXiv preprint arXiv:2603.26360},
  year={2026}
}
```