Hello,
I am following the Dexbotic quick-start tutorial, but I am stuck on part 3 (train on provided simulator data). I am on Ubuntu 20.04; I have an NVIDIA RTX 4000 Ada Generation (1 GPU); and I am using the dexmal/dexbotic Docker image (latest).
Dataset Setup
I downloaded the simulation data from HuggingFace and mounted it into the container. Here is my directory structure:
Docker Command
The dataset and Dexbotic are stored on an external SSD mounted at '/media/juan_rojas/five'.
docker run -it --rm --gpus all --network host \
  -v /media/juan_rojas/five/dexbotic:/dexbotic \
  -v /media/juan_rojas/five/data:/data \
  dexmal/dexbotic bash
Launch Training (Libero Example)
I perform training using only 1 GPU.
torchrun --nproc_per_node=1 dexbotic/playground/benchmarks/libero/libero_cogact.py
Error
`Traceback (most recent call last):
File "//dexbotic/playground/benchmarks/libero/libero_cogact.py", line 83, in <module>
exp.train()
File "/app/dexbotic/exp/base_exp.py", line 734, in train
self._initialize_train()
File "/app/dexbotic/exp/base_exp.py", line 660, in _initialize_train
self._auto_compute_norm_stats()
File "/app/dexbotic/exp/base_exp.py", line 717, in _auto_compute_norm_stats
norm_config.compute_norm_stats(self.data_config.dataset_name)
File "/app/dexbotic/exp/base_exp.py", line 476, in compute_norm_stats
dataset_list = self._get_dataset(action_process_func, dataset_name_list)
File "/app/dexbotic/exp/base_exp.py", line 488, in _get_dataset
robot_dataset = DexDataset(
File "/app/dexbotic/data/dataset/dex_dataset.py", line 36, in __init__
self._build_dataset_from_name(data_args.dataset_name)
File "/app/dexbotic/data/dataset/dex_dataset.py", line 70, in _build_dataset_from_name
dataset = CONVERSATION_DATA[name]
KeyError: 'libero_goal'
[2026-03-20 23:02:29,700] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 5813) of binary: /opt/conda/envs/dexbotic/bin/python3.10
Traceback (most recent call last):
File "/opt/conda/envs/dexbotic/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/opt/conda/envs/dexbotic/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
return f(*args, **kwargs)
File "/opt/conda/envs/dexbotic/lib/python3.10/site-packages/torch/distributed/run.py", line 812, in main
run(args)
File "/opt/conda/envs/dexbotic/lib/python3.10/site-packages/torch/distributed/run.py", line 803, in run
elastic_launch(
File "/opt/conda/envs/dexbotic/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 135, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/envs/dexbotic/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 268, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
dexbotic/playground/benchmarks/libero/libero_cogact.py FAILED
Failures:
<NO_OTHER_FAILURES>
Root Cause (first observed failure):
[0]:
time : 2026-03-20_23:02:29
host : baxter
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 5813)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
`
It seems like libero_goal is not found in CONVERSATION_DATA. How do I solve this problem?
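In the traceback, the script is loaded from the mounted '/dexbotic' directory while the library code runs from '/app/dexbotic', so one thing that may be worth checking is which copy of the package is actually imported and which dataset names it registers. Below is a minimal diagnostic sketch, not a confirmed fix. The helper names `module_origin` and `explain_missing_key` are hypothetical, and it assumes that `CONVERSATION_DATA` is a dict-like object reachable as a module-level name in `dexbotic.data.dataset.dex_dataset` (the file named in the traceback); if it is imported differently, the lookup simply reports that it could not be found.

```python
import difflib
import importlib
import importlib.util


def module_origin(module_name):
    """Return the file a module would be imported from, or None if it is not importable."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


def explain_missing_key(registry, name):
    """Describe whether `name` is a key of a dict-like registry, listing near matches if not."""
    if name in registry:
        return f"{name!r} is registered"
    keys = sorted(registry)
    close = difflib.get_close_matches(name, keys, n=3)
    return f"{name!r} is not registered; close matches: {close}; all keys: {keys}"


if __name__ == "__main__":
    # Shows whether Python picks up /app/dexbotic or the mounted /dexbotic copy.
    print("dexbotic imported from:", module_origin("dexbotic"))

    # Assumption: CONVERSATION_DATA is visible as a global of this module.
    try:
        mod = importlib.import_module("dexbotic.data.dataset.dex_dataset")
        registry = getattr(mod, "CONVERSATION_DATA", None)
    except ImportError:
        registry = None

    if registry is None:
        print("Could not locate CONVERSATION_DATA; it may be defined elsewhere.")
    else:
        print(explain_missing_key(registry, "libero_goal"))
```

Running this inside the container with the same Python environment as the training command (/opt/conda/envs/dexbotic) would show whether 'libero_goal' is absent entirely or registered under a slightly different name.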
Thank you.