Hi, thanks for releasing DM0!
I have a question about downstream fine-tuning from DM0-base to specific embodied tasks such as RobotChallenge. During task-specific embodied fine-tuning, is the LLM/VLM backbone frozen, fully fine-tuned, or adapted with something like LoRA/PEFT? I couldn't tell from the paper what the default setup for the language backbone is at this stage.
I also noticed that the paper reports task fine-tuning lengths of roughly 40k–150k steps, and around 200k steps for some mixed-task settings. This seems fairly long, so I wanted to ask whether this is the expected convergence range for DM0 in practice, or whether it typically converges faster on downstream tasks.
Thanks!