Currently the B200 dgcx GCP cluster stores checkpoints on Lustre (the shared, cluster-level storage) instead of the node-local /raid/ NVMe, which leads to 1-2 hour checkpoint loads for Kimi K2.5.
Switching to /raid/ should give 6-7x more job completions per hour on B200.
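
Rough sketch of what the switch could look like on the job side: copy the checkpoint from Lustre to /raid/ once per node, then have every later job on that node load from the local copy. Everything named here is an assumption for illustration: the `stage_checkpoint` helper, the `/raid/ckpts` cache directory, and the Lustre path in the usage note are hypothetical, not the cluster's actual layout.

```python
import shutil
from pathlib import Path


def stage_checkpoint(shared_dir: str, local_root: str = "/raid/ckpts") -> Path:
    """Copy a checkpoint directory to node-local storage once; return the local path.

    `local_root` is a hypothetical cache location under /raid/.
    """
    src = Path(shared_dir)
    dst = Path(local_root) / src.name
    if dst.exists():
        # Already staged by an earlier job on this node: skip the copy.
        return dst

    # Copy into a temporary sibling directory first, so an interrupted copy
    # never leaves a half-written checkpoint at the final path.
    tmp = dst.with_name(dst.name + ".partial")
    if tmp.exists():
        shutil.rmtree(tmp)
    tmp.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(src, tmp)
    tmp.rename(dst)
    return dst
```

Usage would be something like `local = stage_checkpoint("/lustre/ckpts/kimi-k2.5")` (hypothetical path), then pointing the model loader at `local`. Only the first job on each node pays the Lustre read. This sketch does not handle two jobs staging the same checkpoint on one node at the same time; that would need a lock file or similar.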