Hi, thanks for your repository!
I noticed that the training script you provided loads the pretrained model EVAC (EnerV_AC_deepspeed_v0.1.pt). With it, training starts at a loss of around 0.06, and the output aligns well with the ground truth, which seems correct.
However, when I comment out pretrained_checkpoint in the YAML file and train from scratch, training begins at a loss of approximately 0.3, which then also decreases to around 0.06. Despite this, the logged output video still appears to be random noise.

Could you provide any insights or suggestions on how to address this issue? Thank you!