Closed
Description
Hi, I tried training the model with only LJ data and with only my own data, with fp16 and with fp32, on 1 GPU and on 3 GPUs, but in every case I get this:

The loss is always NaN.
When I start from the pretrained checkpoint, your code returns this:

I worked around that by changing `def load_checkpoint`, but the loss is still NaN.
Do you have any idea what I am doing wrong?