Hello! Great work. I ran into a couple of problems while trying to reproduce the results.
- Is it normal to get a large number of NaN loss values at the start of training?
- Could you please tell me approximately how long one epoch takes when training on audio and visual information at the same time?