While training a model on a GTX 1070 (8 GB), I get the error "NaN GradNorm. Exiting". I was using a batch size of 128. The training set consists of olivia.zip and the clean-360 subset of the LibriSpeech corpus.
Previously I experimented with batch sizes between 256 and 512 and got CUDA_OUT_OF_MEMORY errors. I couldn't eliminate those errors; configuring the TensorFlow session to use only 90% of the available GPU memory didn't help:
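For context, here is a minimal sketch of the kind of check that usually produces a message like this one. I am assuming the trainer computes a global L2 norm over all gradients and aborts when it is not finite; this is not the project's actual code, and `global_grad_norm` and `check_grad_norm` are hypothetical names used only for illustration.

```python
import math


def global_grad_norm(grads):
    """L2 norm over every gradient value (grads is a list of lists of floats)."""
    return math.sqrt(sum(g * g for layer in grads for g in layer))


def check_grad_norm(grads):
    """Return the global norm, or raise if it is NaN or infinite.

    Assumption: the real trainer aborts in a similar way; the message
    below only mirrors the text quoted in this issue.
    """
    norm = global_grad_norm(grads)
    if not math.isfinite(norm):
        raise FloatingPointError("NaN GradNorm. Exiting")
    return norm
```

A single NaN or infinite gradient value anywhere in the model is enough to make the whole norm non-finite, which is why such a check stops training on the first bad step.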
sess_conf.gpu_options.allow_growth = True
sess_conf.gpu_options.per_process_gpu_memory_fraction = 0.9
Do you have any hints on how to overcome the "NaN GradNorm" error?