NaN GradNorm. Exiting #1

@chrshdl

I am trying to train a model on a GTX 1070 (8 GB), and training aborts with the error "NaN GradNorm. Exiting". I was using a batch size of 128. The training set consists of olivia.zip and the clean-360 subset of the LibriSpeech corpus.

Previously I experimented with batch sizes between 256 and 512 and got CUDA_OUT_OF_MEMORY errors. I couldn't eliminate those errors; configuring the TensorFlow session to use only 90% of the available GPU memory did not help:

        # sess_conf is the session config (a tf.ConfigProto) passed to the TensorFlow session
        sess_conf.gpu_options.allow_growth = True
        sess_conf.gpu_options.per_process_gpu_memory_fraction = 0.9

Do you have any hints on how to overcome this error?
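
For context, a message like this is typically raised by a guard on the global gradient norm. The following is a minimal sketch of such a guard combined with gradient clipping, a common mitigation. It is not the project's actual code: the function names `global_grad_norm` and `clip_by_global_norm`, the plain-list gradient format, and the reuse of the error text are all assumptions made for illustration.

```python
import math


def global_grad_norm(grads):
    """Return the L2 norm over all gradient values.

    `grads` is a list of flat lists of floats, one per parameter tensor
    (a hypothetical stand-in for real framework tensors).
    """
    return math.sqrt(sum(g * g for layer in grads for g in layer))


def clip_by_global_norm(grads, max_norm):
    """Scale all gradients so their global norm is at most `max_norm`.

    Raises FloatingPointError if the norm is NaN or infinite, mimicking
    a "NaN GradNorm" style guard (illustrative only).
    """
    norm = global_grad_norm(grads)
    if not math.isfinite(norm):
        raise FloatingPointError("NaN GradNorm. Exiting")
    # Leave gradients untouched when they are already small enough.
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    clipped = [[g * scale for g in layer] for layer in grads]
    return clipped, norm


if __name__ == "__main__":
    clipped, norm = clip_by_global_norm([[3.0], [4.0]], max_norm=1.0)
    print(norm, clipped)
```

Clipping only bounds large but finite gradients. If the norm is already NaN, the cause lies earlier in the computation, and a lower learning rate or a check of the input data is the usual next thing to try.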
