
RuntimeError: Magnitude of gradient is bad: inf #28

@robvanderg

Description


train.conllu.txt
tune.conllu.txt

While tuning some of the parameters across multiple treebanks (1,296 runs in total), about 60 of the runs failed with this error:

  File "uuparser/parser.py", line 325, in main
    run(experiment,options)
  File "uuparser/parser.py", line 51, in run
    parser.Train(traindata,options)
  File "/data/rob/datasplits/uuparser/uuparser/arc_hybrid.py", line 448, in Train
    self.trainer.update()
  File "_dynet.pyx", line 6198, in _dynet.Trainer.update
  File "_dynet.pyx", line 6203, in _dynet.Trainer.update
RuntimeError: Magnitude of gradient is bad: inf

After re-running the exact same commands, the error persisted for 24 of these 60 runs. I could not find a clear pattern for why or when this happens, and it occurred on two different machines.
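For reference, the check DyNet is performing here can be mimicked in plain Python: it computes the magnitude (L2 norm) of the gradient before applying an update and aborts if it is non-finite. This is only an illustrative sketch (the function names and the gradient values are hypothetical, not DyNet internals):

```python
import math

def gradient_magnitude(grads):
    """L2 norm of a flat list of gradient values."""
    return math.sqrt(sum(g * g for g in grads))

def check_gradient(grads):
    """Mimic the sanity check behind the error above:
    raise if the gradient norm is inf or NaN."""
    norm = gradient_magnitude(grads)
    if math.isinf(norm) or math.isnan(norm):
        raise RuntimeError("Magnitude of gradient is bad: %s" % norm)
    return norm

# A single overflowing gradient entry poisons the whole norm,
# so one bad batch is enough to trigger the crash:
# check_gradient([0.1, float("inf"), 0.2])  -> RuntimeError
```

This also suggests why the failure is hard to reproduce deterministically: it only takes one batch whose gradient overflows at some point during training.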

One example where this occurs is the following command:
python3 uuparser/parser.py --trainfile ../ --devfile ../newsplits-v2.7/UD_Arabic-PADT/tune.conllu --learning-rate 0.01 --word-emb-size 100 --char-emb-size 500 --no-bilstms 1 --outdir models/UD_Arabic-PADT.0.0.01.100.500.1

I have attached the files used; the error occurs at epoch 11.

PS: This can be worked around by continuing training after the crash with --continue (which I have now done for my own experiments).
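The --continue workaround can be automated with a small wrapper around the epoch loop: catch the bad-gradient crash, reload the last checkpoint, and keep going. This is a hedged sketch, not uuparser code; `train_epoch` and `load_checkpoint` are hypothetical callbacks standing in for one training epoch and for resuming from the last saved model:

```python
def train_with_resume(train_epoch, load_checkpoint, num_epochs=20, max_retries=3):
    """Run training epochs, resuming from the last checkpoint when a
    'Magnitude of gradient is bad' crash occurs (hypothetical wrapper
    mirroring a manual re-run with --continue)."""
    retries = 0
    epoch = 0
    while epoch < num_epochs:
        try:
            train_epoch(epoch)
            epoch += 1
        except RuntimeError as e:
            if "Magnitude of gradient is bad" not in str(e) or retries >= max_retries:
                raise  # unrelated error, or we keep crashing: give up
            retries += 1
            epoch = load_checkpoint()  # resume from the last saved epoch
    return epoch
```

Capping the retries matters: if the run crashes at the same point every time, the wrapper re-raises instead of looping forever.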
