Description
Attached files: train.conllu.txt, tune.conllu.txt
While tuning some of the parameters across multiple treebanks (1296 total runs), I got this error for about 60 of the runs:
File "uuparser/parser.py", line 325, in main
run(experiment,options)
File "uuparser/parser.py", line 51, in run
parser.Train(traindata,options)
File "/data/rob/datasplits/uuparser/uuparser/arc_hybrid.py", line 448, in Train
self.trainer.update()
File "_dynet.pyx", line 6198, in _dynet.Trainer.update
File "_dynet.pyx", line 6203, in _dynet.Trainer.update
RuntimeError: Magnitude of gradient is bad: inf
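For reference, the exception comes out of DyNet's Trainer.update(), so one possible low-level mitigation (independent of uuparser) would be to guard the update call and skip the step whose gradient blew up. Below is a minimal, hypothetical sketch of such a guarded step; `compute_loss` and `batches` are placeholder names and not part of uuparser:

```python
import dynet as dy

# Minimal, hypothetical guarded training step -- not uuparser's actual code.
pc = dy.ParameterCollection()
trainer = dy.AdamTrainer(pc, alpha=0.01)
trainer.set_clip_threshold(5.0)   # gradient clipping; does not prevent inf/nan norms

def train_one_epoch(batches, compute_loss):
    for batch in batches:
        dy.renew_cg()                 # fresh computation graph per step
        loss = compute_loss(batch)    # a dynet Expression
        loss.forward()
        loss.backward()
        try:
            trainer.update()
        except RuntimeError as err:
            # DyNet raises "Magnitude of gradient is bad: inf" from update().
            # Skipping keeps the run alive, but the bad gradient may still be
            # pending, so resuming from the last saved model (roughly what
            # --continue does at the whole-run level, see the P.S. below) is
            # the safer recovery.
            print(f"Skipped update after bad gradient: {err}")
```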
After re-running the exact same commands, I still got the error for 24 of those 60 runs. I could not find a clear pattern in why or when this happens, and it occurred on two different machines.
One example where this occurs is the following command:
```
python3 uuparser/parser.py --trainfile ../ --devfile ../newsplits-v2.7/UD_Arabic-PADT/tune.conllu --learning-rate 0.01 --word-emb-size 100 --char-emb-size 500 --no-bilstms 1 --outdir models/UD_Arabic-PADT.0.0.01.100.500.1
```
I have attached the files used; the error occurs at epoch 11.
P.S. This can be worked around by continuing training after the crash using --continue, which is what I have now done for my own experiments.
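Since the runs are scripted, that recovery can also be automated by relaunching a crashed job with --continue. The sketch below assumes the flag simply resumes the interrupted run, and it abbreviates the flag list from the example command above:

```python
import subprocess

# Rough sketch of automating the --continue recovery across scripted runs.
base_cmd = [
    "python3", "uuparser/parser.py",
    # ... --trainfile / --devfile / hyper-parameter flags as above ...
    "--outdir", "models/UD_Arabic-PADT.0.0.01.100.500.1",
]

max_restarts = 3
for attempt in range(max_restarts + 1):
    # First attempt runs from scratch; later attempts add --continue.
    cmd = base_cmd if attempt == 0 else base_cmd + ["--continue"]
    if subprocess.run(cmd).returncode == 0:
        break   # training finished normally
    print("Run crashed, retrying with --continue")
```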