Issue with resuming training #31

Hi team - a question about the following line of code:
https://github.com/CUMLSec/trex/blob/b4b7b972e662307a1437d71b0cffad3eed782be6/fairseq/tasks/trex.py#L144

What is the purpose of this line? When training resumes from a checkpoint whose epoch count exceeds 28, the computed value is greater than 1 and the assertion fails. I also note that when training starts from epoch 0 with no interruption, `mask_prob` stays fixed at its initialized value (0.2), but after any interruption training resumes with a higher mask probability.
Should this just be `mask_prob = self.args.mask_prob`?
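
To make the concern concrete, here is a minimal sketch of the behaviour I am describing. It assumes the linked line computes an epoch-dependent value that grows roughly linearly; I have not reproduced the actual formula here, so `scheduled_mask_prob`, `clamped_mask_prob`, `fixed_mask_prob`, and the `rate` parameter are all hypothetical names for illustration, not code from the repository.

```python
def scheduled_mask_prob(base: float, epoch: int, rate: float) -> float:
    """Hypothetical schedule: grows with the epoch and has no upper bound.

    ASSUMPTION: the real line may use a different formula; `rate` is invented.
    """
    return base + rate * epoch


def clamped_mask_prob(base: float, epoch: int, rate: float, ceiling: float = 1.0) -> float:
    """Same hypothetical schedule, capped so it stays a valid probability."""
    return min(scheduled_mask_prob(base, epoch, rate), ceiling)


def fixed_mask_prob(base: float, epoch: int) -> float:
    """The alternative proposed above: ignore the epoch entirely."""
    return base


if __name__ == "__main__":
    base = 0.2   # initialized mask_prob mentioned above
    rate = 0.03  # made-up growth rate, for illustration only
    for epoch in (0, 10, 40):
        print(
            epoch,
            scheduled_mask_prob(base, epoch, rate),
            clamped_mask_prob(base, epoch, rate),
            fixed_mask_prob(base, epoch),
        )
```

If the growth with the epoch is intended, a clamp like the second function would avoid the assertion failure on resume. If it is not intended, the fixed version matches what happens in an uninterrupted run.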
