Thanks for your code and paper.
I notice that there is no validation set in the training stage, and that the training process is stopped when the loss converges. I am curious how you define "convergence" and how you avoid overfitting, since the loss may fluctuate.