Hello researchers,
This is very interesting work! However, I have a question about the number of training epochs. Why is such a large number (e.g., 60k or 80k) set in the example? Even on an A800 GPU, training for that long is very time-consuming.
I would like to know:
1. Why was this specific number of epochs chosen?
2. Does the number of epochs have a significant impact on model performance?
Thank you for your attention to this issue!