I saw that the ground truth is the entropy of the model's predicted distribution over the vocabulary. However, that distribution changes as training progresses, so the target itself keeps shifting and training becomes like chasing a moving target. What is your opinion on this?