Hi, thanks for your work.
I'm wondering why there is no dropout in your model. Dropout acts as a form of data augmentation and can substantially improve the performance of the consistency loss. Could its absence explain why the Pi model performs so differently from VAT?
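To make the concern concrete, here is a minimal sketch in plain Python of a Pi-model-style consistency term. It is not taken from this repository: the one-layer classifier, the weights, the input, and the function names (`forward`, `consistency_loss`) are all hypothetical, and it assumes dropout is the only source of noise (no input augmentation or Gaussian noise). Under that assumption, the two forward passes are identical when dropout is disabled, so the consistency term is exactly zero and gives no training signal.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dropout(values, rate, rng):
    # Inverted dropout: zero each unit with probability `rate`,
    # rescale the survivors by 1 / (1 - rate).
    if rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def forward(x, weights, rate, rng):
    # Hypothetical one-layer classifier: dropout on the input,
    # then a linear map followed by softmax.
    h = dropout(x, rate, rng)
    logits = [sum(w * v for w, v in zip(row, h)) for row in weights]
    return softmax(logits)


def consistency_loss(x, weights, rate, rng):
    # Two stochastic passes over the same input; the loss is the
    # mean squared difference between the two predictions.
    p1 = forward(x, weights, rate, rng)
    p2 = forward(x, weights, rate, rng)
    return sum((a - b) ** 2 for a, b in zip(p1, p2)) / len(p1)


rng = random.Random(0)
x = [0.5, -1.2, 0.3, 2.0]  # made-up input
weights = [  # made-up weights for 3 classes
    [0.1, -0.4, 0.7, 0.2],
    [0.3, 0.8, -0.5, 0.1],
    [-0.6, 0.2, 0.4, -0.3],
]

loss_no_dropout = consistency_loss(x, weights, 0.0, rng)
loss_with_dropout = consistency_loss(x, weights, 0.5, rng)
print(loss_no_dropout, loss_with_dropout)
```

If the model already perturbs its inputs in some other way, the first value would of course not be zero, so this only illustrates the case I am asking about.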
Thank you