Hello, I noticed something unexpected while testing the traditional KD method on CIFAR-100 in the Res56-20 setting. Across multiple runs, my final result reached 71.54, which is well above the numbers reported in your experiments and in existing papers. My setup is Python 3.8 running on an A800. What might explain this difference?