Why is the first term in loss_robust equal to F.log_softmax(model(x_adv))? I have tried generating adversarial samples with normal PGD and setting the loss to the TRADES loss, but I cannot understand why the log is applied there. I think I would prefer to use criterion_kl(F.softmax(model(x_adv)), F.softmax(model(x_natural))). Could someone answer my question? I would appreciate it.
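
To make the comparison concrete, here is a minimal sketch of the two variants in plain Python. It assumes criterion_kl is PyTorch's nn.KLDivLoss with default log_target=False, whose documented pointwise formula is target * (log(target) - input), so the first argument is expected to be log-probabilities. The logits and the helper names (kl_div_sum, true_kl) are hypothetical and only for illustration; this is not the TRADES code itself.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def log_softmax(logits):
    # Numerically stable log-softmax via the log-sum-exp trick.
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def kl_div_sum(inp, target):
    # Assumption: mimics the pointwise formula documented for nn.KLDivLoss
    # (log_target=False), target * (log(target) - inp), summed over classes.
    return sum(t * (math.log(t) - x) for x, t in zip(inp, target) if t > 0)

def true_kl(p, q):
    # Textbook KL(p || q) for two probability vectors.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits standing in for model(x_natural) and model(x_adv).
logits_natural = [2.0, 0.5, -1.0]
logits_adv = [0.3, 1.2, -0.4]

p_nat = softmax(logits_natural)

# Variant with log_softmax as the first argument.
with_log_input = kl_div_sum(log_softmax(logits_adv), p_nat)

# Variant proposed in the question: softmax as the first argument.
with_prob_input = kl_div_sum(softmax(logits_adv), p_nat)

# Reference value: KL(p_natural || p_adv).
reference = true_kl(p_nat, softmax(logits_adv))

print(with_log_input, with_prob_input, reference)
```

Under that assumption, the log_softmax variant matches KL(p_natural || p_adv), while the softmax variant evaluates a different quantity that comes out negative, so it is no longer a KL divergence.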