As I read in the PyTorch documentation, `nn.CrossEntropyLoss()` is a criterion that combines `nn.LogSoftmax()` and `nn.NLLLoss()`. Why, then, is there still an `nn.LogSoftmax()` layer as the last layer of `RawNet()`? And why do I get bad results if I remove that `nn.LogSoftmax()` layer?
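
For reference, the equivalence from the documentation can be checked with a minimal sketch. The tensors below are made-up stand-ins (random `logits`, arbitrary `targets`), not actual `RawNet()` outputs. It also checks what happens to the loss value when `nn.CrossEntropyLoss()` is fed outputs that already went through `nn.LogSoftmax()`:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in data: 4 samples, 3 classes (not RawNet outputs).
logits = torch.randn(4, 3)
targets = torch.tensor([0, 2, 1, 2])

# CrossEntropyLoss applied directly to raw logits.
ce = nn.CrossEntropyLoss()(logits, targets)

# LogSoftmax followed by NLLLoss, the combination described in the docs.
log_probs = nn.LogSoftmax(dim=1)(logits)
nll = nn.NLLLoss()(log_probs, targets)

# CrossEntropyLoss applied to log-probabilities, i.e. the case where the
# model already ends in LogSoftmax. Log-softmax of log-probabilities returns
# the same log-probabilities, so the loss value is unchanged.
ce_on_log_probs = nn.CrossEntropyLoss()(log_probs, targets)

print(ce.item(), nll.item(), ce_on_log_probs.item())
```

All three printed values should agree up to floating-point error, so on this toy data the extra `nn.LogSoftmax()` does not change the loss value itself.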