When I reproduce this code with the resnet18 model, training doesn't converge. Why might that be?