I am trying to reproduce RACNN performance in PyTorch, but there are no details about how to train the APN. Without pretraining, the rank loss does not decrease. Does this code actually work on CUB, i.e. does the APN loss decrease when training on it?
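For reference, here is a minimal sketch of the pairwise rank loss as I understand it from the paper: a hinge on the true-class probabilities of two adjacent scales, `max(0, p_coarse - p_fine + margin)`. The function name, the argument names, and the default `margin=0.05` are my own assumptions for illustration, not taken from this repository.

```python
import torch
import torch.nn.functional as F


def pairwise_rank_loss(logits_coarse, logits_fine, targets, margin=0.05):
    """Hinge-style rank loss between two adjacent scales (hypothetical helper).

    logits_coarse, logits_fine: (batch, num_classes) classifier outputs
    targets: (batch,) ground-truth class indices
    margin: assumed value; tune for your setup
    """
    idx = targets.unsqueeze(1)
    # Probability each scale assigns to the ground-truth class.
    p_coarse = F.softmax(logits_coarse, dim=1).gather(1, idx).squeeze(1)
    p_fine = F.softmax(logits_fine, dim=1).gather(1, idx).squeeze(1)
    # Zero once the finer scale beats the coarser one by at least `margin`.
    return torch.clamp(p_coarse - p_fine + margin, min=0).mean()
```

With this form, the loss stays at `margin` whenever both scales produce identical predictions, and it only gets a gradient through the finer scale's input crop, so it may plateau if the APN's initial attention region is poor.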