Thanks for your work. I have a question about the accuracy of the pre-trained models you released for Flickr30K. They give GSMN-dense rsum: 481.4 and GSMN-sparse rsum: 476.8, whereas the paper reports GSMN-dense rsum: 483.6 and GSMN-sparse rsum: 480.1.
Were two models ensembled during evaluation to obtain the dense and sparse results in the paper, as is done in SCAN?