Q: (1). In Table 4 of your paper, the last row (TFusion-sup) shows a rank-1 accuracy of 73.13%. My question is:
Your paper adopts DLCE as the supervised learning algorithm, and DLCE achieves a rank-1 accuracy of 79.51%. Can I say that your method degrades the performance of the supervised learning method, or that your method is better suited to the cross-dataset scenario? It would be great if you could give more details about this.
(Z. Zheng, L. Zheng, and Y. Yang. A discriminatively learned CNN embedding for person re-identification. TOMM, 2017)
A: We implemented DLCE in Keras and could not reach the 79.51% they reported, only 75%. Even with their MATLAB source code, we could reach only 77% rank-1 accuracy.
In Table 4, the TFusion-sup rank-1 accuracy is 73.13% because, when the visual classifier is very strong and much more powerful than the spatial-temporal model, the fusion model ends up slightly weaker than the visual classifier alone.
Therefore, our method is more suitable when the visual classifier is weak, as in the cross-dataset scenario and in visually hard scenarios such as GRID.
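To make the intuition above concrete, here is a toy sketch in Python. It is not the fusion formula from the paper: it assumes a plain element-wise product of a visual similarity score and a spatial-temporal probability, and every number in it is made up for illustration. The helper names `fuse` and `rank1` are hypothetical.

```python
# Toy illustration only. Assumptions: fusion is a simple element-wise
# product (NOT the paper's exact formula), and all scores are invented.

def fuse(visual, spatio_temporal):
    """Combine per-candidate scores by element-wise product."""
    return [v * s for v, s in zip(visual, spatio_temporal)]

def rank1(scores):
    """Return the index of the top-ranked gallery candidate."""
    return max(range(len(scores)), key=lambda i: scores[i])

TRUE_MATCH = 0  # index of the correct gallery identity in both cases

# Case 1: a strong visual classifier already ranks the true match first,
# but a noisy spatial-temporal model favours a wrong candidate.
strong_visual = [0.90, 0.60, 0.10]
noisy_st = [0.20, 0.50, 0.30]

# Case 2: a weak visual classifier ranks a wrong candidate first,
# and an informative spatial-temporal model corrects it.
weak_visual = [0.40, 0.45, 0.15]
helpful_st = [0.60, 0.30, 0.10]

if __name__ == "__main__":
    print("strong visual alone correct:", rank1(strong_visual) == TRUE_MATCH)
    print("strong visual fused correct:",
          rank1(fuse(strong_visual, noisy_st)) == TRUE_MATCH)
    print("weak visual alone correct:", rank1(weak_visual) == TRUE_MATCH)
    print("weak visual fused correct:",
          rank1(fuse(weak_visual, helpful_st)) == TRUE_MATCH)
```

In this toy setting, fusion turns a correct top-1 result into a wrong one in the first case and repairs a wrong one in the second, which mirrors the trade-off described in the answer.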