Hi,
I am trying to use the trained models provided by the authors. However, the detection output locations are wrong, and in many cases they fall outside the image!
To investigate further, I used the detection code in examples and compared the results from Wei's original SSD implementation with those from the new SSD model with ResNet101 introduced here. I tested the same image used in the examples (examples/images/fish-bike.jpg).
With the old SSD code and the VGG model I get the correct results:
[0.028087676, 0.23656183, 0.88743579, 0.95228869, 2, 0.82035357, u'bicycle']
[0.42205709, 0.026113272, 0.70970505, 0.51584023, 15, 0.99626094, u'person']
But with the new SSD model in this repository I get:
[3.6806207, 3.2651825, 3.9601259, 3.7642479, 1, 0.99327165, u'person']
[1.065011, 0.64031565, 1.3344085, 1.142953, 1, 0.9836536, u'person']
[255.99077, 256.33829, 256.99084, 256.97342, 2, 0.7926603, u'bicycle']
[14.477366, 14.709455, 15.462966, 15.438332, 2, 0.69000566, u'bicycle']
The first 4 numbers are the normalized detection locations [x_min, y_min, x_max, y_max], so after multiplying the x values by the image width and the y values by the image height (481 and 323 in this case), I should get bbox locations inside the tested image. This is the case with the original SSD models, where I get the correct locations:
[14, 76, 427, 308]
[203, 8, 341, 167]
but with the new SSD-ResNet model introduced here, I get the locations:
[1770, 1055, 1905, 1216]
[512, 207, 642, 369]
[123132, 82797, 123613, 83002]
[6964, 4751, 7438, 4987]
which place the bboxes outside the image! This is clearly a problem, since the normalized positions should not exceed 1.0. Note that the same problem happens when using the DSSD model.
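For reference, here is a minimal sketch of the check described above: it scales each normalized box by the image width and height and flags any box whose coordinates are not in [0, 1]. The helper names `to_pixels` and `is_normalized` are my own for illustration and are not part of the SSD code; the box values are copied from the outputs listed above.

```python
def to_pixels(box, width, height):
    """Scale a normalized [x_min, y_min, x_max, y_max] box to pixel coordinates."""
    x_min, y_min, x_max, y_max = box
    return [int(round(x_min * width)), int(round(y_min * height)),
            int(round(x_max * width)), int(round(y_max * height))]


def is_normalized(box):
    """Return True if all four coordinates lie in the range [0, 1]."""
    return all(0.0 <= v <= 1.0 for v in box)


# Image size of examples/images/fish-bike.jpg as reported above.
WIDTH, HEIGHT = 481, 323

# First four values of each detection from the two models.
vgg_boxes = [
    [0.028087676, 0.23656183, 0.88743579, 0.95228869],
    [0.42205709, 0.026113272, 0.70970505, 0.51584023],
]
resnet_boxes = [
    [3.6806207, 3.2651825, 3.9601259, 3.7642479],
    [1.065011, 0.64031565, 1.3344085, 1.142953],
    [255.99077, 256.33829, 256.99084, 256.97342],
    [14.477366, 14.709455, 15.462966, 15.438332],
]

for name, boxes in (("SSD-VGG", vgg_boxes), ("SSD-ResNet101", resnet_boxes)):
    for box in boxes:
        status = "ok" if is_normalized(box) else "out of range"
        print(name, to_pixels(box, WIDTH, HEIGHT), status)
```

All VGG boxes pass the range check, while every ResNet101 box is reported as out of range.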
Thank you.