
Why is the mAP lower after distillation on VOC? #88


Description

@cmjkqyl

I trained and tested RetinaNet on the VOC dataset and found that the mAP of RetinaNet after training with FGD is actually lower than that of the original RetinaNet. It seems the dark knowledge has had a negative impact on the network.
A related observation: when I do not initialize the student network with pre-trained weights, FGD gives a significant improvement in accuracy (compared to the original RetinaNet, also trained without pre-training).
The teacher network is rx101 (ResNeXt-101); the student network is r50 (ResNet-50).
The configuration file and training log are as follows. I have modified train.py and detection_distiller.py according to MGD's files.

Can you give me some suggestions about that? Thanks!!

mmdet==2.18
mmcv-full==1.4.3
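
For context, the pre-training toggle I mentioned is the student backbone's `init_cfg` in the mmdet 2.x config. Below is a minimal sketch of that block (illustrative standard values, not my exact file); deleting the `init_cfg` line reproduces the "no pre-trained weights" setting:

```python
# Sketch of the student (ResNet-50) backbone section of an mmdet 2.x
# RetinaNet config. Values are the standard defaults, shown for illustration.
model = dict(
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch',
        # ImageNet-pretrained ResNet-50 weights; remove this line
        # to train the student from scratch.
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')))
```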
