Hi there,
Your work is great! I have some questions about the knowledge distillation (KD) process on the CIFAR10 dataset in the experiments section of your paper.
- How many CIFAR10-like images did you generate to reach the accuracies in Table 1 of your paper? We tried 3000 and 10000 generated images (DI, Resnet34, alpha_f = 10) with vanilla KD from Resnet34 to Resnet18, and reached only 25% and 55% validation accuracy.
- We encountered problems when trying ADI. The description of Table 1 says: "for ADI, we generate one new batch of images every 50 KD iterations and merge the newly generated images into the existing set of generated images". Could you please explain this in more detail? Does "50 KD iterations" mean 50 KD epochs? Does "one new batch of images" mean a batch of, say, 256 images that is merged into the existing generated dataset? Does the KD process have to pause and wait for the new batch to be generated every 50 KD iterations (or epochs, if I understand correctly)?
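To make the second question concrete, here is a minimal sketch of how I currently read that sentence. It is only my interpretation, not your confirmed procedure: it assumes an "iteration" is a single student update on one minibatch, that generation blocks KD, and that the batch size is 256. `generate_batch` and `run_kd` are hypothetical names, and the images are replaced by placeholder ids.

```python
import random


def generate_batch(batch_size, step):
    """Hypothetical stand-in for ADI image synthesis.

    Returns placeholder ids instead of real images.
    """
    return [(step, i) for i in range(batch_size)]


def run_kd(total_iters, regen_every=50, batch_size=256, seed=0):
    """Assumed schedule: grow the image pool by one batch every `regen_every` iterations."""
    rng = random.Random(seed)
    pool = generate_batch(batch_size, step=0)  # initial generated set
    regen_steps = []
    for it in range(1, total_iters + 1):
        # One KD iteration: sample a minibatch from the current pool.
        minibatch = rng.sample(pool, batch_size)
        # (the student update on `minibatch` would happen here)
        if it % regen_every == 0:
            # KD waits here while a new batch is synthesized, then the pool grows.
            pool.extend(generate_batch(batch_size, step=it))
            regen_steps.append(it)
    return len(pool), regen_steps


pool_size, regen_steps = run_kd(total_iters=200)
print(pool_size, regen_steps)
```

Under this reading, 200 KD iterations would trigger four generation steps and end with a pool of five batches. If "iteration" actually means epoch, or if generation runs in parallel with KD, the schedule would look quite different, which is why I am asking.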
Thanks