Hey, great work!
I was wondering what your evaluation pipeline looks like in more detail. Did you use the test sets of VFHQ and TalkingHead-1KH, or the entire datasets? Also, how did you compute your FID scores, and what was the base distribution? Any further details about the pipeline would be appreciated.
Thanks!