Hi,
This is a very interesting and impressive paper. After reading it multiple times, I still have one question stuck in my mind. U-Net has been adopted for many diffusion tasks. Is there a reason this work chose a transformer instead of a U-Net? If a U-Net were used as the backbone, would x-prediction plus v-loss also improve performance? Or would a U-Net need a different combination of prediction target and loss?
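To make the question concrete, here is a minimal sketch of what I mean by "x-prediction plus v-loss", written so that the backbone is just a callable and could be a transformer or a U-Net. Everything in it is my own assumption rather than the paper's code: the linear interpolation z_t = t*x + (1-t)*eps with t = 1 meaning clean data, the velocity target v = x - eps, the clipping constant `t_clip`, and the names `x_pred_v_loss` and `backbone`.

```python
import numpy as np

def x_pred_v_loss(backbone, x, t, eps, t_clip=0.05):
    """Hypothetical x-prediction + v-loss objective (illustrative only).

    backbone: callable (z_t, t) -> predicted clean data, same shape as x
    x:        clean data, shape (B, ...)
    t:        times in [0, 1), shape (B,); t = 1 would be clean data (assumed convention)
    eps:      Gaussian noise, same shape as x
    """
    # Reshape t so it broadcasts over all non-batch dimensions of x.
    t_b = t.reshape(-1, *([1] * (x.ndim - 1)))
    # Assumed linear interpolation between noise and data.
    z_t = t_b * x + (1.0 - t_b) * eps
    # Velocity target under that interpolation.
    v_target = x - eps
    # The network outputs an estimate of the clean data (x-prediction).
    x_hat = backbone(z_t, t)
    # Convert the x-prediction to a velocity; clip the denominator near t = 1.
    v_hat = (x_hat - z_t) / np.maximum(1.0 - t_b, t_clip)
    # Mean squared error measured in velocity space (v-loss).
    return np.mean((v_hat - v_target) ** 2)
```

With this framing, my question is whether swapping the `backbone` callable for a U-Net changes which prediction/loss pairing works best.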
Any insight into this would be really helpful.