Thanks for the great research and code sharing.
After reading the paper and using it in my research, I have a question.
There are two ways to implement the weighted loss.
Case 1) L = w_a * L_a + w_b * L_b + w_c * L_c
Case 2) L = L_a + w_b * L_b + w_c * L_c
In Case 2, the weight of L_a is fixed to 1. My guess is that w_b and w_c would then be learned as log_vars values relative to L_a.
In your paper and code, on the other hand, all weights (i.e., all log_vars) are learnable, as in Case 1.
Is there a particular reason to prefer Case 1? Would it be a problem if I used the style of Case 2?
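
To make the two cases concrete, here is a minimal sketch of what I mean. It assumes each weight is derived from a log-variance as w_i = exp(-log_var_i) and that a + log_var_i regularization term is added per task, which is my reading of the uncertainty-weighting formulation. The function names (`weighted_loss`, `case1`, `case2`) are hypothetical and not taken from the repository, and plain floats stand in for trainable parameters.

```python
import math

def weighted_loss(losses, log_vars):
    """Sum of exp(-log_var_i) * L_i + log_var_i over all tasks.

    Assumption: w_i = exp(-log_var_i), plus a log_var_i regularizer per task.
    """
    return sum(math.exp(-s) * l + s for l, s in zip(losses, log_vars))

def case1(losses, log_vars):
    # Case 1: every task has its own learnable log-variance.
    return weighted_loss(losses, log_vars)

def case2(losses, log_vars_rest):
    # Case 2: the first task's log-variance is pinned to 0, so w_a = 1;
    # only the remaining log-variances would be learnable.
    return weighted_loss(losses, [0.0] + list(log_vars_rest))

if __name__ == "__main__":
    task_losses = [1.0, 2.0, 3.0]  # illustrative values for L_a, L_b, L_c
    print(case1(task_losses, [0.0, 0.0, 0.0]))
    print(case2(task_losses, [0.0, 0.0]))
```

With all log-variances at 0, both functions return the plain sum of the losses. Under these assumptions they differ only in whether the first log-variance is allowed to move during training.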