Sounds like a lucky result that comes from a wrong formula derivation  #9

@edfall

I read the paper carefully, and I think the derivation in the paper is fundamentally wrong.

  • Under formulas (2) and (3), the probability output follows a Gaussian distribution. However, a probability cannot be Gaussian distributed, because it lies in [0,1] rather than (-infty, +infty).

  • Under the independence assumption (formula (4)) and the Gaussian distribution mentioned above, formula (7) is correct. However, look only at the first line of formula (7): if the independence assumption holds, then -log p(y1, y2|f(w,x)) = -log p(y1|f(w,x)) - log p(y2|f(w,x)), which is just the sum of the cross-entropy losses over the different tasks. This apparently contradicts the result obtained under the additional Gaussian assumption.

  • Somehow, the paper replaces the cross-entropy loss with MSE, which finally leads to the result that a task with a higher loss should have a higher theta weight. If the results reported in the paper are correct, I think the benefit here comes from loss re-balancing. Does that mean re-balancing the task losses benefits multi-task performance?
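
To make the first two points concrete, here is a minimal sketch in plain Python. It is not code from the paper or this repository: the logits, the labels, and the helper names (`softmax`, `cross_entropy`, `gaussian_mass_outside_unit_interval`) are all made up for illustration. It checks numerically that, under independence alone, the joint negative log-likelihood is exactly the sum of the per-task cross-entropy losses, and that a Gaussian always puts some probability mass outside [0,1].

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Cross-entropy of a single example: -log p(label | logits).
    return -math.log(softmax(logits)[label])

# Hypothetical outputs f(w, x) for two classification tasks (illustrative values).
logits_task1 = [2.0, 0.5, -1.0]
logits_task2 = [0.1, 1.2]
y1, y2 = 0, 1

# Independence assumption: p(y1, y2 | f) = p(y1 | f) * p(y2 | f).
joint_prob = softmax(logits_task1)[y1] * softmax(logits_task2)[y2]
joint_nll = -math.log(joint_prob)

# Plain, unweighted sum of the per-task cross-entropy losses.
sum_of_ce = cross_entropy(logits_task1, y1) + cross_entropy(logits_task2, y2)

def gaussian_mass_outside_unit_interval(mean, std):
    # Probability that a N(mean, std^2) variable falls outside [0, 1].
    def cdf(v):
        return 0.5 * (1.0 + math.erf((v - mean) / (std * math.sqrt(2.0))))
    return cdf(0.0) + (1.0 - cdf(1.0))

print("joint NLL        :", joint_nll)
print("sum of CE losses :", sum_of_ce)
print("mass outside [0,1]:", gaussian_mass_outside_unit_interval(0.5, 0.5))
```

The two loss values agree up to floating-point error, with no per-task weights appearing anywhere, and the mass outside [0,1] is strictly positive for any finite mean and positive standard deviation.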
