
About dropout step in the paper #6

@luzai


Thank you very much for sharing the whole implementation! I am curious about the dropout step in this figure. May I ask some questions?
image

  • As shown in the code here, I guess that dropout on the indicator is applied during the warmup steps (6255 steps), with a keep probability (rather than a drop probability) of 0.6. After 6255 steps, the keep probability is 100%, and the model chooses among 3x3, 5x5, exp=3, and exp=6 using the learned threshold. May I ask whether my interpretation is correct?
  • May I ask why the runtime term is larger during the dropout steps? And why does the runtime decrease so rapidly and so precisely to 79 ms? What would the curve look like if the hyperparameter lambda=0.02 were set to other values?
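
To make my interpretation in the first bullet concrete, here is a minimal sketch of the schedule as I understand it. It is not taken from the repository: the names `keep_probability` and `apply_indicator_dropout` are hypothetical, steps are assumed to be counted from 0, and I assume dropped indicators are simply set to zero without rescaling. Only the 6255 warmup steps and the 0.6 keep probability come from my reading of the code.

```python
import random

WARMUP_STEPS = 6255      # warmup length, as I read it from the code
WARMUP_KEEP_PROB = 0.6   # keep probability during warmup, as I read it


def keep_probability(step):
    """Hypothetical schedule: 0.6 during warmup, 1.0 (no dropout) afterwards."""
    return WARMUP_KEEP_PROB if step < WARMUP_STEPS else 1.0


def apply_indicator_dropout(indicators, step, rng):
    """Keep each indicator with probability keep_probability(step), else zero it.

    Assumption: no rescaling of the kept values; the real code may differ.
    """
    p = keep_probability(step)
    # rng.random() is in [0.0, 1.0), so with p == 1.0 every indicator is kept.
    return [x if rng.random() < p else 0.0 for x in indicators]


if __name__ == "__main__":
    rng = random.Random(0)
    indicators = [1.0] * 8
    print("warmup step:", apply_indicator_dropout(indicators, 100, rng))
    print("after warmup:", apply_indicator_dropout(indicators, 7000, rng))
```

If this matches the intended behaviour, then during warmup roughly 40% of the indicators are zeroed at each step, and after warmup the selection depends only on the learned threshold.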
