
Question about batch size #53

@lxysl

Hi, I am impressed by and interested in your outstanding work. However, I ran into a question about the batch size while experimenting with the code. When I increase the batch size to 128 or 256, the test accuracy decreases. I am unsure why this algorithm is affected by a larger batch size. In theory, the effect of the co-guess, co-refine, and MixMatch operations should not depend on the batch size. I can think of two possible reasons:

  1. The learning rate ought to increase proportionally with the batch size.
  2. I traced back to the original MixMatch implementation and found that it differs slightly from the MixMatch used in DivideMix. Between lines 253 and 262, the original code performs an interleave operation, which is not implemented in DivideMix. The author of MixMatch also explains the purpose of this operation in this issue, and it seems to be related to BatchNorm.
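
To make the two hypotheses concrete, here is a minimal Python sketch. It is not code from either repository: the function names are illustrative, plain lists stand in for tensors, and the learning-rate numbers in the usage note are made up. `scaled_lr` applies the linear scaling heuristic from point 1. `interleave` shows my understanding of the interleave idea from point 2: each batch is split into chunks and chunk `i` of the labeled batch is swapped with chunk `i` of the `i`-th unlabeled batch, so that every forward pass sees a mix of labeled and unlabeled samples.

```python
def scaled_lr(base_lr, base_batch, new_batch):
    """Linear scaling heuristic: grow the learning rate with the batch size."""
    return base_lr * new_batch / base_batch


def interleave_offsets(batch, nu):
    """Split `batch` samples into nu + 1 nearly equal chunks; return boundaries."""
    groups = [batch // (nu + 1)] * (nu + 1)
    # Hand the remainder to the last chunks, one extra sample each.
    for x in range(batch - sum(groups)):
        groups[-x - 1] += 1
    offsets = [0]
    for g in groups:
        offsets.append(offsets[-1] + g)
    return offsets


def interleave(xy, batch):
    """xy[0] is the labeled batch, xy[1:] are unlabeled batches of the same size.

    Swaps chunk i of xy[0] with chunk i of xy[i]. Applying the function twice
    restores the original order, so it can also undo the mixing on the outputs.
    """
    nu = len(xy) - 1
    offsets = interleave_offsets(batch, nu)
    chunks = [[v[offsets[p]:offsets[p + 1]] for p in range(nu + 1)] for v in xy]
    for i in range(1, nu + 1):
        chunks[0][i], chunks[i][i] = chunks[i][i], chunks[0][i]
    # With real tensors this step would be a concatenation along the batch axis.
    return [[item for chunk in row for item in chunk] for row in chunks]


if __name__ == "__main__":
    print(scaled_lr(0.02, 64, 128))  # illustrative numbers only
    batches = [["x0", "x1", "x2"], ["a0", "a1", "a2"], ["b0", "b1", "b2"]]
    mixed = interleave(batches, 3)
    print(mixed)
    print(interleave(mixed, 3) == batches)
```

With the toy input above, the first mixed batch contains one sample from each source, and a second call to `interleave` recovers the original batches. Whether the missing interleave actually explains the accuracy drop at batch size 128 or 256 is exactly what I am unsure about.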

Could you please share some explanation or insight on this question? I will also run some extra experiments on this topic and share the results soon. Thank you!
