BYOL

Title: Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
Publication: arXiv, 2020
Link: [paper] [code]

online & target networks

online → predicts the target network's representation of the same image under a different augmented view
target → updated with a slow-moving average of the online network's weights

without negative pairs

Method

target network → provides the regression targets used to train the online network

ξ [ksi] : target parameters, an exponential moving average of the online parameters θ
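
The moving-average update can be sketched as ξ ← τ·ξ + (1 − τ)·θ. The snippet below is a minimal illustration with NumPy, not the authors' implementation: the function name `ema_update`, the parameter dictionaries, and the decay value `tau=0.9` are assumptions chosen for readability.

```python
import numpy as np

def ema_update(target, online, tau=0.99):
    """Return new target parameters: xi <- tau * xi + (1 - tau) * theta.

    `target` and `online` are dicts mapping parameter names to arrays
    (a hypothetical layout used only for this sketch).
    """
    return {name: tau * target[name] + (1.0 - tau) * online[name]
            for name in target}

# Toy parameters standing in for network weights.
theta = {"w": np.array([1.0, 2.0])}   # online parameters
xi = {"w": np.array([0.0, 0.0])}      # target parameters

# One update step: the target moves a small fraction toward the online weights.
xi_new = ema_update(xi, theta, tau=0.9)
print(xi_new["w"])
```

A value of τ close to 1 makes the target change slowly, which is what "slow-moving average" refers to.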

image x → two augmented views v, v’
online network : v → y_θ (= f_θ(v)) → z_θ (= g_θ(y_θ))
target network : v’ → y’_ξ (= f_ξ(v’)) → z’_ξ (= g_ξ(y’_ξ))
output : prediction q_θ(z_θ) of z’_ξ
L2-normalize q_θ(z_θ) & z’_ξ → L_θ,ξ = mean squared error between the normalized vectors
symmetrize : (v’ → online network / v → target network) ⇒ compute L^~_θ,ξ
Total Loss = L_θ,ξ + L^~_θ,ξ
each training step → stochastic optimization step on the total loss w.r.t. θ only (ξ receives no gradient)
end of training → keep only the encoder f_θ
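
The loss steps above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: random arrays stand in for the network outputs q_θ(z_θ) and z’_ξ, and the helper names (`l2_normalize`, `regression_loss`, `byol_loss`) are hypothetical. NumPy has no autograd, so the "no gradient through the target" part is only noted in a comment.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def regression_loss(prediction, target):
    """Per-sample squared distance between L2-normalized vectors.

    For unit vectors this equals 2 - 2 * cosine_similarity.
    """
    p = l2_normalize(prediction)
    t = l2_normalize(target)
    return np.sum((p - t) ** 2, axis=-1)

def byol_loss(q_v, z_tgt_v_prime, q_v_prime, z_tgt_v):
    """Symmetrized loss: L (v online, v' target) + L~ (v' online, v target).

    In a real training loop the target projections are treated as
    constants, i.e. no gradient flows into the target network.
    """
    loss = regression_loss(q_v, z_tgt_v_prime)
    loss_tilde = regression_loss(q_v_prime, z_tgt_v)
    return np.mean(loss + loss_tilde)

# Stand-ins for network outputs: batch of 4 samples, 8-dim projections.
rng = np.random.default_rng(0)
q_v, z_tgt_v_prime = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
q_v_prime, z_tgt_v = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))

total = byol_loss(q_v, z_tgt_v_prime, q_v_prime, z_tgt_v)
print(total)
```

Because both vectors are normalized first, each term is bounded between 0 (same direction) and 4 (opposite directions), so the loss depends only on the angle between prediction and target, not on their magnitudes.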

Experiments

Deeper & Wider → better

Semi-supervised evaluation → use the same fixed splits of 1% and 10% of ImageNet labeled training data

Batch size & Image augmentations

Contrastive methods → rely heavily on color distortion
BYOL → keeps any information captured by the target representation in the online network to improve its predictions
does not rely on negative pairs → remains stable as the batch size decreases

Reference

@article{DBLP:journals/corr/abs-2006-07733,
  author    = {Jean-Bastien Grill and
              Florian Strub and
              Florent Altché and
              Corentin Tallec and
              Pierre H. Richemond and
              Elena Buchatskaya and
              Carl Doersch and
              Bernardo Avila Pires and
              Zhaohan Daniel Guo and
              Mohammad Gheshlaghi Azar and
              Bilal Piot and
              Koray Kavukcuoglu and
              Rémi Munos and
              Michal Valko},
  title     = {Bootstrap Your Own Latent : A New Approach to Self-Supervised Learning},
  journal   = {CoRR},
  volume    = {abs/2006.07733},
  year      = {2020},
  url       = {https://arxiv.org/abs/2006.07733},
  eprinttype = {arXiv},
  eprint    = {2006.07733},
  timestamp = {Thu, 10 Sep 2020 09:46:02 +0100},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2006-07733.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
