- Title: Big Self-Supervised Models are Strong Semi-Supervised Learners
- Publication: NeurIPS, 2020
- Link: [paper] [code]
- for semi-supervised learning via the task-agnostic use of unlabeled data
- the fewer the labels, the more benefit from a bigger model
- with the task-specific use of unlabeled data, the predictive performance improves and can be transferred into a smaller network
- a deeper projection head
- improves semi-supervised performance when fine-tuning from a middle layer of the projection head
- unlabeled data is used in a task-agnostic way
- for general representation via unsupervised pretraining
- general representations are adapted for a specific task via supervised fine-tuning
- unlabeled data is used in a task-specific way
- for improving predictive performance & obtaining a compact model
- train student networks on the unlabeled data with labels imputed by the fine-tuned teacher network
- summary: pretrain → fine-tune → distill
- increasing width & depth and using SK (selective kernels) → improved performance
- bigger models are more label-efficient
- gains → larger in the semi-supervised setting
- deeper projection head during pretraining is better
- fine-tuning from the first layer of the projection head is better than fine-tuning from its input (0th layer)
- for bigger ResNets, the improvements from a deeper projection head are smaller
- whether the student's architecture is smaller than or the same as the teacher's, distillation improves model efficiency
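The distill step above trains the student against the teacher's temperature-scaled soft labels on unlabeled data. Below is a minimal NumPy sketch of that objective; the function names and the toy logits are my own, not from the paper's code.

```python
import numpy as np

def soft_labels(logits, tau=1.0):
    # Temperature-scaled softmax P(y|x; tau), computed stably.
    z = logits / tau
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, tau=1.0):
    """Cross-entropy between the teacher's soft labels and the
    student's temperature-scaled predictions, averaged over the batch."""
    p_teacher = soft_labels(teacher_logits, tau)
    log_p_student = np.log(soft_labels(student_logits, tau))
    return -(p_teacher * log_p_student).sum(axis=-1).mean()

# Toy check: the loss is larger when the student disagrees with the teacher.
teacher = np.array([[2.0, 0.0, 0.0]])
agree = distillation_loss(teacher, teacher)
disagree = distillation_loss(teacher, np.array([[0.0, 2.0, 0.0]]))
```

With labeled data available, the paper also allows mixing this term with an ordinary cross-entropy on ground-truth labels; the sketch shows only the distillation term.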
@article{DBLP:journals/corr/abs-2006-10029,
author = {Ting Chen and
Simon Kornblith and
Mohammad Norouzi and
Geoffrey E. Hinton},
title = {Big Self-Supervised Models are Strong Semi-Supervised Learners},
journal = {CoRR},
volume = {abs/2006.10029},
year = {2020},
url = {https://arxiv.org/abs/2006.10029v2},
eprinttype = {arXiv},
eprint = {2006.10029v2},
timestamp = {Mon, 26 Oct 2020 03:09:28 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-2006-10029.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
