GitHub - atinghosh/NCE: noise contrastive estimation for representation learning

Variants of NCE for unsupervised embedding learning

This repository contains code about implementation of different variants of NCE for representation learning on CIFAR10 and STL10. Below we give description of all type of loss implemented. To render the math equations correctly use mathjax plugin for github from Chrome App Store.

We train a resnet50 model to output a 128 dimensional embedding vector for CIFAR10 with different variants of NCE losses, we get following nearest neighbors based on cosine distance of the embedding vector, of the leftmost image as the query image.

List of Losses

NCE - vanila NCE, with both positive and negative term
sim_CLR - Geoff Hinton's paper without the negative term in NCE, very similar to CPC loss
Spreading_instance_feature - This is taken from this paper, which also includes negative term in NCE loss
NCE_all_grad - We allow gradient through normalization constant, vanila version keeps it fixed
NCE_without_z - we omit the z term in vanila NCE
learnable_tau - we make the temparature learnable parameter, this doesn't work!
no_norm_const - NCE without the normalization constant, we replace the normalization constant by 1.
nm - NCE with negative mining, we do hard negative mining . So for every image in the batch, we include the 50th and 100th nearest neighbor also in the batch. This makes the batch size 3x the original batch size, accordingly we adjust the number of epochs.
no_mixup - We run without the mixup data augmentation.
rand_aug - we use Randaugment, with 2 randomly chosen augmentation from a list of 15 augmentations and intensity of each augmentation is chosen to be 9.
ae_pretrain - We pretrain our feature extractor resnet model (which extracts a feature of dimension 128 for CIFAR10) with an auto encoder (by adding decoder network) for 100 epochs and then start our original training with weights initialized from this pretrained auto encoder.

Two main variants of NCE

Vanila NCE-

where,

For Sim_CLR we do not use the second term or the negative term, and the loss becomes

$f_{\theta}(x)$ is 128 dimensional output from our resnet model for image $x$.

Usage

To train sim_CLR model in (2) we need to execute the following, this will create a folder name cifar_sim_clr in the log folder in the current directory and inside the log folder model and the log.txt will be saved. log.txt will have information about knn accuracy and lp accuracy for every epoch.

python NCE/CIFAR_unsup.py --name cifar_sim_clr --gpu 0,1 --approach sim_CLR

To resume training from the last epoch use

python NCE/CIFAR_unsup.py --resume cifar_sim_clr --gpu 0,1 --approach sim_CLR

Note that we need to use same name we chose in the --name argument for the --resume argument too.

--name : name of the experiment
--resume : resume flag for the experiment, takes the name of the experiment from --name as argument

--approach : name of the loss to be used with one of following choices as described in the list of loss section.

choices=["max_margin",
        "N_pairs_soft_plus",
        "NCE",
        "spreading_instance_feature",
        "NCE_all_grad",
        "NCE_without_z",
        "learnable_tau",
        "sim_CLR",
        "no_norm_const"],

--gpu : gpu to be used for the experiment
--no_mixup : turns off mixup augmentation, by default mixup will be used.
--mixup_uniform : Instead of using weight which is close to 1, we use weight drawn from Uniform(0,1) - Note it doesn't work.
--rand_aug : turn on randaugment, default is False
--nm : turn on negative mining, by default it is False.
--ae_pretrain - turn on auto encoder pretraining, by default it is False
--nb_epoch_ae : if the ae_pretrain flag is True, then how many epochs of autoencoder pretraining need to be run
--ae_lr : learning rate for autoencoder training, default is .001
--dim : dimension of the embedding, default is 128 for CIFAR
--batch-size : default batch size is 512 (batch size needs to be tuned with learning rate)
--lr : default learning rate is .03
--n-epoch : default no. of epoch is 2000
--n-label : no. of labeled images for label propagation for SSL, default is 250
--n-worker : no of worker CPU thread to use for data loader, default is 15
--dataset : "cifar": cifar-10 dataset, "stl": stl-10 dataset. default is cifar
--log-dir : location of log directory relative to current directory, default is log/. The results and model object will be saved in this location inside the name of a folder with same name as passed in the argument of --name.

Name		Name	Last commit message	Last commit date
Latest commit History 26 Commits
NCE		NCE
Unsupervised_Embedding_Learning		Unsupervised_Embedding_Learning
fig		fig
labels		labels
notebooks		notebooks
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Variants of NCE for unsupervised embedding learning

List of Losses

Two main variants of NCE

Usage

Citation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Variants of NCE for unsupervised embedding learning

List of Losses

Two main variants of NCE

Usage

Citation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages