Pretrained Embeddings Collapse #157

@quannguyenminh103

Description

Hello,
I’m trying to reproduce your Supervised Contrastive Learning (SupCon) training on CIFAR100. I understand that reaching the reported accuracy involves two training stages: contrastive pretraining followed by training a linear classifier. I’m particularly interested in the model obtained after the first stage.
My expectation is that, even without training the linear classifier, embeddings of samples from the same class should cluster closely together. I trained the model using your provided code and default CIFAR100 settings from the GitHub README (no modifications).
However, after obtaining the pretrained model, I extracted embeddings for the CIFAR100 dataset and computed the cosine similarity matrix. Surprisingly, the similarity values are almost all close to 1, even between different classes, which suggests that the embeddings may have collapsed. I also visualized the embeddings with t-SNE (see the sketch after the code below): the classes are scattered everywhere, with no visible clusters or structure. By contrast, an ImageNet-pretrained ResNet50 does place some same-class points close to each other (though the separation is still poor).
Additionally, the linear-classifier results are not consistent. Despite using the exact same configuration, repeated runs give noticeably different accuracies (e.g., 86.7% in one run and 79.85% in another), both of which differ from the reported results. Is that expected?
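
For reference, this is the kind of seeding I would add before the linear stage to rule out plain RNG differences between runs (a minimal PyTorch sketch; `set_seed` is my own helper, not something from the repo):

```python
import random

import numpy as np
import torch


def set_seed(seed: int = 42):
    """Pin the common RNGs so repeated runs start from the same state."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade some speed for deterministic cuDNN convolution kernels
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

Even with all of this pinned, some CUDA ops stay nondeterministic, so a little run-to-run variance is normal; a nearly 7-point accuracy gap seems larger than RNG noise alone should cause, though.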

Here is the code I used for plotting the similarity matrix:

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import torch
from tqdm import tqdm


def get_embeddings_labels(loader, model, device):
    embeddings = []
    labels = []
    model.eval()
    with torch.inference_mode():
        for imgs, lbls in tqdm(loader, desc="Extracting features"):
            imgs = imgs.to(device, non_blocking=True)
            feats = model(imgs)
            if isinstance(feats, (list, tuple)):
                feats = feats[0]
            embeddings.append(feats.cpu().numpy())
            labels.extend(lbls.numpy().tolist())
    embeddings = np.concatenate(embeddings, axis=0)
    labels = np.array(labels)
    return embeddings, labels


# `loader`, `model`, and `device` come from my evaluation setup
test_embeddings, test_labels = get_embeddings_labels(loader, model, device)

# L2-normalize rows so that the dot products below are cosine similarities
# (the SupCon head should already output unit-norm features; this makes it explicit)
test_embeddings /= np.linalg.norm(test_embeddings, axis=1, keepdims=True)

# Group embeddings by class (CIFAR100 has 100 classes)
embeddings = {}
for i in range(100):
    embeddings[i] = test_embeddings[test_labels == i]

sim_matrix = np.zeros((100, 100))

# Diagonal: mean within-class similarity, excluding self-pairs
for i in range(100):
    sims = embeddings[i] @ embeddings[i].T
    np.fill_diagonal(sims, np.nan)  # ignore self-similarity
    sim_matrix[i, i] = np.nanmean(sims)

# Off-diagonal: mean between-class similarity (upper triangle, then mirrored)
for i in range(100):
    for j in range(i + 1, 100):
        cur_sim = np.mean(embeddings[i] @ embeddings[j].T)
        sim_matrix[i, j] = cur_sim
        sim_matrix[j, i] = cur_sim

plt.figure(figsize=(8, 6))
sns.heatmap(sim_matrix, vmin=0.0, vmax=1.0, annot=True, cmap="YlOrRd", linewidths=0.5)
plt.title("Trained Network Cosine Similarity Matrix")
plt.show()
```
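
And this is roughly how I produced the t-SNE plot mentioned above (a sketch using scikit-learn; `test_embeddings` and `test_labels` come from the extraction code above, and the perplexity/seed are arbitrary choices):

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Project the (normalized) embeddings down to 2-D for visualization
proj = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(test_embeddings)

plt.figure(figsize=(8, 6))
plt.scatter(proj[:, 0], proj[:, 1], c=test_labels, s=2, cmap="tab20")
plt.colorbar(label="class id")
plt.title("t-SNE of Pretrained SupCon Embeddings")
plt.show()
```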

Do I need to normalize anything, or am I missing a recent update?
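
In case it helps, this is the sanity check I would expect to pass if the projection head already L2-normalizes its output (`raw_embeddings` here is a stand-in for the features exactly as the model returns them, before the explicit normalization line in my script):

```python
import numpy as np

# If the SupCon head normalizes its output, every row norm should be ~1.0
norms = np.linalg.norm(raw_embeddings, axis=1)
print(f"norms: min={norms.min():.4f} max={norms.max():.4f} mean={norms.mean():.4f}")
```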
