The code in this repository demonstrates knowledge distillation from a Vision Transformer (ViT) to a Convolutional Neural Network (CNN).
Knowledge distillation is a technique that transfers the knowledge of a large pretrained model (the "teacher") to a smaller model (the "student").
In this repository, the teacher is a ViT (DINOv3) and the student is a CNN, which makes this a heterogeneous feature distillation. An additional feature projector is needed to align the CNN's spatial feature map with the ViT's token-based features; a sketch follows below.
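The snippet below is a minimal sketch of such a projector, not the repository's actual implementation. The module name `FeatureProjector`, the feature shapes, and the MSE alignment loss are all illustrative assumptions: it maps the CNN's `(B, C, H, W)` feature map into the ViT's `(B, N, D)` token layout with a 1x1 convolution so the two can be compared directly.

```python
# Hedged sketch: assumes the CNN emits a (B, C, H, W) feature map and the
# DINOv3 teacher emits (B, N, D) patch tokens with N == H * W.
import torch
import torch.nn as nn

class FeatureProjector(nn.Module):
    """Projects CNN spatial features into the teacher's token space (illustrative)."""
    def __init__(self, cnn_channels: int, vit_dim: int):
        super().__init__()
        # 1x1 convolution maps the CNN channel dimension to the ViT embedding dim
        self.proj = nn.Conv2d(cnn_channels, vit_dim, kernel_size=1)

    def forward(self, cnn_feat: torch.Tensor) -> torch.Tensor:
        x = self.proj(cnn_feat)            # (B, D, H, W)
        x = x.flatten(2).transpose(1, 2)   # (B, H*W, D) -> token layout
        return x

# Example alignment step with assumed shapes:
student_feat = torch.randn(4, 512, 14, 14)   # CNN feature map
teacher_tokens = torch.randn(4, 196, 768)    # DINOv3 patch tokens
projector = FeatureProjector(cnn_channels=512, vit_dim=768)
loss = nn.functional.mse_loss(projector(student_feat), teacher_tokens)
```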
Selecting the distillation dataset involves balancing generalization and task relevance.
Since the student network is also pretrained on a large dataset (e.g. ImageNet), the intuition is that it already has strong generalization capability.
The distillation dataset therefore comprises a small amount of general data and a larger portion of task-specific data, as sketched below.
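The following is a minimal sketch of how such a mixed dataset could be assembled. The `general_dataset` and `task_dataset` objects and the `general_fraction=0.1` split are hypothetical placeholders, not values taken from the repository.

```python
# Hedged sketch: mix a small random slice of general data with the full
# task-specific set; the 10% fraction is purely illustrative.
import torch
from torch.utils.data import ConcatDataset, Subset

def build_distillation_dataset(general_dataset, task_dataset, general_fraction=0.1):
    """Combine a subset of general data with all task-specific data."""
    n_general = int(len(general_dataset) * general_fraction)
    indices = torch.randperm(len(general_dataset))[:n_general].tolist()
    general_subset = Subset(general_dataset, indices)
    return ConcatDataset([general_subset, task_dataset])
```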