PXY2202/Deep-Learning-CS230

 
 

For this project, we applied a Vision Transformer (ViT) model to a multi-label classification task on retinal fundus images. Class imbalance and the small size of the dataset were the biggest obstacles, so proper data augmentation was key to achieving better performance. We observed that the Vision Transformer outperformed our baseline model, ResNet V2.0. See the full report for more information.
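As a rough illustration of how augmentation can be targeted at under-represented labels in a multi-label setting, here is a minimal sketch. It is not taken from the notebooks in this repository: the function names (`label_frequencies`, `augmentation_counts`, `horizontal_flip`), the `max_copies` cap, and the heuristic itself are assumptions made for the example. Images are represented as plain nested lists to keep it dependency-free.

```python
from typing import List


def label_frequencies(labels: List[List[int]]) -> List[float]:
    """Fraction of samples that are positive for each label (rows are multi-hot)."""
    n_samples = len(labels)
    n_labels = len(labels[0])
    return [sum(row[j] for row in labels) / n_samples for j in range(n_labels)]


def augmentation_counts(labels: List[List[int]], max_copies: int = 5) -> List[int]:
    """Hypothetical heuristic: how many extra augmented copies to make per sample.

    A sample whose rarest positive label is k times less frequent than the
    most common label gets about k - 1 extra copies, capped at max_copies.
    """
    freqs = label_frequencies(labels)
    most_common = max(freqs)
    counts = []
    for row in labels:
        positive = [freqs[j] for j, y in enumerate(row) if y == 1]
        if not positive:
            # Samples with no positive label are left as they are.
            counts.append(0)
            continue
        ratio = most_common / min(positive)  # always >= 1
        counts.append(min(max_copies, int(ratio) - 1))
    return counts


def horizontal_flip(image: List[List[int]]) -> List[List[int]]:
    """One simple augmentation: mirror each pixel row left to right."""
    return [list(reversed(row)) for row in image]


if __name__ == "__main__":
    # Toy multi-hot labels: label 0 is common, label 1 is rare.
    toy_labels = [[1, 0], [1, 0], [1, 0], [0, 1]]
    print(label_frequencies(toy_labels))
    print(augmentation_counts(toy_labels))
    print(horizontal_flip([[1, 2, 3], [4, 5, 6]]))
```

In practice the extra copies would be produced with a mix of transforms (flips, rotations, colour jitter) rather than a single flip; the full report describes which augmentations were actually used.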

About

ViT and ResNet models on multi-label classification task

Languages

  • Jupyter Notebook 100.0%