GilbertHarijanto/FoodVision

FoodVision Projects Overview

FoodVision

Welcome to FoodVision – a suite of mobile-friendly food classification projects that leverage state-of-the-art deep learning models for efficient and accurate image recognition. This repository contains three related projects:

  • FoodVision Big: An EfficientNetB2 model trained on 101 food classes from the Food-101 dataset.
  • FoodVision Mini: A lightweight EfficientNetB2 model, optimized for speed and accuracy, trained on a 3-class subset (Pizza, Steak, Sushi).
  • ViT Paper Replicating: A replication of the Vision Transformer (ViT) architecture on a 3-class subset of Food-101, following the original "An Image is Worth 16x16 Words" paper.
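To make the "16x16 words" idea concrete, here is a minimal sketch of the patch arithmetic behind a ViT input. The 224x224 RGB input size is an assumption taken from the common ViT-Base/16 setup, not something this README states, and `vit_patch_stats` is a hypothetical helper name.

```python
def vit_patch_stats(image_size: int = 224, patch_size: int = 16, channels: int = 3) -> dict:
    """Return the token counts a ViT would see for a square image.

    Defaults (224x224 RGB, 16x16 patches) are assumed for illustration.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")

    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2                 # one "word" per patch
    patch_dim = patch_size * patch_size * channels      # flattened pixels per patch
    return {
        "num_patches": num_patches,
        "patch_dim": patch_dim,
        "sequence_length": num_patches + 1,             # +1 for the class token
    }


print(vit_patch_stats())
# {'num_patches': 196, 'patch_dim': 768, 'sequence_length': 197}
```

Under these assumptions each image becomes a sequence of 196 patch tokens plus one class token, which is what the transformer encoder processes.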

Check out our interactive demo on Hugging Face Spaces.

Each project has its own README with detailed instructions for training, evaluation, and deployment.


Common Highlights

  • Model Architecture: FoodVision Big and FoodVision Mini use EfficientNetB2 for its balance of speed, size (~30 MB), and accuracy. The ViT project uses the Vision Transformer architecture.
  • Performance:
    • FoodVision Mini achieves approximately 96.88% accuracy on the 3-class subset (Pizza, Steak, Sushi).
    • FoodVision Big reaches around 65% accuracy on the full Food-101 dataset (101 classes).
  • Deployment: All models are designed to be lightweight and mobile-friendly, with interactive demos built using Gradio.
  • Requirements:
    • PyTorch
    • TorchVision
    • torchinfo
    • matplotlib
    • pillow
    • tqdm
    • requests
    • gradio
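The highlights above can be tied together in a short sketch: an EfficientNetB2 with a replaced classifier head, a prediction function, and a Gradio interface around it. This is an illustration under assumptions, not the repository's actual code. The helper names (`to_label_dict`, `create_effnetb2`, `predict`, `build_demo`), the frozen backbone, and the lowercase class names are all hypothetical. Heavy libraries are imported inside the functions so the pure-Python helper can be used on its own.

```python
from typing import Dict, Sequence

# Assumed label order for FoodVision Mini; the real order depends on the training data.
CLASS_NAMES = ["pizza", "steak", "sushi"]


def to_label_dict(probs: Sequence[float], class_names: Sequence[str]) -> Dict[str, float]:
    """Pair class names with probabilities, the dict shape gradio.Label accepts."""
    if len(probs) != len(class_names):
        raise ValueError("probs and class_names must have the same length")
    return {name: float(p) for name, p in zip(class_names, probs)}


def create_effnetb2(num_classes: int):
    """Build an EfficientNetB2 with a new head (assumed setup: frozen backbone)."""
    import torch
    import torchvision

    weights = torchvision.models.EfficientNet_B2_Weights.DEFAULT
    model = torchvision.models.efficientnet_b2(weights=weights)

    # Freeze the pretrained layers so only the new head would be trained.
    for param in model.parameters():
        param.requires_grad = False

    # EfficientNetB2's final feature vector has 1408 channels.
    model.classifier = torch.nn.Sequential(
        torch.nn.Dropout(p=0.3, inplace=True),
        torch.nn.Linear(in_features=1408, out_features=num_classes),
    )
    return model, weights.transforms()


def predict(image, model, transforms, class_names: Sequence[str] = CLASS_NAMES):
    """Run one PIL image through the model and return per-class probabilities."""
    import torch

    model.eval()
    with torch.inference_mode():
        logits = model(transforms(image).unsqueeze(0))  # add a batch dimension
        probs = torch.softmax(logits, dim=1).squeeze(0).tolist()
    return to_label_dict(probs, class_names)


def build_demo(model, transforms):
    """Wrap predict() in a Gradio interface; call .launch() on the result."""
    import gradio as gr

    return gr.Interface(
        fn=lambda img: predict(img, model, transforms),
        inputs=gr.Image(type="pil"),
        outputs=gr.Label(num_top_classes=3),
    )
```

With the listed requirements installed, `build_demo(*create_effnetb2(num_classes=3)).launch()` would start a local demo. The new head is untrained at that point, so predictions are only meaningful after training it or loading trained weights.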

Enjoy exploring and feel free to contribute or deploy these models on your device!
