gauthiii/fineTunedBLIP

Image Captioning using Hugging Face Transformers

This project builds an image captioning pipeline with the Hugging Face Transformers library, using the BLIP (Bootstrapping Language-Image Pre-training) model to generate descriptive captions for images. Images are loaded and converted with Pillow, and the model's AutoProcessor turns each image (and optional caption text) into model-ready tensors. The stack of transformers, Pillow, and PyTorch yields a simple, automated image-to-text workflow and serves as hands-on practice with transformer-based vision-language models.

Initial Caption Generated: the image of a cat and a dog

Caption Generated after Fine-Tuning: the image of tom and jerry from tom and jerry show
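A caption like the "before" one above can be produced with a few lines of inference code. This is a minimal sketch, not the repo's exact script; it assumes the public Salesforce/blip-image-captioning-base checkpoint, and uses a blank placeholder image so the snippet is self-contained (in practice you would load a frame with `Image.open(path).convert("RGB")`).

```python
from PIL import Image
from transformers import AutoProcessor, BlipForConditionalGeneration

# Assumed checkpoint: the public base BLIP captioning model.
checkpoint = "Salesforce/blip-image-captioning-base"
processor = AutoProcessor.from_pretrained(checkpoint)
model = BlipForConditionalGeneration.from_pretrained(checkpoint)

# Placeholder image; replace with Image.open("frame.jpg").convert("RGB").
image = Image.new("RGB", (384, 384), "white")

# AutoProcessor converts the PIL image into pixel_values tensors.
inputs = processor(images=image, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
caption = processor.decode(out[0], skip_special_tokens=True)
print(caption)
```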

About

Fine-tuned the BLIP model to accurately caption images of Tom and Jerry.
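The fine-tuning step can be sketched as a standard supervised loop: pair each frame with its target caption, pass both through the processor, and minimize the language-modeling loss. The dataset, hyperparameters, and placeholder image below are illustrative assumptions, not the repo's actual training script.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, BlipForConditionalGeneration

checkpoint = "Salesforce/blip-image-captioning-base"  # assumed base checkpoint
processor = AutoProcessor.from_pretrained(checkpoint)
model = BlipForConditionalGeneration.from_pretrained(checkpoint)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Illustrative (image, caption) pair; real training would use show frames.
image = Image.new("RGB", (384, 384), "white")
caption = "the image of tom and jerry from tom and jerry show"

model.train()
for step in range(2):  # a couple of steps for demonstration
    # Processor tokenizes the caption and converts the image to tensors.
    inputs = processor(images=image, text=caption, return_tensors="pt")
    # BLIP computes a language-modeling loss when labels are provided.
    outputs = model(**inputs, labels=inputs["input_ids"])
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

After enough passes over the Tom and Jerry pairs, generation with the updated weights produces the show-specific caption shown above instead of the generic "cat and a dog" one.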
