Developed an image captioning pipeline with the Hugging Face Transformers library to generate descriptive captions for images using the BLIP (Bootstrapping Language-Image Pre-training) model. Implemented preprocessing with Pillow for image loading and RGB conversion, and prepared model inputs with AutoProcessor so images were resized, normalized, and tokenized in a format the model accepts. Built the solution on transformers, Pillow, and PyTorch to automate image-to-text generation end to end. Gained hands-on experience with transformer-based vision-language models and their practical application to image captioning.
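The sketch below shows one way the inference side of such a pipeline can be wired together with Transformers, Pillow, and PyTorch. The checkpoint name and image path are illustrative assumptions (any BLIP captioning checkpoint on the Hugging Face Hub would work the same way); this is a minimal sketch, not the exact project code.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, BlipForConditionalGeneration

# Assumed checkpoint: the public BLIP base captioning model on the Hub
model_name = "Salesforce/blip-image-captioning-base"
processor = AutoProcessor.from_pretrained(model_name)
model = BlipForConditionalGeneration.from_pretrained(model_name)
model.eval()

def caption_image(image_path: str) -> str:
    # Pillow decodes the file; convert to RGB so the processor gets 3 channels
    image = Image.open(image_path).convert("RGB")
    # AutoProcessor resizes/normalizes the image and returns PyTorch tensors
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=30)
    # Decode the generated token ids back into a plain-text caption
    return processor.decode(output_ids[0], skip_special_tokens=True)

print(caption_image("example.jpg"))  # hypothetical input image
```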
Initial Caption Generated: the image of a cat and a dog
Caption Generated after Fine-Tuning: the image of tom and jerry from tom and jerry show
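A fine-tuned caption like the one above typically comes from continuing training on a small set of domain-specific image-caption pairs. The following is a minimal sketch of such a loop, assuming a hypothetical list of local image paths and target captions (the actual dataset, hyperparameters, and training setup are not specified in the source).

```python
import torch
from PIL import Image
from transformers import AutoProcessor, BlipForConditionalGeneration

model_name = "Salesforce/blip-image-captioning-base"  # assumed base checkpoint
processor = AutoProcessor.from_pretrained(model_name)
model = BlipForConditionalGeneration.from_pretrained(model_name)
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Hypothetical fine-tuning pairs; replace with the real annotated dataset
pairs = [
    ("frames/tom_and_jerry_001.jpg", "tom and jerry from the tom and jerry show"),
]

for epoch in range(3):
    for image_path, caption in pairs:
        image = Image.open(image_path).convert("RGB")
        # The processor preprocesses the image and tokenizes the target caption
        inputs = processor(images=image, text=caption, return_tensors="pt")
        # Passing the caption ids as labels yields a language-modeling loss
        outputs = model(**inputs, labels=inputs["input_ids"])
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```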