Flickr8k Dataset

The Flickr8k dataset is a benchmark dataset widely used for research in image captioning and multimodal deep learning. This repository provides both the image and text annotation components of the dataset, ready for use in computer vision and natural language processing projects.

Contents

1. Flickr8k_Dataset.zip

Images: 8,092 JPEG photos in a single folder (Flicker8k_Dataset/)
Size: ~1GB
Source: Selected from six different Flickr groups, designed to capture a broad variety of everyday scenes and situations
Content:
- No famous people or locations
- Diverse, realistic scenarios
Structure: All images (train, validation, and test) are stored together in one folder; the splits are defined by the filename lists in Flickr8k_text.zip

2. Flickr8k_text.zip

Size: ~2.2MB
Components:
- Flickr8k.token.txt: Image IDs paired with 5 human-written captions per image
- Flickr_8k.trainImages.txt: Filenames for training images
- Flickr_8k.testImages.txt: Filenames for test images
- Flickr_8k.devImages.txt: Filenames for validation (dev) images
- ExpertAnnotations.txt: Expert ratings of how well a caption describes an image, given for selected image-caption pairs
- Additional files: Various metadata and annotation resources
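As an illustration of how the caption file can be consumed, here is a minimal sketch that groups captions by image. It assumes each line of Flickr8k.token.txt has the form image filename, "#", caption index, a tab, then the caption text; that layout is an assumption to verify against your copy of the file. The function names parse_caption_lines and load_captions are hypothetical helpers, not part of the dataset.

```python
from collections import defaultdict
from pathlib import Path


def parse_caption_lines(lines):
    """Group captions by image filename.

    Assumed line layout: '<image>.jpg#<index>', a tab, then the caption.
    """
    captions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, sep, caption = line.partition("\t")
        if not sep:
            continue  # skip lines that do not match the assumed layout
        image_name = key.split("#", 1)[0]  # drop the '#<index>' suffix
        captions[image_name].append(caption.strip())
    return dict(captions)


def load_captions(token_path="Flickr8k.token.txt"):
    """Read the caption file from disk and group captions by image."""
    with Path(token_path).open(encoding="utf-8") as handle:
        return parse_caption_lines(handle)


# Example (after extraction):
# captions = load_captions()
# print(len(captions), "images with captions")
```

Keeping the parsing separate from the file reading makes it easy to check the logic on a few sample lines before running it on the full file.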

How to Use

Download and extract the dataset by running the cell below in a Jupyter or Colab notebook (the leading ! runs a shell command):

# Download archives from the repository releases
print("Downloading Flickr8K dataset...")
!wget -q https://github.com/Avaneesh40585/Flickr8k-Dataset/releases/download/v1.0/Flickr8k_Dataset.zip  # Images
!wget -q https://github.com/Avaneesh40585/Flickr8k-Dataset/releases/download/v1.0/Flickr8k_text.zip     # Text annotations

# Extract to current directory
print("Extracting dataset...")
!unzip -qq Flickr8k_Dataset.zip
!unzip -qq Flickr8k_text.zip

# Clean up to save disk space
print("Cleaning up zip files...")
!rm Flickr8k_Dataset.zip Flickr8k_text.zip

print("Dataset setup complete!")
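Outside a notebook, the same download, extract, and clean-up steps can be approximated with the Python standard library alone. The sketch below reuses the release URLs from the commands above; the function names fetch_flickr8k and extract_and_remove are hypothetical, and the code assumes the release assets stay available at those URLs.

```python
import urllib.request
import zipfile
from pathlib import Path

# Release location and archive names taken from the wget commands above.
BASE_URL = "https://github.com/Avaneesh40585/Flickr8k-Dataset/releases/download/v1.0"
ARCHIVES = ["Flickr8k_Dataset.zip", "Flickr8k_text.zip"]


def extract_and_remove(zip_path, dest_dir="."):
    """Unpack a zip archive into dest_dir, then delete the archive."""
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    zip_path.unlink()  # free disk space once extraction succeeded


def fetch_flickr8k(dest_dir="."):
    """Download both release archives, extract them, and remove the zips."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for name in ARCHIVES:
        target = dest / name
        urllib.request.urlretrieve(f"{BASE_URL}/{name}", str(target))
        extract_and_remove(target, dest)


# Example: fetch_flickr8k(".")
```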

After extraction, the images are in the Flicker8k_Dataset/ folder (the folder name is spelled "Flicker", unlike the archive name) and the text files are in your working directory. Use the split files to separate the images into training, validation, and test sets.
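One way to apply the split files is sketched below. It assumes each split file lists one image filename per line and that those filenames match the files in Flicker8k_Dataset/; read_split and image_paths_for_split are hypothetical helper names.

```python
from pathlib import Path


def read_split(split_path):
    """Return the set of image filenames listed in a split file.

    Assumes one filename per line; blank lines are ignored.
    """
    with Path(split_path).open(encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def image_paths_for_split(split_path, image_dir="Flicker8k_Dataset"):
    """Return sorted paths of the split's images that exist in image_dir."""
    root = Path(image_dir)
    names = read_split(split_path)
    return sorted(root / name for name in names if (root / name).is_file())


# Example (after extraction):
# train_paths = image_paths_for_split("Flickr_8k.trainImages.txt")
# dev_paths = image_paths_for_split("Flickr_8k.devImages.txt")
# test_paths = image_paths_for_split("Flickr_8k.testImages.txt")
```

Filenames listed in a split file but absent from the image folder are silently dropped here; compare the result length against the split file if you want to detect that case.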

Attribution

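Original dataset and descriptions: University of Illinois at Urbana-Champaign, created for image captioning research.
If you publish work that uses this dataset, please cite the original Flickr8k paper: M. Hodosh, P. Young, and J. Hockenmaier, "Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics," Journal of Artificial Intelligence Research, 2013.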

License

Check the original dataset documentation for terms of use and licensing conditions. Typically, usage is permitted for academic and non-commercial research.

Dataset Structure Example

├── Flicker8k_Dataset/         # JPEG images (8,092 total)
│   ├── 1000268201_693b08cb0e.jpg
│   └── ...
├── Flickr8k.token.txt         # Main caption file
├── Flickr_8k.trainImages.txt  # Training split
├── Flickr_8k.testImages.txt   # Test split
├── Flickr_8k.devImages.txt    # Validation split
└── ExpertAnnotations.txt      # Expert ratings of image-caption pairs
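To confirm that an extracted copy matches the layout above, a small check along these lines may help. It is a sketch that only looks for the entries named in the tree; missing_entries and count_images are hypothetical helper names, and the ".jpg" extension is assumed from the example filename.

```python
from pathlib import Path

# Entries taken from the directory tree above.
EXPECTED_FILES = [
    "Flickr8k.token.txt",
    "Flickr_8k.trainImages.txt",
    "Flickr_8k.testImages.txt",
    "Flickr_8k.devImages.txt",
    "ExpertAnnotations.txt",
]
IMAGE_DIR = "Flicker8k_Dataset"


def missing_entries(root="."):
    """List expected files or folders that are absent under root."""
    root = Path(root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    if not (root / IMAGE_DIR).is_dir():
        missing.append(IMAGE_DIR + "/")
    return missing


def count_images(root="."):
    """Count the .jpg files directly inside the image folder."""
    return sum(1 for _ in (Path(root) / IMAGE_DIR).glob("*.jpg"))


# Example (after extraction):
# print("Missing:", missing_entries("."))
# print("Images found:", count_images("."))
```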

Happy researching!
