Introduction

This is a PyTorch implementation of the paper "Noise Self-Correction via Relation Propagation for Robust Cross-Modal Retrieval". Our method, GLP, is built on top of CLIP in PyTorch for end-to-end image-text matching.

Requirements

  • python 3.8
  • cuda 11.7
  • torch 2.0.1+cu117
pip install -r requirements.txt

Data Preparation

Split Dataset

We conducted experiments on three datasets: MSCOCO, Flickr30K, and CC120K. Following SCAN, we split the image-text pairs of MSCOCO and Flickr30K into training, validation, and testing sets.

  • MSCOCO. We unified the image filename format of the MSCOCO dataset for easier use. You can use dataset/MSCOCO_rename.py to rename the images in MSCOCO. (MSCOCO_2014)
  • Flickr30K. (Flickr30K).
  • CC95K. We tested the proposed method on the real-world dataset Conceptual Captions. Since the full dataset is too large, we randomly selected a subset, named CC95K, with 95,656 images for training, 1,000 for validation, and 1,000 for testing. The image data will be uploaded soon.
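The exact naming scheme is defined in dataset/MSCOCO_rename.py; as an illustration only, a minimal renaming sketch, assuming "unified" means stripping the MSCOCO split prefix (e.g. COCO_train2014_000000000009.jpg → 000000000009.jpg), could look like:

```python
import os

def unify_mscoco_names(image_dir, dry_run=True):
    """Rename MSCOCO images to a unified format.

    Hypothetical scheme (the real one lives in dataset/MSCOCO_rename.py):
    drop the split prefix so the filename is just the zero-padded image id.
    """
    for name in os.listdir(image_dir):
        for prefix in ("COCO_train2014_", "COCO_val2014_"):
            if name.startswith(prefix):
                src = os.path.join(image_dir, name)
                dst = os.path.join(image_dir, name[len(prefix):])
                if not dry_run:
                    os.rename(src, dst)
                break
```

Running with `dry_run=True` first is a cheap way to verify the mapping before touching any files.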

Construct Noisy Datasets

We constructed noise by randomly shuffling some captions of the images.

You can obtain your noisy dataset using construct_noise.py.

Since there are around 3%-20% incorrect annotations existing in the real-world dataset Conceptual Captions, we did not create noisy samples manually.
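As a rough sketch of this shuffling idea (illustrative only, not the repository's construct_noise.py; the function name and the permute-among-selected strategy are assumptions), a `noise_ratio` fraction of captions can be mismatched like so:

```python
import random

def make_noisy_captions(captions, noise_ratio, seed=0):
    """Return a copy of `captions` in which a `noise_ratio` fraction of
    entries has been shuffled among themselves, simulating mismatched
    image-text annotations."""
    rng = random.Random(seed)
    noisy = list(captions)
    num_noisy = int(len(captions) * noise_ratio)
    chosen = rng.sample(range(len(captions)), num_noisy)
    permuted = chosen[:]
    rng.shuffle(permuted)
    # Reassign the captions of the chosen samples according to the permutation;
    # every caption survives, but a fraction now points at the wrong image.
    for src, dst in zip(chosen, permuted):
        noisy[dst] = captions[src]
    return noisy
```

With `noise_ratio=0` the output equals the input, matching the note below that `0_noise_train_caps.txt` is identical to `train_caps.txt`.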

Note!

If you want to use your own noisy dataset for training, the memory bank must also be rebuilt. You can construct the noisy dataset with construct_noise.py.
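The memory bank format is not documented here; purely as a hedged sketch, assuming each `*_mbank_*_idx.npy` file stores, for every sample, the indices of its top-k cross-modal neighbors under CLIP feature similarity (the function name, layout, and `k` are all assumptions), rebuilding could look like:

```python
import os
import numpy as np

def build_memory_bank(img_feats, txt_feats, noise_ratio, out_dir, k=10):
    """Save top-k cross-modal neighbor indices as the *_mbank_*.npy files.

    Assumes `img_feats` (N_img, D) and `txt_feats` (N_txt, D) are
    L2-normalized, so the dot product is cosine similarity.
    """
    sim = img_feats @ txt_feats.T
    # For each image, the k most similar texts; for each text, the k images.
    img_idx = np.argsort(-sim, axis=1)[:, :k]
    txt_idx = np.argsort(-sim.T, axis=1)[:, :k]
    np.save(os.path.join(out_dir, f"{noise_ratio}_mbank_img_idx.npy"), img_idx)
    np.save(os.path.join(out_dir, f"{noise_ratio}_mbank_txt_idx.npy"), txt_idx)
```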

Download Link

The final data directory tree should be:

├── dataset/
├── ${DATASET_NAME}/
|    ├── annotations/
|    |   ├── memory_bank/
|    |   |   ├── ${noise_ratio}_mbank_img_idx.npy
|    |   |   ├── ${noise_ratio}_mbank_txt_idx.npy
|    |   |   └── ...
|    |   └──scan_split/
|    |       ├── ${noise_ratio}_noise_train_caps.txt # samples used for training. ${noise_ratio} is in {0, 0.2, 0.4, 0.6}
|    |       ├── train_caps.txt # the same as `0_noise_train_caps.txt`
|    |       ├── train_ids.txt 
|    |       ├── dev_caps.txt # samples used for validation
|    |       ├── dev_ids.txt 
|    |       ├── test_caps.txt # samples used for testing
|    |       ├── test_ids.txt 
|    |       └── ...
|    └── images/ # all images in MSCOCO (or Flickr30K, CC120K)
└── ...
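For reference, the tree above can be expressed as a small path helper (a convenience sketch, not part of the repository; the function name is hypothetical):

```python
from pathlib import Path

def annotation_paths(dataset_root, dataset, noise_ratio):
    """Assemble the annotation file paths implied by the directory tree."""
    ann = Path(dataset_root) / dataset / "annotations"
    scan = ann / "scan_split"
    mbank = ann / "memory_bank"
    return {
        "train_caps": scan / f"{noise_ratio}_noise_train_caps.txt",
        "train_ids": scan / "train_ids.txt",
        "dev_caps": scan / "dev_caps.txt",
        "dev_ids": scan / "dev_ids.txt",
        "test_caps": scan / "test_caps.txt",
        "test_ids": scan / "test_ids.txt",
        "mbank_img": mbank / f"{noise_ratio}_mbank_img_idx.npy",
        "mbank_txt": mbank / f"{noise_ratio}_mbank_txt_idx.npy",
    }
```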

Training

For training GLP. You can train a new model via the following command. Before training, please read params.py carefully to check your parameter settings. --num_anns should be set to 5 for MSCOCO and Flickr30K, and to 1 for CC95K.

For GLP without queue feature enhancement:

python main_GLP.py --batch_size 256 --epochs 5 --lr 5e-7 --vision_model ViT-B/32 --noise_ratio ${NOISE RATIO} --num_anns ${5 or 1} --dataset_root ${YOUR PATH} --dataset coco --checkpoint_path ${YOUR PATH}

For GLP with queue feature enhancement:

python main_GLP_queue_ema.py --batch_size 256 --epochs 5 --lr 5e-7 --vision_model ViT-B/32 --noise_ratio ${NOISE RATIO} --num_anns ${5 or 1} --dataset_root ${YOUR PATH} --dataset coco --checkpoint_path ${YOUR PATH}

For training CLIP. Thanks to this project for providing a basic fine-tuning framework for CLIP. We improved the data loading and model evaluation code. --num_anns should be set to 5 for MSCOCO and Flickr30K, and to 1 for CC95K. You can fine-tune CLIP via the following command.

python main_CLIP.py --batch_size 256 --epochs 5 --lr 5e-7 --vision_model ViT-B/32 --noise_ratio ${NOISE RATIO} --num_anns ${5 or 1} --dataset_root ${YOUR PATH} --dataset coco --checkpoint_path ${YOUR PATH}

Models and Evaluation

You can download the models fine-tuned with GLP (ours) and CLIP (our baseline) from this link. (To be uploaded soon.)

Save the models in the folder ./pre-trained_models and evaluate them via the following command. For example, to evaluate the models trained on MSCOCO with 60% noise:

python main_test.py --eval --resume /checkpoint_path --dataset_root /data_root/coco --dataset coco

Thanks to the project!
