zzezze/NeighborRetr
NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval (CVPR'2025 🔥)

The official implementation of CVPR 2025 paper: NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval.

TL;DR: NeighborRetr tackles the hubness problem in cross-modal retrieval by distinguishing between good hubs (relevant) and bad hubs (irrelevant) during training, offering a direct solution rather than relying on post-processing methods that require prior data distributions.

📌 Citation

If you find this paper useful, please consider starring 🌟 this repo and citing 📑 our paper:

@article{lin2025neighborretr,
  title={NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval},
  author={Lin, Zengrong and Wang, Zheng and Qian, Tianwen and Mu, Pan and Chan, Sixian and Bai, Cong},
  journal={arXiv preprint arXiv:2503.10526},
  year={2025}
}

🌟 Overview

The hubness problem in cross-modal retrieval refers to the phenomenon where certain items (hubs) frequently emerge as the nearest neighbors to many other samples, while the majority of samples rarely appear as neighbors. This leads to biased representations and degraded retrieval accuracy. Unlike previous approaches that apply post-hoc normalization techniques during inference, NeighborRetr introduces a novel approach that:

  • Distinguishes between good hubs (semantically relevant) and bad hubs (semantically irrelevant)
  • Applies adaptive neighborhood adjustment during training
  • Employs uniform regularization to balance hub formation
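To make the hubness notion above concrete, here is a minimal diagnostic sketch, not the training method from the paper. It counts how often each gallery item lands in the top-k of any query (its k-occurrence), summarizes the imbalance with the skewness of those counts, and reports what fraction of each item's appearances are relevant. The function names, the toy similarity matrix, and the relevance labels are all assumptions made for illustration; treating "fraction of relevant appearances" as a proxy for good versus bad hubs is likewise a simplification.

```python
from collections import Counter


def top_k_indices(row, k):
    """Indices of the k largest similarities in one query row."""
    return sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]


def k_occurrence(sim, k):
    """Count how often each gallery item appears in some query's top-k.

    sim[i][j] is the similarity between query i and gallery item j.
    """
    counts = Counter()
    for row in sim:
        counts.update(top_k_indices(row, k))
    return [counts[j] for j in range(len(sim[0]))]


def skewness(values):
    """Population skewness; large positive values indicate a few strong hubs."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return 0.0
    third = sum((v - mean) ** 3 for v in values) / n
    return third / var ** 1.5


def hub_precision(sim, relevant, k):
    """Fraction of each item's top-k appearances that are relevant.

    relevant[i][j] is True when gallery item j is a correct match for query i.
    Items that never appear in any top-k get None.
    """
    n_gallery = len(sim[0])
    hits = [0] * n_gallery
    total = [0] * n_gallery
    for i, row in enumerate(sim):
        for j in top_k_indices(row, k):
            total[j] += 1
            if relevant[i][j]:
                hits[j] += 1
    return [hits[j] / total[j] if total[j] else None for j in range(n_gallery)]


# Hypothetical toy data: gallery item 0 is the nearest neighbor of every query.
sim = [
    [0.9, 0.2, 0.1, 0.3],
    [0.8, 0.7, 0.1, 0.2],
    [0.7, 0.1, 0.6, 0.2],
    [0.9, 0.3, 0.2, 0.4],
]
# Assumed ground truth: query i matches gallery item i only.
relevant = [[i == j for j in range(4)] for i in range(4)]

occ = k_occurrence(sim, k=1)
print(occ)                              # [4, 0, 0, 0]
print(skewness(occ) > 0)                # True: the counts are right-skewed
print(hub_precision(sim, relevant, 1))  # [0.25, None, None, None]
```

In this toy case item 0 is a hub, and because only one of its four appearances is relevant it would mostly count as a bad hub under the simplified proxy used here.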

😍 Visualization

Our method significantly improves the quality of nearest neighbors, reducing irrelevant hubs and promoting more meaningful semantic relationships.

🔄 Updates

  • [2025/04/13]: Code released! 🎉
  • [2025/03/14]: Initial version submitted to arXiv.
  • [2025/02/27]: Our paper is accepted to CVPR 2025!

🚀 Quick Start

Setup

Environment Setup

# Create and activate conda environment
conda create -n NeighborRetr python=3.8 -y
conda activate NeighborRetr

# Install dependencies
pip install -r requirements.txt

Download CLIP Model

cd NeighborRetr/models
wget https://openaipublic.azureedge.net/clip/models/40d365715913c9da98579312b702a82c18be219cc2a73407c4526f58eba950af/ViT-B-32.pt
# Optional: for ViT-B-16
# wget https://openaipublic.azureedge.net/clip/models/5806e77cd80f8b59890b7e101eabd078d9fb84e6937f9e85e4ecb61988df416f/ViT-B-16.pt
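After downloading, it can be worth checking that the checkpoint is intact. The sketch below assumes that the long hexadecimal path component in the download URL is the SHA-256 digest of the file, which matches how the OpenAI CLIP loader validates its own downloads; if that assumption does not hold for a given URL, the check will simply report a mismatch. The helper names are hypothetical and not part of this repository.

```python
import hashlib
from pathlib import Path

# URL copied from the download command above.
VIT_B_32_URL = (
    "https://openaipublic.azureedge.net/clip/models/"
    "40d365715913c9da98579312b702a82c18be219cc2a73407c4526f58eba950af/ViT-B-32.pt"
)


def expected_sha256_from_url(url):
    """Assumption: the second-to-last path component is the SHA-256 digest."""
    return url.rstrip("/").split("/")[-2]


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large checkpoints are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path, url):
    """Return True when the file's digest matches the digest embedded in the URL."""
    return sha256_of_file(path) == expected_sha256_from_url(url)


if __name__ == "__main__":
    # Run from NeighborRetr/models, where the wget command saved the file.
    ckpt = Path("ViT-B-32.pt")
    if ckpt.exists():
        print("checksum ok" if verify_checkpoint(ckpt, VIT_B_32_URL) else "checksum mismatch")
    else:
        print("checkpoint not found:", ckpt)
```

A mismatch usually points to an interrupted download, in which case re-running the wget command is the simplest fix.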

Download Datasets

Dataset        Baidu Yun
MSR-VTT        Download
MSVD           Download
ActivityNet    Download
DiDeMo         Download

Training

Train on MSR-VTT

CUDA_VISIBLE_DEVICES=0,1,2,3 \
python -m torch.distributed.launch \
--master_port 4501 \
--nproc_per_node=4 \
main_retrieval.py \
--do_train 1 \
--workers 8 \
--epochs 5 \
--batch_size 128 \
--batch_size_val 128 \
--anno_path ${ANNO_PATH} \
--video_path ${VIDEO_PATH} \
--datatype msrvtt \
--max_words 24 \
--max_frames 12 \
--output_dir ${OUTPUT_PATH} \
--mb_batch 15 \
--memory_size 512

Train on ActivityNet Captions

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
python -m torch.distributed.launch \
--master_port 4501 \
--nproc_per_node=8 \
main_retrieval.py \
--do_train 1 \
--workers 8 \
--epochs 10 \
--batch_size 128 \
--batch_size_val 128 \
--anno_path ${ANNO_PATH} \
--video_path ${VIDEO_PATH} \
--datatype activity \
--max_words 64 \
--max_frames 64 \
--output_dir ${OUTPUT_PATH} \
--mb_batch 15 \
--memory_size 1024

📚 License

This repository is released under the Apache License 2.0. This permissive license allows users to freely use, modify, distribute, and sublicense the code, provided that the copyright and license notices are retained.

✨ Acknowledgement

Our work is primarily built upon HBI, CLIP, and CLIP4Clip. We thank the authors of these projects for generously open-sourcing their code and for their significant contributions to the community.
