GitHub - llqy123/CADet: [MM 2025] SAM based Region-Word Clustering and Inference Score Adjusting for Open-Vocabulary Object Detection

✨SAM based Region-Word Clustering and Inference Score Adjusting for Open-Vocabulary Object Detection

This repository is the official implementation of "SAM based Region-Word Clustering and Inference Score Adjusting for Open-Vocabulary Object Detection", accepted at ACM MM 2025.

Qiuyu Liang, Yongqiang Zhang*

📍Installation

Requirements

Linux with CUDA == 11.1.
Python == 3.8.20.
PyTorch == 1.9.0 and the matching torchvision. Install them together from pytorch.org so that the versions agree, and check that the PyTorch version matches the one required by Detectron2.
Detectron2: follow the Detectron2 installation instructions.

Example conda environment setup

conda create --name cadet python=3.8.20 -y
conda activate cadet
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html -i https://pypi.tuna.tsinghua.edu.cn/simple

# under your working directory

git clone https://github.com/llqy123/CADet.git
cd CADet
cd detectron2
pip install -e .
cd ..
pip install -r requirements.txt
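
After installation, it can help to confirm that the environment matches the pinned versions. The following is a minimal sketch, not part of the repository: the helper names (`installed_version`, `matches`, `check_environment`) are hypothetical, and the torchvision pin is taken from the example pip command above. It reads package metadata only, so it does not check the CUDA toolkit or import PyTorch.

```python
import platform
from importlib import metadata

# Versions pinned in the requirements and in the example pip command.
EXPECTED = {"python": "3.8.20", "torch": "1.9.0", "torchvision": "0.10.0"}


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches(found, expected):
    """Compare versions, ignoring a local build tag such as '+cu111'."""
    return found is not None and found.split("+")[0] == expected


def check_environment():
    """Map each requirement to (found_version, is_expected_version)."""
    found = {
        "python": platform.python_version(),
        "torch": installed_version("torch"),
        "torchvision": installed_version("torchvision"),
    }
    return {name: (found[name], matches(found[name], EXPECTED[name])) for name in EXPECTED}


if __name__ == "__main__":
    for name, (version, ok) in check_environment().items():
        print(f"{name:12s} found={version} expected={EXPECTED[name]} ok={ok}")
```

Run it inside the activated `cadet` environment; any line reporting `ok=False` points at a version that differs from the one listed under Requirements.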

🚀Benchmark evaluation and training

Please prepare the datasets first (see prepare_datasets.md).

Similar to the VLDet baseline, our CADet models are finetuned from the corresponding Box-Supervised models (indicated by MODEL.WEIGHTS in the config files). Please train or download the Box-Supervised models and place them under CADet_ROOT/models/ before training the CADet models.
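
Before launching a long training run, one may want to verify that the Box-Supervised checkpoint named in a config is actually in place. The sketch below is an illustration under stated assumptions, not code from this repository: `find_weights_entry` and `weights_ready` are hypothetical helpers, the config is scanned with a plain regular expression rather than loaded through Detectron2, and `_BASE_` inheritance between config files is not followed.

```python
import re
from pathlib import Path
from typing import Optional

# Matches a YAML line such as:   WEIGHTS: "models/some_checkpoint.pth"
WEIGHTS_LINE = re.compile(r"^[ \t]*WEIGHTS:[ \t]*[\"']?([^\"'#\s]+)", re.MULTILINE)


def find_weights_entry(config_text: str) -> Optional[str]:
    """Return the first WEIGHTS value found in the config text, if any."""
    match = WEIGHTS_LINE.search(config_text)
    return match.group(1) if match else None


def weights_ready(config_path, repo_root=".") -> bool:
    """Check that the checkpoint named in the config exists on disk.

    Relative paths are resolved against repo_root (CADet_ROOT in the text).
    Returns False when the config itself has no WEIGHTS line.
    """
    entry = find_weights_entry(Path(config_path).read_text())
    if entry is None:
        return False
    return (Path(repo_root) / entry).is_file()
```

For example, `weights_ready("configs/CADet_OVCOCO_CLIP_R50_1x_caption.yaml")` run from the repository root would report whether the referenced file is present, provided that config sets MODEL.WEIGHTS directly rather than inheriting it.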

To train a model on the OV-COCO dataset, run:

python train_net.py --num-gpus 8 --config-file configs/CADet_OVCOCO_CLIP_R50_1x_caption.yaml

To evaluate a model with trained weights, run:

python train_net.py --num-gpus 8 --config-file configs/CADet_OVCOCO_CLIP_R50_1x_caption.yaml --eval-only MODEL.WEIGHTS /path/to/weight.pth
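
The training and evaluation commands differ only in the trailing `--eval-only MODEL.WEIGHTS ...` part, so they can be assembled by one small helper when scripting several runs. This is a hedged sketch: `build_command` is a hypothetical wrapper, and it uses only the flags that appear in the commands above.

```python
import shlex
import subprocess


def build_command(config_file, num_gpus=8, eval_weights=None):
    """Assemble the train_net.py argument list for training or evaluation.

    Passing eval_weights switches to evaluation by appending
    `--eval-only MODEL.WEIGHTS <path>`, mirroring the command in the text.
    """
    cmd = [
        "python", "train_net.py",
        "--num-gpus", str(num_gpus),
        "--config-file", config_file,
    ]
    if eval_weights is not None:
        cmd += ["--eval-only", "MODEL.WEIGHTS", eval_weights]
    return cmd


def run(cmd):
    """Echo the command, then execute it from the repository root."""
    print(shlex.join(cmd))
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    config = "configs/CADet_OVCOCO_CLIP_R50_1x_caption.yaml"
    # Only print here; call run(...) instead to actually launch the job.
    print(shlex.join(build_command(config)))
    print(shlex.join(build_command(config, eval_weights="/path/to/weight.pth")))
```

Building the command as a list avoids shell quoting problems when a weight path contains spaces.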

Download the trained network weights.

OV-COCO	Novel AP50	Base AP50	Overall AP50	Weight
config_RN50	36.4	50.6	46.9	paper
config_RN50	36.4	51.0	47.2	weight

🙏Citation

If you find this project useful for your research, please cite it using the following BibTeX entry.

@inproceedings{liang2025sam,
  title={SAM based Region-Word Clustering and Inference Score Adjusting for Open-Vocabulary Object Detection},
  author={Liang, Qiuyu and Zhang, Yongqiang},
  booktitle={Proceedings of the 33rd ACM International Conference on Multimedia},
  pages={2596--2605},
  year={2025}
}

🤝Acknowledgement

This repository is built on top of Detectron2, Detic, RegionCLIP, OVR-CNN, VLDet and Segment Anything Model. We thank the authors for their hard work.
