VLOD-TTA: Test-Time Adaptation of Vision-Language Object Detectors

This is the repository for our paper:

VLOD-TTA: Test-Time Adaptation of Vision-Language Object Detectors
Atif Belal, Heitor R. Medeiros, Marco Pedersoli, Eric Granger

TL;DR

VLOD-TTA adapts vision-language object detectors (VLODs, e.g., YOLO-World and Grounding DINO) at inference time. It combines an IoU-weighted entropy objective with image-conditioned prompt selection, and it optimizes only lightweight adapters so that the detector's zero-shot capability is preserved.
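
Since the code is not yet released, the following is only an illustrative sketch of what an IoU-weighted entropy term could look like, not the authors' implementation. It assumes that each proposal's prediction entropy is weighted by its total IoU overlap with the other proposals in the same image, so that boxes in overlapping clusters contribute more than isolated ones. The function names (`box_iou`, `entropy`, `iou_weighted_entropy`), the weighting rule, and the uniform fallback are all hypothetical choices made for this example.

```python
import math


def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p + eps) for p in probs)


def iou_weighted_entropy(boxes, class_probs):
    """Weighted mean of per-box entropies.

    ASSUMPTION (not from the paper's code): the weight of box i is its
    summed IoU with every other box, normalized to sum to 1.
    """
    n = len(boxes)
    if n == 0:
        return 0.0
    weights = [
        sum(box_iou(boxes[i], boxes[j]) for j in range(n) if j != i)
        for i in range(n)
    ]
    total = sum(weights)
    if total == 0:
        # No boxes overlap: fall back to a plain (uniform) mean entropy.
        weights = [1.0 / n] * n
    else:
        weights = [w / total for w in weights]
    return sum(w * entropy(p) for w, p in zip(weights, class_probs))


# Two coincident uncertain boxes and one isolated confident box.
boxes = [(0, 0, 10, 10), (0, 0, 10, 10), (100, 100, 110, 110)]
probs = [[0.5, 0.5], [0.5, 0.5], [1.0, 0.0]]
loss = iou_weighted_entropy(boxes, probs)
```

In a real test-time adaptation loop, a term like this would be computed on differentiable tensors and minimized with respect to the adapter parameters only; the pure-Python version here just makes the weighting explicit.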

News

The paper is under review; the code will be released soon.
arXiv - Paper

Benchmarking

Citation

If you find our work helpful for your research, please consider citing the following BibTeX entry.

@misc{belal2025vlodtta,
      title={VLOD-TTA: Test-Time Adaptation of Vision-Language Object Detectors}, 
      author={Atif Belal and Heitor R. Medeiros and Marco Pedersoli and Eric Granger},
      year={2025},
      eprint={2510.00458},
      archivePrefix={arXiv},
}
