DenserNet uses multiple-semantics fusion for image-based localization (as shown in the figure above). It leverages image-level supervision (positive and negative image pairs) and does not require feature correspondences. This repo is the PyTorch implementation of the AAAI 2021 paper "DenserNet: Weakly Supervised Visual Localization Using Multi-scale Feature Aggregation."
[pdf] [project page]
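The image-level supervision mentioned in the introduction (positive and negative image pairs, no feature correspondences) is commonly expressed as a triplet ranking loss over global image descriptors, as in NetVLAD-style weakly supervised training. The block below is a minimal sketch of that general idea in plain Python, not the training code of this repo. The function names, the margin value, and the use of plain lists as descriptors are all assumptions made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a descriptor to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_ranking_loss(query, positive, negatives, margin=0.1):
    """Hypothetical helper: sum of hinge terms over all negatives.

    Each term is max(0, margin + d(query, positive) - d(query, negative)),
    so the loss reaches zero once the positive is closer to the query than
    every negative by at least `margin`. The margin of 0.1 is an assumed value.
    """
    q = l2_normalize(query)
    d_pos = squared_distance(q, l2_normalize(positive))
    loss = 0.0
    for neg in negatives:
        d_neg = squared_distance(q, l2_normalize(neg))
        loss += max(0.0, margin + d_pos - d_neg)
    return loss


# Toy descriptors: the positive matches the query, the negative is orthogonal.
well_separated = triplet_ranking_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
# Swapped roles: the "positive" is far away and the "negative" matches.
badly_separated = triplet_ranking_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

Only image-level labels (which images show the same place as the query and which do not) are needed to form these triplets, which is what makes this kind of supervision weak.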
Please find detailed steps here for installation and dataset preparation.
Please find details here for step-by-step instructions.
Please refer to here for trained models.
Please refer to here for inference on a single image.
Please refer to here to prepare your own dataset.
DenserNet is released under the MIT license.
If you find this repo useful for your research, please consider citing the paper:
@article{liu2020densernet,
title={DenserNet: Weakly Supervised Visual Localization Using Multi-scale Feature Aggregation},
author={Liu, Dongfang and Cui, Yiming and Yan, Liqi and Mousas, Christos and Yang, Baijian and Chen, Yingjie},
journal={Proceedings of the AAAI Conference on Artificial Intelligence},
year={2021},
month={May},
pages={6101-6109}
}
We are truly thankful for the following prior efforts, both for their knowledge contributions and for their open-source repos. In particular, "ASLFeat" takes a similar approach to ours but uses strong supervision.
- NetVLAD: CNN architecture for weakly supervised place recognition (CVPR'16) [paper] [official code (pytorch-NetVlad)]
- SARE: Stochastic Attraction-Repulsion Embedding for Large Scale Image Localization (ICCV'19) [paper] [official code (MatConvNet)]
- ASLFeat: Learning Local Features of Accurate Shape and Localization (CVPR'20) [paper] [official code]
