This repo contains the core logic for real-time monument recognition. It first extracts feature vectors for multiple monuments and stores them in a vector database for efficient similarity search. During inference, YOLO first detects probable monument regions in the frame. The pipeline then retrieves the top-k images most similar to the query image from the database and reranks those top-k results using local features for robustness.
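The retrieve-then-rerank flow can be sketched roughly as below. This is a minimal, library-free illustration and not the repo's implementation: the YOLO detection step is omitted, a plain Python list stands in for the vector database, and a dictionary of precomputed match counts stands in for a local-feature matcher. The names `retrieve_top_k`, `rerank`, and the toy data are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_top_k(query_vec, index, k):
    """Return the names of the k entries most similar to the query.

    `index` is a list of (name, vector) pairs standing in for the vector DB.
    """
    scored = [(cosine(query_vec, vec), name) for name, vec in index]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


def rerank(candidates, local_match_count):
    """Reorder candidates by local-feature match count, highest first."""
    return sorted(candidates, key=lambda name: local_match_count[name], reverse=True)


# Toy global feature vectors (assumed values, for illustration only).
index = [
    ("stupa", [1.0, 0.0, 0.0]),
    ("temple", [0.9, 0.1, 0.0]),
    ("palace", [0.0, 1.0, 0.0]),
]
query = [1.0, 0.02, 0.0]

# Stage 1: coarse retrieval by global similarity.
top2 = retrieve_top_k(query, index, k=2)

# Stage 2: rerank using (hypothetical) local-feature match counts.
local_matches = {"stupa": 12, "temple": 40, "palace": 3}
reranked = rerank(top2, local_matches)
```

The point of the two stages is that the global search is cheap and narrows the candidates, while the more expensive local comparison only runs on the k survivors and can overturn the global ordering.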
For global feature vector extraction, we use Hugging Face timm models with either EfficientNet or MobileNetV3 as the backbone. The backbone features are then passed through a Generalized Mean (GeM) Pooling layer. Finally, Milvus DB is used to store the feature vectors and perform similarity search.
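As a rough illustration of what Generalized Mean pooling computes, the sketch below pools each channel of a feature map as `(mean(x^p))^(1/p)`. It uses plain Python lists instead of tensors so it runs without dependencies. The function name `gem_pool` and the values `p=3.0` and `eps=1e-6` are assumptions for illustration, not settings taken from this repo.

```python
def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized Mean pooling over spatial positions.

    `feature_map` is a list of channels, each a flat list of activations.
    Returns one pooled value per channel.
    """
    pooled = []
    for channel in feature_map:
        # Clamp to a small positive value so fractional powers are defined.
        clamped = [max(x, eps) for x in channel]
        mean_pow = sum(x ** p for x in clamped) / len(clamped)
        pooled.append(mean_pow ** (1.0 / p))
    return pooled


# One channel with two spatial activations.
avg_like = gem_pool([[1.0, 3.0]], p=1.0)   # p = 1 reduces to average pooling
gem_3 = gem_pool([[1.0, 3.0]], p=3.0)      # cube root of (1 + 27) / 2
max_like = gem_pool([[1.0, 3.0]], p=50.0)  # large p approaches max pooling
```

With `p = 1` the result is ordinary average pooling, and as `p` grows the result moves toward max pooling, so `p` controls how strongly the descriptor focuses on the most activated regions.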
- Install anaconda/miniconda for environment management
- Create Conda env from yaml file

      # conda env create -f <yaml-file>
      conda env create -f env.yml

- Activate the environment

      # conda activate <env-name>
      conda activate cv-travel

- Run the backend server

      python main.py

- main.py: The FastAPI backend server code with pipeline
- img.py: Python script to test pipeline for an image
- video.py: Python script to test pipeline for a video
- index_imgs.py: Python script to index images in the vector database as well as search images by landmark name
- libs/: Contains external modules and libraries like LightGlue
- src/: Contains source files
- utils/: Contains utility functions and classes

