MetaFusion is a distributed photo storage and search system that combines metadata filtering with vector similarity search for efficient image retrieval.
New Feature: 🎉 Search Method Comparison - Compare MetaFusion, Vector-only, and Metadata-only search approaches!
Start the leader node with default settings:

    python main.py leader

or with custom settings:

    python main.py leader --host <leader_host> --port <leader_port> --base_dir <base_dir> --model_name <model_name> --device <device>

Start a follower node:

    python main.py follower --port 9000

or with custom settings:

    python main.py follower --host <follower_host> --port <follower_port> --leader_host <leader_host> --leader_port <leader_port>

Once the leader node is running, you can use the following commands:

| Command | Description | Example |
|---|---|---|
| `ls` | List all follower nodes | `ls` |
| `upload <path>` | Upload a single image | `upload photo.jpg` |
| `mass_upload <dir>` | Upload all images in a directory | `mass_upload ./photos` |
| `search <prompt>` | MetaFusion search (default) | `search a beach photo` |
| `search_metadata <prompt>` | Metadata-only search | `search_metadata photo in 2023` |
| `search_vector <prompt>` | Vector-only search (all silos) | `search_vector sunset` |
| `search_metafusion <prompt>` | MetaFusion search | `search_metafusion cat photo` |
| `compare <prompt>` | Compare all three methods | `compare mountain in winter` |
| `get <dir> <prompt>` | Search and download images | `get ./output sunset` |
| `clear` | Clear all data | `clear` |
| `help` | Show all commands | `help` |
| `exit` / `quit` | Exit the program | `exit` |

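To illustrate how a leader prompt could split these commands into a name and its arguments, here is a minimal sketch. The `parse_command` helper and the `COMMANDS` table are hypothetical and are not taken from the MetaFusion source; only the command names and argument shapes come from the table above.

```python
# Hypothetical sketch, not the MetaFusion implementation.
# Maps each command from the table to the number of arguments it takes.
# The last argument absorbs the rest of the line, so prompts may contain spaces.
COMMANDS = {
    "ls": 0, "clear": 0, "help": 0, "exit": 0, "quit": 0,
    "upload": 1, "mass_upload": 1,
    "search": 1, "search_metadata": 1, "search_vector": 1,
    "search_metafusion": 1, "compare": 1,
    "get": 2,
}


def parse_command(line: str) -> tuple[str, list[str]]:
    """Split one input line into (command, args)."""
    parts = line.strip().split(maxsplit=1)
    if not parts:
        raise ValueError("empty command")
    name = parts[0]
    if name not in COMMANDS:
        raise ValueError(f"unknown command: {name}")
    n_args = COMMANDS[name]
    rest = parts[1] if len(parts) > 1 else ""
    if n_args == 0:
        if rest:
            raise ValueError(f"{name} takes no arguments")
        return name, []
    # Split off the leading arguments; keep the remainder in one piece.
    args = rest.split(maxsplit=n_args - 1)
    if len(args) != n_args:
        raise ValueError(f"{name} expects {n_args} argument(s)")
    return name, args
```

With this sketch, `parse_command("get ./output a red sunset")` yields the directory and the full prompt as two separate arguments, while `parse_command("search a beach photo")` keeps the whole prompt together.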
MetaFusion now supports comparing three different search approaches:

MetaFusion search (default):
- Filters candidate silos using metadata (time, location, tags)
- Performs vector search only on the filtered silos
- Best for: Balancing efficiency and accuracy

Vector-only search:
- Searches across all follower nodes using vector similarity
- No metadata filtering
- Best for: Maximum recall, ensuring no relevant images are missed

Metadata-only search:
- Uses only metadata (timestamps, GPS, tags) for filtering
- Performed on the leader node only, with no vector computation
- Best for: Fastest search when metadata is sufficient

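The difference between the three approaches can be sketched on toy data. Everything below is an illustrative assumption rather than MetaFusion code: the `PHOTOS` list, the 2-D embeddings, the single `year` field standing in for metadata, and all function names are hypothetical.

```python
import math

# Hypothetical toy data: each photo lives in one silo and has a year
# (standing in for metadata) and a tiny 2-D embedding.
PHOTOS = [
    {"id": "a", "silo": 0, "year": 2023, "vec": (1.0, 0.0)},
    {"id": "b", "silo": 0, "year": 2021, "vec": (0.9, 0.1)},
    {"id": "c", "silo": 1, "year": 2020, "vec": (0.8, 0.6)},
    {"id": "d", "silo": 2, "year": 2023, "vec": (0.0, 1.0)},
]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def metadata_only(year):
    """Leader-side metadata filter; no vector computation."""
    return [p["id"] for p in PHOTOS if p["year"] == year]


def vector_search(query_vec, silos, k):
    """Rank the photos stored in the given silos by cosine similarity."""
    candidates = [p for p in PHOTOS if p["silo"] in silos]
    ranked = sorted(candidates, key=lambda p: cosine(query_vec, p["vec"]),
                    reverse=True)
    return [p["id"] for p in ranked[:k]]


def vector_only(query_vec, k):
    """Search every silo, with no metadata filtering."""
    return vector_search(query_vec, {p["silo"] for p in PHOTOS}, k)


def metafusion(query_vec, year, k):
    """Use metadata to pick candidate silos, then search only those."""
    silos = {p["silo"] for p in PHOTOS if p["year"] == year}
    return vector_search(query_vec, silos, k)
```

For the query vector `(1.0, 0.0)` and year 2023, the MetaFusion variant never touches silo 1 because none of its photos match the metadata, whereas the vector-only variant scans all three silos. Whether the real system also filters individual photos inside a selected silo is not stated here; this sketch filters at silo level only.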
    # In the leader terminal
    > compare a photo taken in New York in summer 2023

This will automatically:
- Run all three search methods
- Compare performance metrics
- Show search space reduction
- Display top results from each method
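A comparison of this kind can be approximated with a small timing harness. The sketch below is an assumption about the general shape of such a tool, not the code behind `compare`: the `compare` and `format_report` functions and the idea of passing search methods as callables are hypothetical.

```python
import time


def compare(methods, prompt):
    """Run each search callable on the prompt; record time and results.

    `methods` maps a display name to a function taking the prompt and
    returning a list of results.
    """
    rows = []
    for name, search in methods.items():
        start = time.perf_counter()
        results = search(prompt)
        elapsed = time.perf_counter() - start
        rows.append({"method": name, "seconds": elapsed, "results": results})
    return rows


def format_report(rows):
    """Render the rows as a fixed-width text table."""
    lines = [f"{'Method':<20}{'Time(s)':<11}Results", "-" * 50]
    for row in rows:
        lines.append(
            f"{row['method']:<20}{row['seconds']:<11.3f}{len(row['results'])}"
        )
    return "\n".join(lines)
```

Passing the three search functions as a dictionary keeps the harness independent of how each method is implemented, so stub functions can stand in for real searches during testing.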
Alternatively, run the comparison test script:

    python test_search_comparison.py

Select option 1 for a comprehensive comparison test.

- 📘 Quick Start Guide - Get started in 5 minutes
- 📖 Detailed Comparison Guide - In-depth usage and evaluation metrics
- 🔧 Update Notes - Technical details of the new features
- ✅ Distributed architecture with leader-follower pattern
- ✅ Metadata-based pre-filtering for efficient search
- ✅ Vector similarity search using CLIP embeddings
- ✅ Three search modes: MetaFusion, Vector-only, Metadata-only
- ✅ Built-in comparison tools for evaluating search methods
- ✅ Automatic EXIF metadata extraction
- ✅ Scalable to multiple follower nodes (silos)
- Python 3.10+
- PostgreSQL database
- Dependencies: see `requirements.txt`

The system follows a leader-follower layout:

    ┌─────────────────────────────────────────────────┐
    │                   Leader Node                   │
    │  - Metadata Database (PostgreSQL)               │
    │  - Query Processing & Filtering                 │
    │  - Result Aggregation                           │
    └─────────────┬───────────────────────────────────┘
                  │
        ┌─────────┼─────────┐
        │         │         │
        ▼         ▼         ▼
    ┌────────┐ ┌────────┐ ┌────────┐
    │Follower│ │Follower│ │Follower│
    │(Silo 0)│ │(Silo 1)│ │(Silo 2)│
    │        │ │        │ │        │
    │Vector  │ │Vector  │ │Vector  │
    │Index   │ │Index   │ │Index   │
    └────────┘ └────────┘ └────────┘

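The result aggregation step in the diagram can be sketched as a fan-out followed by a top-k merge. This is a hedged illustration only: `query_follower` stands in for whatever network call the leader actually makes, each silo is modelled as a list of precomputed `(score, photo_id)` pairs, and all names are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def query_follower(silo, query, k):
    """Stand-in for a remote call: return this silo's top-k hits.

    `silo` is a list of (score, photo_id) pairs. The query is unused
    because the scores are precomputed in this toy model.
    """
    return heapq.nlargest(k, silo)


def leader_search(silos, query, k):
    """Send the query to every silo in parallel and merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda s: query_follower(s, query, k), silos))
    # Each follower already returned at most k hits, so the merge stays small.
    return heapq.nlargest(k, (hit for part in partials for hit in part))
```

Because every follower returns only its own top `k` hits, the leader merges at most `k` entries per silo instead of every score in the system.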
Example comparison results:

    【Performance Comparison】
    Method               Time(s)    Results
    --------------------------------------------------
    Metadata Only        0.032      45
    Vector Only          5.234      128
    MetaFusion           5.156      87

    【Result Analysis】
    MetaFusion vs Vector Only: Search space reduced by 32.0%

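The 32.0% figure is consistent with comparing the Vector Only and MetaFusion result counts in the example, (128 - 87) / 128. Whether the tool derives the reduction from result counts or from the number of silos searched is not stated here, so the formula below is an assumption, and `reduction_percent` is a hypothetical helper.

```python
def reduction_percent(baseline: int, reduced: int) -> float:
    """Percentage by which `reduced` is smaller than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - reduced) / baseline * 100


# Counts taken from the example table: Vector Only = 128, MetaFusion = 87.
print(f"{reduction_percent(128, 87):.1f}%")  # prints "32.0%"
```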
[Add your license here]
[Add contribution guidelines here]