Lightweight local semantic search over plain text documents using sentence embeddings.
VectorVault is a small project that extracts semantic embeddings from text documents and provides two ways to interact with them:
- a Streamlit web UI for interactive exploration and demos (fast, visual), and
- a FastAPI-based HTTP API for programmatic queries and integration.
The repository contains simple modules to preprocess text, compute and cache embeddings, and perform (approximate) nearest-neighbor search.
This project is designed as a local, easy-to-run demo and development platform for semantic search over a small corpus of documents. Its goals are:
- Make it trivial to index a small document set and query semantically.
- Provide both an interactive UI (Streamlit) and an API (FastAPI) so developers and non-developers can explore results.
- Use a compact, fast embedding model (MiniLM) so it runs well on modest hardware.
Reasons for choosing the MiniLM (L6) family:
- Performance vs. size: `all-MiniLM-L6-v2` is small and fast while delivering strong semantic quality for many search tasks.
- Low latency: great for interactive UIs and local/edge environments.
- Lower resource requirements: works on a laptop or small VM without needing a GPU.
- Easy to scale: because embeddings are compact and quick to compute, the system is cheaper and faster to run.
Tradeoffs: larger models (e.g., MPNet or large transformer models) can capture subtler semantics more accurately, but they need more memory and compute and add latency. For a lightweight local project, MiniLM is an excellent default.
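As a quick illustration of the model's footprint (a minimal sketch; it assumes `sentence-transformers` is installed in the active environment):

```python
from sentence_transformers import SentenceTransformer

# all-MiniLM-L6-v2 produces compact 384-dimensional embeddings.
model = SentenceTransformer("all-MiniLM-L6-v2")
vec = model.encode("semantic search on a laptop")
print(vec.shape)  # (384,)
```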
High-level flow (a condensed code sketch follows the list):
- Preprocess: read raw text files from `data/docs/`, clean them, and split them into chunks.
- Embed: compute dense vector embeddings for each chunk using the MiniLM sentence-transformer model.
- Cache: store embeddings and minimal metadata in `cache/embeddings.json` (managed by `src/cache_manager.py`).
- Search: for a user query, compute its embedding and perform nearest-neighbor search across the cached vectors (`src/search_engine.py`).
- Serve: expose functionality via the Streamlit UI (`app.py`) and the FastAPI endpoints (`src/api.py`).
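The sketch below compresses that flow into a few lines. It is illustrative only; the real chunking, caching, and search logic live in `src/preprocess.py`, `src/cache_manager.py`, and `src/search_engine.py`:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Embed: encode preprocessed chunks into dense vectors
# (normalized, so cosine similarity reduces to a dot product).
chunks = ["first document chunk", "second document chunk"]
chunk_vecs = model.encode(chunks, normalize_embeddings=True)

# Search: embed the query and rank chunks by similarity.
query_vec = model.encode("example query", normalize_embeddings=True)
scores = chunk_vecs @ query_vec
for i in np.argsort(scores)[::-1]:
    print(f"{scores[i]:.3f}  {chunks[i]}")
```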
Components and where to find them:
- `data/docs/`: source documents (text files).
- `cache/`: embedding cache and manifest.
- `src/preprocess.py`: helpers to load and prepare text.
- `src/embedder.py`: code that talks to the sentence-transformers model to compute embeddings.
- `src/cache_manager.py`: read/write cached vectors and metadata (format sketched below).
- `src/search_engine.py`: nearest-neighbor search logic.
- `src/api.py`: FastAPI app exposing endpoints for search and metadata.
- `app.py`: Streamlit UI front-end for interactive exploration.
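Because the cache is plain JSON, it is easy to inspect or rebuild by hand. The exact schema is defined by `src/cache_manager.py`; the field names below are assumptions for illustration:

```python
import json
from pathlib import Path

CACHE_PATH = Path("cache/embeddings.json")

def save_cache(entries):
    # entries: list of {"text": ..., "vector": [...]} records.
    # Field names are assumed; see src/cache_manager.py for the real schema.
    CACHE_PATH.parent.mkdir(exist_ok=True)
    CACHE_PATH.write_text(json.dumps(entries))

def load_cache():
    # Return cached records, or an empty list if no cache exists yet.
    return json.loads(CACHE_PATH.read_text()) if CACHE_PATH.exists() else []
```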
Simplified diagram:
```
Text files (data/docs/)
         |
         v
    Preprocess
         |
         v
 Embedder (MiniLM)
         |
         v
Cache (cache/embeddings.json)
         |
         v
  Search Engine  <--  Query embedding
         |
         +----------------------------+
         |                            |
         v                            v
Streamlit UI (app.py)        FastAPI (src/api.py)
```
- Create and activate a virtual environment, then install dependencies:

```powershell
python -m venv .venv
# Activate the venv in PowerShell
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
```

- Run the Streamlit UI (interactive demo):

```powershell
# from the project root
streamlit run app.py
```

The Streamlit app typically opens at http://localhost:8501. Use it to upload documents, run queries, and inspect results.

- Run the FastAPI server (programmatic access):

```powershell
# from the project root
# serve the API on port 8000
uvicorn src.api:app --reload --port 8000
```

The API root will be at http://127.0.0.1:8000, and FastAPI's auto-generated OpenAPI docs will be at http://127.0.0.1:8000/docs.
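For orientation, a minimal endpoint of the kind `src/api.py` exposes might look like the sketch below; the route name and payload fields are assumptions, so check the module itself:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class SearchRequest(BaseModel):
    query: str
    top_k: int = 5

@app.post("/search")
def search(req: SearchRequest):
    # In the real app this would embed req.query and run the
    # nearest-neighbor search from src/search_engine.py.
    return {"query": req.query, "results": []}
```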
- Running both at the same time: run the Streamlit UI and the FastAPI server in separate terminals. They use different default ports (Streamlit 8501, FastAPI 8000), so they do not conflict.
Why run both? You get the best of both worlds:
- Streamlit: great for manual inspection, demos, and iterating on UI/UX for search and retrieval.
- FastAPI: exposes endpoints for automated tests, integrations, or enabling multiple clients.
Example: run the API in one terminal and the Streamlit demo in another; the Streamlit app can call the local API for queries or you can hit the API directly from other programs.
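A sketch of that pattern, assuming a `/search` endpoint that accepts a JSON `query` field:

```python
import requests
import streamlit as st

query = st.text_input("Search query")
if query:
    # Forward the query to the locally running FastAPI server.
    resp = requests.post(
        "http://127.0.0.1:8000/search",
        json={"query": query, "top_k": 5},
    )
    st.json(resp.json())
```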
A small PowerShell helper script `quickstart.ps1` is provided to automate setup and optionally start both servers in new PowerShell windows.

From the project root you can:

```powershell
# Install dependencies only
.\quickstart.ps1 -InstallOnly

# Create venv, install deps and start both Streamlit and FastAPI
.\quickstart.ps1 -RunBoth

# Start only Streamlit (after installing)
.\quickstart.ps1 -RunStreamlit

# Start only FastAPI (after installing)
.\quickstart.ps1 -RunAPI
```

If your PowerShell blocks script execution, allow it temporarily for the current process with:

```powershell
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
```

The script will create a `.venv` virtual environment (if missing), install packages from `requirements.txt` into that venv, and open new PowerShell windows to run Streamlit and/or Uvicorn so both services can run concurrently.
- Interactive: open Streamlit, point it at `data/docs/`, and click the button to build embeddings and query.
- Programmatic: POST a JSON payload to `/search` (or whichever endpoint exists in `src/api.py`) with a `query` field and receive nearest-neighbor results as JSON (example below).
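For example, from plain Python (again assuming the `/search` route and `query` field described above):

```python
import requests

resp = requests.post(
    "http://127.0.0.1:8000/search",
    json={"query": "how do embeddings work?", "top_k": 3},
)
resp.raise_for_status()
print(resp.json())  # nearest-neighbor results as JSON
```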
Note: the exact API routes and function names depend on `src/api.py` and may be extended. Open `src/api.py` if you want to add or inspect endpoints.
- If embeddings are not present, check `cache/embeddings.json` and delete it to force a rebuild.
- If the model fails to load, ensure `sentence-transformers` and its dependencies are installed in the active virtual environment.
- Port conflicts: if port 8000 or 8501 is already in use, pick alternative ports with Streamlit's `--server.port` flag or Uvicorn's `--port` flag (examples below).
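For example:

```powershell
# Run Streamlit on an alternative port
streamlit run app.py --server.port 8502

# Run the API on an alternative port
uvicorn src.api:app --port 8001
```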