The vector database that fits in your pocket.
Rust-powered. Python-native. One pip install away.
```bash
pip install vxdb
```

```python
import vxdb

db = vxdb.Database(path="./my_data")  # persistent: data survives restarts
collection = db.create_collection("docs", dimension=384)

embed = your_embedding_function  # OpenAI, Sentence Transformers, Cohere, etc.

collection.upsert(
    ids=["a", "b"],
    vectors=[embed("how to train a model"), embed("best pasta recipe")],
    documents=["how to train a model", "best pasta recipe"],
)

collection.query(vector=embed("machine learning"), top_k=5)
```

`embed()` is any function that turns text into vectors; see `examples/` for OpenAI, Sentence Transformers, LangChain, and Cohere.
That's it. No Docker. No config files. No cloud account. No 500 MB of dependencies.
The entire hot path (distance computation, HNSW traversal, BM25 scoring, mmap I/O) is pure Rust with zero GIL contention. Your Python code calls directly into compiled native code via PyO3. No serialization overhead. No REST round-trips. No subprocess.
A single native wheel under 5 MB. Starts in under 10 ms. Compare that to ChromaDB (~200 MB, ~2s startup), Milvus (needs Docker + etcd + MinIO), or Pinecone (needs a cloud account and an internet connection).
Laptop. CI pipeline. Raspberry Pi. AWS Lambda. Docker container. Air-gapped server. Anywhere Python runs, vxdb runs. No infrastructure required to get started; scale up to a standalone server when you need it.
Vector similarity + BM25 keyword matching fused via Reciprocal Rank Fusion. One API call. Tunable alpha parameter. No separate search engine needed. No Elasticsearch sidecar.
Compare this with Zvec (Alibaba's in-process vector DB): their "hybrid search" is vector search plus structured metadata filters, not full-text keyword search. If a user searches for a term that doesn't embed well (error codes, product SKUs, proper nouns), vxdb's BM25 index catches it. Zvec won't.
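The keyword path is standard Okapi BM25. A compact pure-Python sketch of the scoring, purely illustrative (vxdb's Rust implementation handles tokenization and parameter tuning internally):

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against query terms with Okapi BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # document frequency: how many docs contain each term
    df = Counter()
    for d in docs:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = [
    "error E1234 on startup".split(),
    "how to cook pasta".split(),
]
print(bm25_scores(["E1234"], docs))  # first doc scores > 0, second scores 0.0
```

An exact-token query like an error code matches only the document that literally contains it, which is exactly the case that defeats pure embedding search.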
Most in-process vector databases (Zvec, FAISS) can only run inside your process. Most server-based databases (Qdrant, Milvus) require Docker. vxdb does both: same Rust engine, same API. Start embedded in a notebook, scale to a multi-client REST server when you're ready. No rewrite.
```
                +------------------+
                | Your Python Code |
                +---+----------+---+
                    |          |
     +--------------v---+  +---v------------------+
     | Embedded (PyO3)  |  | Server (REST API)    |
     | zero-copy, in-   |  | Axum, async,         |
     | process, <1 μs   |  | multi-client         |
     | call overhead    |  |                      |
     +--------------+---+  +---+------------------+
                    |          |
+-------------------v----------v-------------------+
|                 Rust Core Engine                 |
|                                                  |
|  +--------+  +--------+  +--------------------+  |
|  |  HNSW  |  |  Flat  |  |    BM25 Keyword    |  |
|  |  Index |  |  Index |  |       Index        |  |
|  +--------+  +--------+  +--------------------+  |
|  +------------------+  +----------------------+  |
|  | Distance Metrics |  |  Metadata Filtering  |  |
|  |  cosine/L2/dot   |  |  10 operators, SQL   |  |
|  +------------------+  +----------------------+  |
|  +--------------------------------------------+  |
|  |  Hybrid Search (Reciprocal Rank Fusion)    |  |
|  +--------------------------------------------+  |
+------------------------+-------------------------+
                         |
+------------------------v-------------------------+
|                     Storage                      |
| mmap vectors | SQLite metadata | Write-Ahead Log |
+--------------------------------------------------+
```
```python
import vxdb

# Persistent (data survives restarts)
db = vxdb.Database(path="./my_data")

# Or in-memory (ephemeral, great for prototyping)
# db = vxdb.Database()

collection = db.create_collection("docs", dimension=384, metric="cosine")

collection.upsert(
    ids=["a", "b", "c"],
    vectors=[[0.1, 0.2, ...], [0.3, 0.4, ...], [0.5, 0.6, ...]],
    metadata=[{"type": "article"}, {"type": "blog"}, {"type": "article"}],
    documents=["intro to ML", "my favorite recipes", "deep learning guide"],
)

# 1. Vector similarity
results = collection.query(vector=[0.1, 0.2, ...], top_k=5)

# 2. Filtered (metadata constraints)
results = collection.query(
    vector=[0.1, ...], top_k=5,
    filter={"type": {"$eq": "article"}},
)

# 3. Hybrid (vector + keyword: the sweet spot)
results = collection.hybrid_query(
    vector=[0.1, ...],
    query="machine learning",
    top_k=5,
    alpha=0.5,  # 0 = keyword only, 1 = vector only
)

# 4. Keyword only (BM25)
results = collection.keyword_search(query="machine learning", top_k=5)
```

Every result returns `{"id", "score", "metadata"}`.
```bash
pip install vxdb
```

That's the whole thing. Works on macOS, Linux, Windows. Python 3.9+.

For the HTTP client (talking to a remote vxdb server):

```bash
pip install 'vxdb[server]'
```

vxdb stores pre-computed vectors; bring any embedding model you want. We have step-by-step notebooks for each:
| Provider | Install | API Key? | Notebook |
|---|---|---|---|
| OpenAI | `pip install openai` | Yes | [examples/openai_embeddings.ipynb](examples/openai_embeddings.ipynb) |
| Sentence Transformers | `pip install sentence-transformers` | No (local) | [examples/sentence_transformers.ipynb](examples/sentence_transformers.ipynb) |
| LangChain (any provider) | `pip install langchain-openai` | Depends | [examples/langchain_integration.ipynb](examples/langchain_integration.ipynb) |
| Cohere | `pip install cohere` | Yes | [examples/cohere_embeddings.ipynb](examples/cohere_embeddings.ipynb) |
| Ollama (local LLMs) | `pip install ollama` | No (local) | – |
Or use the pluggable interface:

```python
from vxdb.embedding import EmbeddingFunction

class MyEmbedder(EmbeddingFunction):
    def embed(self, texts: list[str]) -> list[list[float]]:
        return your_model.encode(texts)
```

Same engine, accessed over HTTP. Deploy it as a standalone service.
```bash
# Start the server
vxdb-server --host 0.0.0.0 --port 8080
```

Python client:
```python
from vxdb import Client

client = Client("http://localhost:8080")
coll = client.create_collection("docs", dimension=384)
coll.upsert(ids=["a"], vectors=[[0.1, ...]], documents=["hello world"])
results = coll.hybrid_query(vector=[0.1, ...], query="hello", top_k=5)
```

cURL:
```bash
# Create collection
curl -X POST localhost:8080/collections \
  -H "Content-Type: application/json" \
  -d '{"name": "docs", "dimension": 384}'

# Upsert
curl -X POST localhost:8080/collections/docs/upsert \
  -H "Content-Type: application/json" \
  -d '{"ids": ["a"], "vectors": [[0.1, 0.2]], "documents": ["hello world"]}'

# Query
curl -X POST localhost:8080/collections/docs/query \
  -H "Content-Type: application/json" \
  -d '{"vector": [0.1, 0.2], "top_k": 5}'
```

Docker:

```bash
docker build -t vxdb .
docker run -p 8080:8080 vxdb  # ~10 MB image
```

Most vector databases give you vector search OR keyword search. vxdb gives you both, fused intelligently in a single call.
How it works:

- You upsert with `documents`: raw text is tokenized into a built-in BM25 index alongside your vectors.
- At query time, vector search and BM25 run in parallel, then Reciprocal Rank Fusion merges both ranked lists.
- You control the blend: `alpha=1.0` (pure vector) → `alpha=0.5` (balanced) → `alpha=0.0` (pure keyword).
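The fusion step is simple to sketch. One plausible way to combine `alpha` weighting with Reciprocal Rank Fusion (illustrative; vxdb's exact blend formula isn't documented here):

```python
def rrf_fuse(vector_ranking, keyword_ranking, alpha=0.5, k=60):
    """Blend two ranked ID lists with alpha-weighted Reciprocal Rank Fusion.

    alpha=1.0 -> vector only, alpha=0.0 -> keyword only. Each list
    contributes weight / (k + rank), the classic RRF formula.
    """
    scores = {}
    for rank, doc_id in enumerate(vector_ranking):
        scores[doc_id] = scores.get(doc_id, 0.0) + alpha / (k + rank + 1)
    for rank, doc_id in enumerate(keyword_ranking):
        scores[doc_id] = scores.get(doc_id, 0.0) + (1 - alpha) / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# "c" is 1st for keywords and 3rd for vectors; "b" is 2nd in both.
print(rrf_fuse(["a", "b", "c"], ["c", "b", "d"], alpha=0.5))
```

Because RRF works on ranks rather than raw scores, the two searches need no score normalization before merging, which is why it is a popular fusion choice.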
When to use it: Specific product names. Error codes. Proper nouns. Anything where exact terms matter alongside semantic meaning. See [examples/hybrid_search.ipynb](examples/hybrid_search.ipynb) for a deep dive with side-by-side comparisons.
```python
results = collection.hybrid_query(
    vector=embed("lightweight laptop for students"),
    query="MacBook Air M4",
    top_k=5,
    alpha=0.5,
)
```

| | vxdb | Zvec (Alibaba) | ChromaDB | Qdrant | Pinecone | Milvus | Weaviate | FAISS |
|---|---|---|---|---|---|---|---|---|
| Language | Rust | C++ (Proxima) | Python | Rust | Proprietary | Go/C++ | Go | C++ |
| Embedded mode | PyO3, zero-copy | In-process | Python-speed | No | No | No | No | SWIG bindings |
| Server mode | Yes | No | Yes | Yes | Cloud only | Yes | Yes | No |
| **`pip install` just works** | Yes | Yes | Yes | No (Docker) | N/A (SaaS) | No (Docker) | No (Docker) | Yes |
| Binary size | ~5 MB | ~30 MB | ~200 MB+ | ~50 MB | N/A | ~500 MB+ | ~100 MB+ | ~20 MB |
| Startup time | <10 ms | <100 ms | ~1-2 s | ~1-3 s | N/A | ~5-10 s | ~3-5 s | <10 ms |
| Hybrid search | BM25 + RRF | Vector + filters only | No | Requires setup | No | Sparse vectors | BM25 | No |
| BM25 keyword search | Built-in | No | No | No | No | No | BM25 | No |
| Sparse vectors | No | Yes | No | Yes | No | Yes | No | No |
| Multi-vector queries | No | Yes | No | No | No | No | No | No |
| Metadata filtering | 10 operators | Structured filters | Yes | Yes | Yes | Yes | Yes | No |
| Persistence | mmap + SQLite + WAL | Custom engine | SQLite + Parquet | RocksDB | Cloud | RocksDB | LSM | Manual |
| Crash recovery | WAL | Yes | No | Yes | Yes | Yes | Yes | No |
| Quantization | No (planned) | int8 | No | Scalar/PQ | Yes | Yes | PQ/BQ | PQ/SQ |
| Docker image | ~10 MB | N/A (no server) | ~500 MB+ | ~100 MB | No | ~1 GB+ | ~300 MB+ | No |
| Runs offline | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes |
| License | Apache 2.0 | Apache 2.0 | Apache 2.0 | Apache 2.0 | Proprietary | Apache 2.0 | BSD-3 | MIT |
```python
# Database
db = vxdb.Database()                  # in-memory (ephemeral)
db = vxdb.Database(path="./my_data")  # persistent (data survives restarts)

db.create_collection(name, dimension, metric="cosine", index="flat")
db.get_collection(name)
db.list_collections()
db.delete_collection(name)

# Collection
collection.upsert(ids, vectors, metadata=None, documents=None)
collection.query(vector, top_k=10, filter=None)
collection.hybrid_query(vector, query, top_k=10, alpha=0.5)
collection.keyword_search(query, top_k=10)
collection.delete(ids)
collection.count()
```

| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/collections` | Create collection |
| `GET` | `/collections` | List collections |
| `DELETE` | `/collections/{name}` | Delete collection |
| `POST` | `/collections/{name}/upsert` | Upsert vectors (+ optional documents) |
| `POST` | `/collections/{name}/query` | Vector search (+ optional filter) |
| `POST` | `/collections/{name}/hybrid` | Hybrid vector + keyword search |
| `POST` | `/collections/{name}/keyword` | BM25 keyword search |
| `POST` | `/collections/{name}/delete` | Delete vectors by ID |
| `GET` | `/collections/{name}/count` | Count vectors |
| Parameter | Values | Default |
|---|---|---|
| `metric` | `"cosine"`, `"euclidean"`, `"dot"` | `"cosine"` |
| `index` | `"flat"` (exact), `"hnsw"` (approximate) | `"flat"` |
| `filter` | `$eq` `$ne` `$gt` `$gte` `$lt` `$lte` `$in` `$nin` `$and` `$or` | `None` |
| `alpha` | `0.0` (keyword) to `1.0` (vector) | `0.5` |
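The filter operators compose like MongoDB-style query documents. A pure-Python evaluator sketch showing the intended semantics for a single record (illustrative; vxdb evaluates filters in Rust against its SQLite-backed metadata):

```python
def matches(metadata: dict, filter: dict) -> bool:
    """Evaluate a Mongo-style filter dict against one record's metadata."""
    ops = {
        "$eq": lambda v, arg: v == arg,
        "$ne": lambda v, arg: v != arg,
        "$gt": lambda v, arg: v > arg,
        "$gte": lambda v, arg: v >= arg,
        "$lt": lambda v, arg: v < arg,
        "$lte": lambda v, arg: v <= arg,
        "$in": lambda v, arg: v in arg,
        "$nin": lambda v, arg: v not in arg,
    }
    for key, cond in filter.items():
        if key == "$and":      # all sub-filters must match
            if not all(matches(metadata, f) for f in cond):
                return False
        elif key == "$or":     # at least one sub-filter must match
            if not any(matches(metadata, f) for f in cond):
                return False
        else:                  # field condition: {"field": {"$op": arg}}
            for op, arg in cond.items():
                if not ops[op](metadata.get(key), arg):
                    return False
    return True

record = {"type": "article", "year": 2024}
print(matches(record, {"type": {"$eq": "article"}, "year": {"$gte": 2020}}))                 # True
print(matches(record, {"$or": [{"type": {"$eq": "blog"}}, {"year": {"$lt": 2000}}]}))       # False
```

Top-level keys are implicitly AND-ed together, matching the behavior shown in the filtered-query example earlier.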
Interactive Jupyter notebooks with step-by-step walkthroughs:
| Notebook | What you'll build |
|---|---|
| [quickstart.ipynb](examples/quickstart.ipynb) | Every feature in 5 min (no API keys) |
| [openai_embeddings.ipynb](examples/openai_embeddings.ipynb) | Semantic search with OpenAI embeddings |
| [sentence_transformers.ipynb](examples/sentence_transformers.ipynb) | Free, local embeddings (no API key) |
| [langchain_integration.ipynb](examples/langchain_integration.ipynb) | LangChain + RAG pipeline |
| [cohere_embeddings.ipynb](examples/cohere_embeddings.ipynb) | Multilingual search with Cohere |
| [hybrid_search.ipynb](examples/hybrid_search.ipynb) | Deep dive: vector vs keyword vs hybrid |
```bash
git clone https://github.com/your-org/vxdb.git && cd vxdb

# Rust
cargo build --all
cargo test --all  # 120+ tests

# Python
uv venv .venv && source .venv/bin/activate
uv pip install maturin pytest httpx
maturin develop
PYTHONPATH=python pytest tests/ -v
```

The codebase is a Cargo workspace:
```
vxdb/
├── crates/
│   ├── vxdb-core/    # Engine: indexes, distance, storage, hybrid search
│   ├── vxdb-python/  # PyO3 bindings
│   └── vxdb-server/  # Axum REST API server
├── python/vxdb/      # Python package (client SDK, embedding interface)
├── examples/         # Jupyter notebooks
└── tests/            # Python integration tests
```
- Persistent collections (mmap + SQLite + WAL) (done)
- SIMD-accelerated distance computation
- Quantization (int8/binary) for reduced memory
- GPU acceleration (CUDA/Metal)
- HNSW graph serialization (fast restart for large indexes)
- Streaming upsert for large datasets
- Sparse vector support
- gRPC API
- Official LangChain `VectorStore` integration
- Kubernetes Helm chart
- Benchmarks suite vs Qdrant, ChromaDB, Zvec, FAISS
Apache 2.0