⭐ If you find Caliby useful, please consider giving it a star!
Caliby is an embeddable vector database designed for AI applications that need to scale beyond available memory without the complexity of distributed systems. Unlike client-server vector databases that require separate infrastructure, Caliby runs directly inside your application, providing document storage, vector search, and filtered search for embedded use cases.
| Solution | Limitation |
|---|---|
| HNSWLib / Faiss / Usearch | Memory-only: crashes or slows down when data exceeds RAM |
| Pinecone / Weaviate / Qdrant | Requires separate server infrastructure; network latency and operational overhead |
| ChromaDB / LanceDB | Limited indexing options; no true buffer pool for efficient larger-than-memory operation |
Caliby combines the simplicity of an embedded library with the scalability of disk-based storage, while keeping vector search at in-memory speed when the data fits in RAM:
- **Zero Infrastructure**: `pip install caliby`; no Docker, no servers, no configuration
- **Ship with Your App**: Bundle Caliby directly into desktop apps, edge devices, or microservices
- **1B+ Vectors on a Laptop**: Handle datasets far larger than RAM with intelligent buffer management
- **In-Memory Performance**: When data fits in RAM, matches or exceeds HNSWLib/Faiss speed
- **Graceful Degradation**: As data grows beyond RAM, performance degrades smoothly, not catastrophically

- **AI Agents**: Persistent memory that survives restarts and scales with conversation history
- **Desktop/Mobile Apps**: Local-first semantic search without cloud dependencies
- **Developer Tools**: Embed code search and documentation retrieval in IDEs and CLIs
- **Edge Computing**: Run on resource-constrained devices without network access
- **Rapid Prototyping**: Go from idea to working RAG pipeline in minutes, not hours

- **Embeddable**: Single-process library, runs in your application's memory space
- **Larger-Than-Memory**: Innovative buffer pool handles datasets 10-100x larger than RAM
- **Document Storage**: Store vectors, text, and metadata with flexible schemas
- **Filtered Search**: Efficient vector search with metadata filtering
- **Hybrid Search**: Combine vector similarity and BM25 full-text search
- **In-Memory Speed**: Matches HNSWLib/Faiss when data fits in RAM
- **Multiple Index Types**: HNSW, DiskANN, IVF+PQ, B+tree, and Inverted Index
Caliby excels where other vector databases struggle: embedded scenarios with large datasets.
| Use Case | Why Caliby? | Example |
|---|---|---|
| Agentic Memory Store | Persistent agent memory that grows unbounded, survives restarts, no external DB needed | agentic_memory_store.py |
| RAG Pipeline | Index millions of document chunks locally, hybrid search without API latency | rag_pipeline.py |
| Recommendation System | Ship recommendations with your app, works offline on edge devices | recommendation_system.py |
| Semantic Search | Local-first search for desktop apps, developer tools, and offline-capable systems | semantic_search.py |
| Image Similarity | Visual search embedded in photo apps, no cloud upload required | image_similarity_search.py |
**From PyPI (Recommended):**

```bash
pip install caliby
```

**From Source:**

```bash
git clone --recursive https://github.com/zxjcarrot/caliby.git
cd caliby
pip install -e .
```

Note: the `--recursive` flag is required to initialize the pybind11 submodule. If you already cloned without it, run:

```bash
git submodule update --init --recursive
```

The Collection API provides a high-level interface for storing documents with vectors, text, and metadata:
```python
import caliby
import numpy as np

# Initialize and create a collection
caliby.set_buffer_config(size_gb=1.0)
caliby.open('/tmp/my_database')
collection = caliby.create_collection("products")

# Define schema
collection.set_schema({
    "embedding": {"type": "vector", "dim": 128},
    "description": {"type": "text"},
    "category": {"type": "metadata"}
})

# Add documents
collection.add_documents([
    {"id": "1", "embedding": np.random.rand(128).astype('float32'),
     "description": "Wireless headphones", "category": "electronics"},
    {"id": "2", "embedding": np.random.rand(128).astype('float32'),
     "description": "Running shoes", "category": "sports"}
])

# Create indices
collection.create_hnsw_index("embedding", m=16, ef_construction=200)
collection.create_text_index("description")
collection.create_metadata_index("category")

# Vector search
query = np.random.rand(128).astype('float32')
results = collection.search_vector("embedding", query, k=10,
                                   filter={"category": "electronics"})

# Hybrid search (vector + text)
results = collection.search_hybrid("embedding", query,
                                   text_field="description",
                                   text_query="wireless", k=10, alpha=0.5)

caliby.close()
```

See docs/COLLECTION_API.md for complete documentation, including advanced filtering, best practices, and performance tuning.
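A collection persists in the database directory, so it can be reopened from a later process. Below is a minimal sketch; the `caliby.get_collection` accessor for retrieving an existing collection by name is an assumption, so check docs/COLLECTION_API.md for the exact call:

```python
import caliby
import numpy as np

# Reopen the same database directory; stored collections and their
# indices are recovered from disk rather than rebuilt.
caliby.set_buffer_config(size_gb=1.0)
caliby.open('/tmp/my_database')

# Hypothetical accessor -- the exact name may differ, see the Collection API docs.
collection = caliby.get_collection("products")

query = np.random.rand(128).astype('float32')
results = collection.search_vector("embedding", query, k=5)
caliby.close()
```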
For direct control over indices:
```python
import caliby
import numpy as np

# Initialize the system and configure buffer pool
caliby.set_buffer_config(size_gb=1.0)  # Set buffer pool size
caliby.open('/tmp/caliby_data')        # Initialize catalog

# Create an HNSW index
index = caliby.HnswIndex(
    max_elements=1_000_000,   # Maximum number of vectors
    dim=128,                  # Vector dimension
    M=16,                     # HNSW parameter (connections per node)
    ef_construction=200,      # Construction-time search depth
    enable_prefetch=True,     # Enable prefetching for performance
    skip_recovery=False,      # Whether to skip recovery from disk
    index_id=0,               # Unique index identifier for multi-index
    name='user_embeddings',   # Optional human-readable name
)

# Add vectors (batch)
vectors = np.random.rand(10000, 128).astype(np.float32)
index.add_points(vectors, num_threads=4)  # Parallel insertion

# Get index info
print(f"Index name: {index.get_name()}")  # Output: 'user_embeddings'
print(f"Dimension: {index.get_dim()}")

# Search (single query)
query = np.random.rand(128).astype(np.float32)
labels, distances = index.search_knn(query, k=10, ef_search_param=50)

# Batch search (parallel)
queries = np.random.rand(100, 128).astype(np.float32)
results = index.search_knn_parallel(queries, k=10, ef_search_param=50, num_threads=4)

# Close when done
caliby.close()
```
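`search_knn` returns parallel arrays of labels and distances. At this level Caliby does not attach application payloads to vectors, so a common pattern is to keep a side table keyed by label. A minimal sketch, continuing the example above before the final `caliby.close()`, and assuming `add_points` assigns labels in insertion order (0 through n-1):

```python
# Side table mapping labels back to application objects.
# Assumption: labels follow insertion order, so row i of `vectors` gets label i.
payloads = {i: f"user-{i}" for i in range(len(vectors))}

labels, distances = index.search_knn(query, k=10, ef_search_param=50)
for label, dist in zip(labels, distances):
    print(f"{payloads[int(label)]}: distance = {dist:.4f}")
```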
**HNSW Index.** Best for: high recall requirements, moderate to large dataset sizes.

```python
import caliby
import numpy as np

# Initialize system
caliby.set_buffer_config(size_gb=2.0)
caliby.open('/tmp/caliby_data')

index = caliby.HnswIndex(
    max_elements=1_000_000,
    dim=128,
    M=16,                  # Higher = better recall, more memory
    ef_construction=200,   # Higher = better graph quality, slower build
    enable_prefetch=True,  # Enable prefetching
    skip_recovery=False,
    index_id=0,            # Unique ID for multi-index support
    name='my_vectors',     # Optional human-readable name
)

# Add points
vectors = np.random.rand(100000, 128).astype(np.float32)
index.add_points(vectors, num_threads=4)

# Search with ef_search_param
query = np.random.rand(128).astype(np.float32)
labels, distances = index.search_knn(query, k=10, ef_search_param=100)
```
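The effect of `ef_search_param` is easy to measure: higher values explore more of the graph, improving recall at the cost of latency. A sketch that sweeps it against a brute-force ground truth, continuing the example above and assuming L2 distance is the metric and that labels follow insertion order:

```python
import time

# Brute-force top-10 for one query as ground truth (L2 distance assumed).
true_top10 = set(np.argsort(np.linalg.norm(vectors - query, axis=1))[:10].tolist())

for ef in (10, 50, 100, 200):
    t0 = time.perf_counter()
    labels, _ = index.search_knn(query, k=10, ef_search_param=ef)
    latency_ms = (time.perf_counter() - t0) * 1e3
    recall = len(true_top10 & {int(l) for l in labels}) / 10
    print(f"ef_search_param={ef:>3}: recall@10={recall:.2f}, latency={latency_ms:.2f} ms")
```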
**DiskANN Index.** Best for: filtered search, dynamic updates, very large graphs with tags/labels.

```python
import caliby
import numpy as np

# Initialize system
caliby.set_buffer_config(size_gb=2.0)
caliby.open('/tmp/caliby_data')

# Create DiskANN index
index = caliby.DiskANN(
    dimensions=128,
    max_elements=1_000_000,
    R_max_degree=64,   # Max graph degree (R)
    is_dynamic=True    # Enable dynamic inserts/deletes
)

# Build index with tags for filtering
vectors = np.random.rand(100000, 128).astype(np.float32)
tags = [[i % 100] for i in range(100000)]  # Tags for filtering
params = caliby.BuildParams()
params.L_build = 100   # Build-time search depth
params.alpha = 1.2     # Alpha parameter for Vamana
params.num_threads = 4
index.build(vectors, tags, params)

# Search with params
search_params = caliby.SearchParams(L_search=50)
search_params.beam_width = 4
query = np.random.rand(128).astype(np.float32)
labels, distances = index.search(query, K=10, params=search_params)

# Filtered search (only return vectors with specific tag)
labels, distances = index.search_with_filter(query, filter_label=42, K=10, params=search_params)

# Dynamic operations (if is_dynamic=True)
new_point = np.random.rand(128).astype(np.float32)
index.insert_point(new_point, tags=[99], external_id=100000)
index.lazy_delete(external_id=100000)
index.consolidate_deletes(params)
```
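Because the tags above were assigned as `i % 100`, a filtered query is easy to sanity-check; a small sketch, assuming the returned labels are the insertion-order external IDs used at build time:

```python
# All results of a filtered search should carry the requested tag.
labels, distances = index.search_with_filter(query, filter_label=42, K=10,
                                             params=search_params)
for label in labels:
    assert tags[int(label)] == [42], f"unexpected tag for id {int(label)}"
```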
**IVF+PQ Index.** Best for: very large datasets (10M+ vectors), memory-constrained environments.

```python
import caliby
import numpy as np

# Initialize system with buffer pool
caliby.set_buffer_config(size_gb=0.5)  # Small buffer for large datasets
caliby.open('/tmp/caliby_data')

index = caliby.IVFPQIndex(
    max_elements=10_000_000,
    dim=128,
    num_clusters=256,        # Number of IVF clusters (K)
    num_subquantizers=8,     # Number of PQ subquantizers (M); dim must be divisible by this
    retrain_interval=10000,  # Retrain centroids every N insertions
    skip_recovery=False,
    index_id=0,
    name='large_dataset'
)

# Train the index first (required for IVF+PQ)
training_data = np.random.rand(50000, 128).astype(np.float32)
index.train(training_data)

# Add points (after training)
vectors = np.random.rand(1000000, 128).astype(np.float32)
index.add_points(vectors, num_threads=4)

# Search with nprobe parameter
query = np.random.rand(128).astype(np.float32)
labels, distances = index.search_knn(query, k=10, nprobe=8)
```

**Persistence and Recovery.** Indexes are persisted automatically via the buffer pool and recovered when the same data directory is reopened:
```python
import caliby

# Indexes are automatically persisted via the buffer pool
caliby.set_buffer_config(size_gb=1.0)
caliby.open('/path/to/caliby_data')  # Data directory for persistent storage

# Create index (will be persisted automatically)
index = caliby.HnswIndex(
    max_elements=1_000_000,
    dim=128,
    M=16,
    ef_construction=200,
    enable_prefetch=True,
    skip_recovery=False,  # Set to False to enable recovery
    index_id=1,
    name='my_index'
)

# Manual flush to ensure all data is written
index.flush()

# Recovery happens automatically when reopening with the same directory
caliby.close()

# Later: reopen and recover
caliby.open('/path/to/caliby_data')
recovered_index = caliby.HnswIndex(
    max_elements=1_000_000,
    dim=128,
    M=16,
    ef_construction=200,
    enable_prefetch=True,
    skip_recovery=False,  # Will recover existing index
    index_id=1,           # Must match original
    name='my_index'
)
if recovered_index.was_recovered():
    print("Index successfully recovered from disk!")
```
Repository layout:

```
caliby/
├── include/caliby/        # C++ headers
│   ├── calico.hpp         # Core buffer pool system
│   ├── hnsw.hpp           # HNSW index
│   ├── ivfpq.hpp          # IVF+PQ index
│   ├── diskann.hpp        # DiskANN index (experimental)
│   ├── catalog.hpp        # Index catalog management
│   └── distance.hpp       # Distance functions
├── src/                   # C++ implementation
│   ├── bindings.cpp       # Python bindings
│   ├── hnsw.cpp
│   ├── ivfpq.cpp
│   └── calico.cpp
├── examples/              # Usage examples
├── benchmark/             # Performance benchmarks
├── tests/                 # Python tests
└── third_party/           # Dependencies
    └── pybind11/          # Python binding library (submodule)
```
Caliby requires the following system dependencies:
- C++17 compatible compiler (GCC 9+ or Clang 10+)
- CMake 3.15+
- OpenMP
- Abseil C++ library
- Python 3.8+
**Ubuntu/Debian:**

```bash
sudo apt-get update
sudo apt-get install -y build-essential cmake libomp-dev libabsl-dev python3-dev
```

**Fedora/RHEL:**

```bash
sudo dnf install -y gcc-c++ cmake libomp-devel abseil-cpp-devel python3-devel
```

To build from source:

```bash
git clone --recursive https://github.com/zxjcarrot/caliby.git
cd caliby
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)

# Install Python package
cd ..
pip install -e .
```

To run the tests:

```bash
# C++ tests
cd build && ctest --output-on-failure

# Python tests
pytest python/tests/
```

- Collection API Guide - High-level API for documents with vectors, text, and metadata
- Usage Guide - General usage patterns and examples
- Benchmarks - Performance comparisons and benchmarking tools
Unlike in-memory libraries that crash or grind to a halt when data exceeds RAM, Caliby uses a database-style buffer pool:
```
┌───────────────────────────────────────────────────────────────────┐
│                         Your Application                          │
├───────────────────────────────────────────────────────────────────┤
│                     Caliby (Embedded Library)                     │
│   ┌───────────────────────────────────────────────────────────┐   │
│   │                     Buffer Pool (RAM)                     │   │
│   │  ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐         │   │
│   │  │Hot  │ │Hot  │ │Warm │ │Warm │ │Cold │ │Cold │ ...     │   │
│   │  │Page │ │Page │ │Page │ │Page │ │Page │ │Page │         │   │
│   │  └─────┘ └─────┘ └─────┘ └─────┘ └─────┘ └─────┘         │   │
│   └───────────────────────────────────────────────────────────┘   │
│                      │         ▲                                  │
│              Evict   │         │  Parallel Fetch on               │
│              Cold    │         │  Access                          │
│                      ▼         │                                  │
│   ┌───────────────────────────────────────────────────────────┐   │
│   │                       Disk Storage                        │   │
│   │                        (SSD/NVMe)                         │   │
│   └───────────────────────────────────────────────────────────┘   │
└───────────────────────────────────────────────────────────────────┘
```
**Key Insight:** Most vector search workloads have locality; recently accessed vectors are likely to be accessed again. Caliby exploits this by keeping hot data in RAM and seamlessly paging cold data to disk.
| Data Size vs RAM | Caliby Behavior |
|---|---|
| Data < RAM | Full in-memory speed (matches HNSWLib) |
| Data ≈ RAM | Mostly in-memory, occasional disk reads |
| Data >> RAM | Working set in memory, graceful disk access |
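A rough way to size the buffer pool: raw float32 vectors occupy n × dim × 4 bytes, plus index overhead. The sketch below uses an assumed ~1.5× overhead factor for graph links and metadata (an assumption, not a measured constant); a smaller budget pushes the cold tail to disk, a larger one keeps everything hot:

```python
def suggested_buffer_gb(n_vectors: int, dim: int, overhead: float = 1.5) -> float:
    """Estimate RAM needed to keep an entire float32 index hot.

    `overhead` is an assumed multiplier for graph links and metadata.
    """
    raw_gb = n_vectors * dim * 4 / (1024 ** 3)
    return raw_gb * overhead

# 10M x 128-dim float32 vectors: ~4.8 GB raw, ~7.2 GB with assumed overhead.
print(f"{suggested_buffer_gb(10_000_000, 128):.1f} GB")

# With a smaller budget, Caliby pages the cold tail to disk instead of crashing:
# caliby.set_buffer_config(size_gb=4.0)
```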
We welcome contributions! Please see our Contributing Guide for details.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: xinjing@mit.edu