-
Notifications
You must be signed in to change notification settings - Fork 5
Indexing
Zeppelin supports multiple vector index types, each offering different tradeoffs between accuracy, memory, and speed.
| Type | Config Value | Compression | Description |
|---|---|---|---|
| IVF-Flat | ivf_flat |
1x (none) | Full-precision IVF with k-means partitioning |
| IVF-SQ8 | ivf_sq |
4x | IVF + scalar quantization (8-bit codes) |
| IVF-PQ | ivf_pq |
16-32x | IVF + product quantization |
| Hierarchical | hierarchical |
varies | Multi-level centroid tree with beam search |
The default index type. Vectors are partitioned into clusters using k-means, and each cluster's vectors are stored in a separate S3 object.
-
Training: Run k-means++ on all vectors to find
Ncentroids (default: 256) - Assignment: Each vector is assigned to its nearest centroid
- Storage: Each cluster is serialized as a binary blob with vector IDs, values, and attribute data
-
Search: Find the
nprobenearest centroids to the query, then scan only those clusters
| Parameter | Default | Description |
|---|---|---|
default_num_centroids |
256 | Number of k-means clusters |
kmeans_max_iterations |
25 | Max k-means iterations |
kmeans_convergence_epsilon |
1e-4 | Convergence threshold |
| Parameter | Default | Description |
|---|---|---|
default_nprobe |
16 | Clusters to probe per query |
max_nprobe |
128 | Maximum allowed nprobe |
| Metric | Value | Formula |
|---|---|---|
| Cosine | cosine |
1 - cos(a, b) (0 = identical) |
| Euclidean | euclidean |
‖a - b‖² (0 = identical) |
| Dot Product | dot_product |
-a·b (more negative = more similar) |
SQ8 compresses each f32 dimension to a uint8, achieving 4x compression with minimal accuracy loss.
- Calibration: Compute per-dimension min/max across all vectors
-
Encoding: Map each f32 value to [0, 255]:
code = round((value - min) / (max - min) × 255) -
Decoding: Reverse:
value = min + code × (max - min) / 255 - Search: Asymmetric distance computation — query stays in f32, codes in uint8
segments/<segment_id>/sq_calibration.bin # [dim: u32][min_0: f32][max_0: f32]...
segments/<segment_id>/sq_cluster_<N>.bin # [n: u32][dim: u32] then per vector: [id_len: u32][id][codes: u8×dim]
-
Coarse phase: Compute asymmetric distances using SQ8 codes, collect top
k × rerank_factorcandidates - Rerank phase: Load full-precision vectors for candidates, compute exact distances, return top-k
PQ achieves 16-32x compression by splitting vectors into subspaces and encoding each subspace with a codebook.
-
Training: Split vectors into
Msubspaces of dimensionD/M. Run k-means on each subspace to learn 256 centroids (codebook) -
Encoding: Each vector becomes
Mbytes — one codebook index per subspace -
ADC (Asymmetric Distance Computation): Precompute a lookup table
[M][256]of distances from the query to each codebook centroid. Distance to any encoded vector isO(M)table lookups
| Parameter | Default | Description |
|---|---|---|
pq_m |
8 | Number of subspaces |
rerank_factor |
4 | Candidates = top_k × rerank_factor |
PQ requires dimension % pq_m == 0.
segments/<segment_id>/pq_codebook.bin # [M: u32][K: u32][sub_dim: u32][centroids: f32×M×K×sub_dim]
segments/<segment_id>/pq_cluster_<N>.bin # [n: u32][M: u32] then per vector: [id_len: u32][id][codes: u8×M]
A multi-level centroid tree that enables sub-linear search for large datasets.
- Build: Recursively apply k-means. The root level partitions all vectors. Each non-leaf cluster is further partitioned until clusters are small enough (leaf_size)
-
Search: Beam search from root to leaves. At each level, evaluate the
beam_widthmost promising centroids and descend into their children - Mixed-depth trees: Not all branches have the same depth. Some clusters may be leaves at level 1 while others descend to level 3+
| Parameter | Default | Description |
|---|---|---|
hierarchical |
false | Enable hierarchical index |
beam_width |
10 | Candidates maintained at each level |
leaf_size |
— | Target vectors per leaf cluster |
segments/<segment_id>/tree.json # Tree structure (node → children mapping)
segments/<segment_id>/centroids.bin # All centroids across all levels
segments/<segment_id>/cluster_<N>.bin # Leaf cluster data
When enabled, cluster data is additionally stored in half-precision (f16) format, achieving 50% compression over f32.
segments/<segment_id>/f16_cluster_<N>.bin # Same layout as cluster, but f16 values
RoaringBitmap indexes built per attribute field during compaction. Used to efficiently skip vectors that don't match filter criteria before distance computation.
Each bitmap file uses the ZBMP magic bytes:
segments/<segment_id>/bitmap_<field>.bin
Contains:
- Present bitmap: Which vectors have this field at all
- Value bitmaps: Per distinct value, which vectors have that value
- Evaluate the filter expression against bitmap indexes
- Produce a candidate
RoaringBitmapof matching vector positions - During cluster scanning, skip any vector not in the candidate set
- Falls back to post-filtering for filters that can't use bitmaps
Built during compaction for FTS-enabled fields. See Full-Text Search for details.
segments/<segment_id>/fts_<field>.bin # ZFTS magic, posting lists per term
Contains:
- Corpus stats (doc count, average doc length)
- Per-term posting lists (doc positions + term frequencies)
Getting Started
API & SDKs
Configuration
Architecture
Operations