CSV auto-detects delimiter (comma, tab, pipe) and infers column types from the data. JSON supports both `[{...}, {...}]` arrays and NDJSON (`{...}\n{...}`) based on the first non-whitespace byte.
+
 These materialize all data in memory. Use Parquet or Lance for large datasets.
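
The detection rules described above can be sketched roughly as follows. This is a minimal illustration, not QueryMode's implementation: the helper names `parse_json_or_ndjson` and `sniff_delimiter` are hypothetical, the sketch inspects the first non-whitespace character of a decoded string rather than a raw byte, and the delimiter heuristic (pick the candidate that occurs most often in the header line) is an assumption.

```python
import json


def parse_json_or_ndjson(text: str) -> list:
    """Hypothetical helper: choose array vs NDJSON parsing from the
    first non-whitespace character of the input."""
    stripped = text.lstrip()
    if not stripped:
        return []
    if stripped[0] == "[":
        # A leading "[" means the whole payload is one JSON array.
        return json.loads(stripped)
    # Otherwise assume NDJSON: one JSON value per non-empty line.
    return [json.loads(line) for line in stripped.splitlines() if line.strip()]


def sniff_delimiter(header_line: str, candidates=(",", "\t", "|")) -> str:
    """Hypothetical heuristic: return the candidate delimiter that
    appears most often in the header line (ties go to the earliest
    candidate, so a comma wins by default)."""
    return max(candidates, key=header_line.count)
```

Both helpers read the entire input into memory, which matches the note that these formats are materialized in full.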
docs/src/content/docs/vector-search.mdx
+7 -6 (7 additions & 6 deletions)
@@ -1,9 +1,9 @@
 ---
 title: Vector Search
-description: Similarity search with SIMD acceleration, IVF-PQ indexes, and text-to-vector encoding.
+description: Similarity search with SIMD acceleration, HNSW and IVF-PQ indexes, and text-to-vector encoding.
 ---
 
-QueryMode supports vector similarity search on embedding columns stored in Lance format. Searches use WASM SIMD for acceleration and IVF-PQ indexes when available.
+QueryMode supports vector similarity search on embedding columns stored in Lance format. Searches use WASM SIMD for acceleration and HNSW or IVF-PQ indexes when available.
 
 ## DataFrame API
 
@@ -60,14 +60,14 @@ The `NEAR` operator performs vector similarity search. `TOPK` limits results to
 
 Without an index, QueryMode performs brute-force SIMD-accelerated distance computation across all vectors. Fast for datasets under ~100K vectors.
 
-### IVF-PQ
+### IVF-PQ (load pre-built)
 
-For larger datasets, create an IVF-PQ (Inverted File with Product Quantization) index:
+IVF-PQ (Inverted File with Product Quantization) indexes can be loaded from R2 for search:
 
 - **IVF** partitions vectors into clusters. At query time, only `nprobe` clusters are searched.
 - **PQ** compresses vectors into compact codes, reducing memory and I/O.
 
-IVF-PQ indexes are stored alongside data in R2 and loaded on first query.
+IVF-PQ indexes must be built externally (e.g. with LanceDB or FAISS) and stored in R2 alongside the data. QueryMode loads and searches them via the WASM engine. For indexes you can build directly in QueryMode, use HNSW.
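
To make the difference between the brute-force scan and the IVF half of IVF-PQ concrete, here is a small pure-Python sketch. It is an illustration under stated assumptions, not QueryMode's WASM engine: the function names are hypothetical, centroids are supplied by the caller rather than trained, squared Euclidean distance stands in for whatever metric the index uses, and the PQ compression step is left out entirely.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(query, vectors, k):
    """Baseline: score every vector and keep the k closest indices."""
    return sorted(range(len(vectors)), key=lambda i: sq_dist(query, vectors[i]))[:k]


def build_ivf(vectors, centroids):
    """Assign each vector index to the inverted list of its nearest centroid."""
    lists = defaultdict(list)
    for idx, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_dist(vec, centroids[c]))
        lists[nearest].append(idx)
    return lists


def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))[:nprobe]
    candidates = [i for c in probe for i in lists.get(c, [])]
    return sorted(candidates, key=lambda i: sq_dist(query, vectors[i]))[:k]
```

With a small `nprobe` the search touches fewer vectors but can miss neighbors that fall in unprobed clusters; setting `nprobe` to the number of clusters scans every vector and so finds the same neighbors as the brute-force scan.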