Add local exact-scan vector search support (runtime, API, codecs, CLI, docs, tests) #8

Merged
ashione merged 4 commits into main from codex/python
Mar 30, 2026

ashione commented Mar 30, 2026

Motivation

  • Implement a minimal local-first vector search path for fixed-dimension float vectors with exact-scan execution and common metrics (cosine/dot/l2).
  • Preserve vector semantics and precision across native transport/serialization and expose vector search from C++ and Python APIs and from CLI tools.

Description

  • Introduce DataType::FixedVector, extend Value to store/serialize fixed float vectors and add Value::parseFixedVector and asFixedVector accessors.
  • Add a runtime vector index abstraction (vector_index.h/.cc) and an ExactScanVectorIndex implementation providing search(...) and explain(...) backed by contiguous float buffers and heap top-k selection.
  • Add DataFrame APIs vectorQuery(...) and explainVectorQuery(...) and corresponding DataflowSession methods to expose vector search at the session level.
  • Extend serializers and codecs: proto-like serializer and binary row-batch codec now encode FixedVector as raw float bit payloads and round-trip vector values; Arrow fast-path supports FixedSizeList<float32> columns.
  • Add Python bindings for vector search (Session.vector_search, Session.explain_vector_search), Arrow fast-path parsing, and round-trip conversion of fixed vectors to Python lists.
  • Provide CLIs: a Python-based single-file CLI (python_api/velaria_cli.py) and a native C++ velaria_cli example, plus scripts/build_py_cli_executable.sh to package a one-file executable.
  • Add documentation (docs/local_vector_search_v01.md, README updates) describing scope, API surface, explain output, and usage examples; add example binary vector_search_benchmark and native velaria_cli example.
  • Wire up BUILD files to include new sources, examples, and tests (vector_runtime_test, vector_search_benchmark, Python vector_search_test) and export the build script.
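
The exact-scan path described above (contiguous float buffer, cosine/dot/l2 metrics, heap top-k selection) can be sketched roughly as follows. This is a minimal illustration, not the ExactScanVectorIndex implementation from this PR: the function name `exact_scan_top_k`, its parameters, and the tie-breaking rule are assumptions made for the example.

```python
import heapq
import math
from array import array


def exact_scan_top_k(vectors, dim, query, k, metric="cosine"):
    """Hypothetical helper: brute-force top-k over a flat float32 buffer.

    vectors: array('f') holding n * dim values, row after row.
    Returns [(row_id, value)], best first. value is a similarity for
    cosine/dot and a distance for l2.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if len(query) != dim:
        raise ValueError(f"dimension mismatch: query has {len(query)}, index has {dim}")
    if len(vectors) % dim != 0:
        raise ValueError("buffer length is not a multiple of dim")

    q_norm = math.sqrt(sum(x * x for x in query))
    heap = []  # min-heap of (score, -row_id); higher score means a better match

    for row in range(len(vectors) // dim):
        v = vectors[row * dim:(row + 1) * dim]
        if metric == "dot":
            score = sum(a * b for a, b in zip(v, query))
        elif metric == "cosine":
            dot = sum(a * b for a, b in zip(v, query))
            denom = math.sqrt(sum(a * a for a in v)) * q_norm
            score = dot / denom if denom else 0.0
        elif metric == "l2":
            # Negate the distance so that "larger is better" holds for every metric.
            score = -math.sqrt(sum((a - b) ** 2 for a, b in zip(v, query)))
        else:
            raise ValueError(f"unknown metric: {metric}")

        entry = (score, -row)  # -row prefers the lower row id on equal scores
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            heapq.heapreplace(heap, entry)

    ranked = sorted(heap, reverse=True)
    return [(-neg_row, -s if metric == "l2" else s) for s, neg_row in ranked]


if __name__ == "__main__":
    rows = array("f", [1.0, 0.0, 0.0, 1.0, 1.0, 1.0])  # three 2-d vectors
    print(exact_scan_top_k(rows, 2, [1.0, 0.0], 2, "l2"))
```

The heap holds at most k entries, so the scan costs O(n · dim) for scoring plus O(n log k) for selection, which is the usual trade-off for an exact (non-approximate) search.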

Testing

  • Ran the new C++ runtime test, bazel test //:vector_runtime_test, which exercises proto-like and binary round-trips plus C++ vector queries and explain output; it passed.
  • Ran the new Python unit test, bazel test //python_api:vector_search_test, which validates Python API behavior, metrics, explain text, and dimension-mismatch handling; it passed.
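
To illustrate what a binary round-trip check on a raw float payload involves, here is a small sketch. The wire layout used here (a little-endian uint32 dimension followed by IEEE-754 float32 values) and the names `encode_fixed_vector` / `decode_fixed_vector` are assumptions for the example; the actual format of the codecs in this PR is not shown on this page.

```python
import struct


def encode_fixed_vector(values):
    """Hypothetical encoder: uint32 dimension, then raw little-endian float32 bits."""
    return struct.pack(f"<I{len(values)}f", len(values), *values)


def decode_fixed_vector(payload, expected_dim=None):
    """Hypothetical decoder that rejects truncated payloads and dimension mismatches."""
    if len(payload) < 4:
        raise ValueError("payload too short for a dimension header")
    (dim,) = struct.unpack_from("<I", payload, 0)
    if len(payload) != 4 + 4 * dim:
        raise ValueError("payload length does not match the encoded dimension")
    if expected_dim is not None and dim != expected_dim:
        raise ValueError(f"dimension mismatch: got {dim}, expected {expected_dim}")
    return list(struct.unpack_from(f"<{dim}f", payload, 4))


if __name__ == "__main__":
    payload = encode_fixed_vector([0.5, -1.25, 3.0])
    # These values are exactly representable in float32, so they survive unchanged.
    assert decode_fixed_vector(payload, expected_dim=3) == [0.5, -1.25, 3.0]
    # Values such as 0.1 are rounded to float32 once; re-encoding the decoded
    # vector then reproduces the same bytes.
    once = encode_fixed_vector([0.1, 0.2])
    assert encode_fixed_vector(decode_fixed_vector(once)) == once
```

Comparing encoded bytes rather than decoded floats is one way to assert that precision is preserved without depending on float equality for values that float32 cannot represent exactly.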

Codex Task

ashione marked this pull request as ready for review March 30, 2026 06:31
ashione merged commit 8c37acc into main Mar 30, 2026
3 checks passed
ashione deleted the codex/python branch March 30, 2026 06:32