Fix multi-collection filtering in search/query/vsearch #218

Open
1kuna wants to merge 1 commit into tobi:main from 1kuna:fix/multi-collection-filter

1kuna commented Feb 18, 2026

Problem

When multiple -c/--collection flags are passed to qmd search, query, or vsearch, results are often empty or incomplete.

Root cause: With a single -c, the collection filter is pushed into the SQL query (AND d.collection = ?). With multiple -c flags, qmd performs a global, unfiltered top-K search and then post-filters the results by qmd://{collection}/ prefix. Large collections dominate the global top-K, so the post-filter can discard every result from the smaller requested collections.
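
The failure mode can be reproduced without a database. The sketch below is a hypothetical in-memory stand-in (the `Doc` shape, scores, and function names are assumptions for illustration, not qmd code): one large collection holds all the top scores, so filtering after the top-K cut returns nothing, while filtering before it returns the expected documents.

```typescript
// Hypothetical in-memory stand-in for a ranked index; not qmd's data model.
interface Doc {
  collection: string;
  path: string;
  score: number; // higher is better
}

// 80 high-scoring docs in a large "noise" collection, plus one
// lower-scoring doc in each of two small collections.
function buildCorpus(): Doc[] {
  const docs: Doc[] = [];
  for (let i = 0; i < 80; i++) {
    docs.push({ collection: "noise", path: `noise/${i}.md`, score: 1 - i * 0.001 });
  }
  docs.push({ collection: "alpha", path: "alpha/a.md", score: 0.5 });
  docs.push({ collection: "beta", path: "beta/b.md", score: 0.4 });
  return docs;
}

function topK(docs: Doc[], k: number): Doc[] {
  return docs.slice().sort((a, b) => b.score - a.score).slice(0, k);
}

// Behaviour described in the root cause: global top-K first, filter afterwards.
function postFilter(docs: Doc[], wanted: string[], k: number): Doc[] {
  return topK(docs, k).filter((d) => wanted.indexOf(d.collection) !== -1);
}

// Behaviour after the fix: filter first, then take the top-K.
function pushDown(docs: Doc[], wanted: string[], k: number): Doc[] {
  return topK(docs.filter((d) => wanted.indexOf(d.collection) !== -1), k);
}

const corpus = buildCorpus();
const wanted = ["alpha", "beta"];
const postFiltered = postFilter(corpus, wanted, 10);
const pushedDown = pushDown(corpus, wanted, 10);

console.log(postFiltered.length); // 0
console.log(pushedDown.map((d) => d.path)); // [ 'alpha/a.md', 'beta/b.md' ]
```

Raising K only postpones the problem: any fixed K can be exhausted by a sufficiently large collection.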

Fix

Push the collection filter into the underlying retrieval for all three search paths instead of post-filtering a limited result set:

  • FTS search: WHERE ... AND d.collection IN (?, ...) via a shared SQL helper
  • Vector search: Precompute eligible hash_seq values for requested collections and apply AND hash_seq IN (...) directly in the sqlite-vec KNN query
  • Hybrid query: Same filter pushed through to both FTS and vector sub-queries
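
The vector-search step above can be illustrated with a brute-force, in-memory nearest-neighbour search. This is only a sketch of the idea: the row shapes, the `hash` join key, and the function names are assumptions, and the real change applies the restriction inside the sqlite-vec KNN query rather than in application code.

```typescript
// Hypothetical row shapes; qmd's real tables and columns may differ.
interface DocRow {
  collection: string;
  hash: string;
}
interface ChunkRow {
  hashSeq: string;
  hash: string;
  embedding: number[];
}

// Precompute the hash_seq values that belong to the requested collections.
function eligibleHashSeqs(docs: DocRow[], chunks: ChunkRow[], collections: string[]): string[] {
  const hashes: Record<string, true> = {};
  for (const d of docs) {
    if (collections.indexOf(d.collection) !== -1) hashes[d.hash] = true;
  }
  return chunks.filter((c) => hashes[c.hash] === true).map((c) => c.hashSeq);
}

function squaredDistance(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = (a[i] ?? 0) - (b[i] ?? 0);
    sum += diff * diff;
  }
  return sum;
}

// Brute-force KNN restricted to the eligible hash_seq values.
function knn(chunks: ChunkRow[], query: number[], k: number, eligible: string[]): string[] {
  return chunks
    .filter((c) => eligible.indexOf(c.hashSeq) !== -1)
    .map((c) => ({ hashSeq: c.hashSeq, dist: squaredDistance(c.embedding, query) }))
    .sort((x, y) => x.dist - y.dist)
    .slice(0, k)
    .map((r) => r.hashSeq);
}

const docRows: DocRow[] = [
  { collection: "noise", hash: "h1" },
  { collection: "noise", hash: "h2" },
  { collection: "alpha", hash: "h3" },
  { collection: "beta", hash: "h4" },
];
const chunkRows: ChunkRow[] = [
  { hashSeq: "h1_0", hash: "h1", embedding: [0, 0] },
  { hashSeq: "h2_0", hash: "h2", embedding: [0.1, 0] },
  { hashSeq: "h3_0", hash: "h3", embedding: [1, 1] },
  { hashSeq: "h4_0", hash: "h4", embedding: [2, 2] },
];

const eligible = eligibleHashSeqs(docRows, chunkRows, ["alpha", "beta"]);
const nearest = knn(chunkRows, [0, 0], 2, eligible);
console.log(eligible); // [ 'h3_0', 'h4_0' ]
console.log(nearest); // [ 'h3_0', 'h4_0' ]
```

Without the eligibility restriction, both of the two nearest neighbours here would come from the "noise" collection.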

Added normalizeCollectionFilter() and appendCollectionFilterSql() helpers in store.ts to handle single vs. multi-collection SQL generation consistently.
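
For readers without the diff at hand, here is a minimal sketch of what such helpers could look like. The helper names come from this PR, but the signatures, the trimming and de-duplication behaviour, and the returned shape are assumptions made for illustration; the actual code in store.ts may differ.

```typescript
type CollectionFilter = string | string[] | undefined;

// Assumed behaviour: accept a single name or a list, return a clean list.
function normalizeCollectionFilter(filter: CollectionFilter): string[] {
  if (filter === undefined) return [];
  const list = Array.isArray(filter) ? filter : [filter];
  // Trim, drop empty names, and de-duplicate while preserving order.
  return list
    .map((c) => c.trim())
    .filter((c, i, all) => c.length > 0 && all.indexOf(c) === i);
}

// Assumed behaviour: append "= ?" for one collection, "IN (?, ...)" for
// several, and nothing when no filter is given. Values are always bound
// as parameters, never interpolated into the SQL text.
function appendCollectionFilterSql(
  sql: string,
  params: string[],
  filter: CollectionFilter,
  column: string = "d.collection",
): { sql: string; params: string[] } {
  const collections = normalizeCollectionFilter(filter);
  if (collections.length === 0) {
    return { sql: sql, params: params.slice() };
  }
  if (collections.length === 1) {
    return { sql: `${sql} AND ${column} = ?`, params: params.concat(collections) };
  }
  const placeholders = collections.map(() => "?").join(", ");
  return {
    sql: `${sql} AND ${column} IN (${placeholders})`,
    params: params.concat(collections),
  };
}

const base = "SELECT d.path FROM documents d WHERE d.active = 1"; // illustrative query only
const single = appendCollectionFilterSql(base, [], "notes");
const multi = appendCollectionFilterSql(base, [], ["notes", "journal", "notes"]);

console.log(single.sql); // ... AND d.collection = ?
console.log(multi.sql); // ... AND d.collection IN (?, ?)
console.log(multi.params); // [ 'notes', 'journal' ]
```

Keeping the single-collection case as `= ?` preserves the query shape that already worked before the change, while the multi-collection case shares the same code path.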

Type changes

  • searchFTS and searchVec (both the Store methods and the exported functions) now accept string | string[] for the collection parameter
  • HybridQueryOptions.collection and VectorSearchOptions.collection updated to string | string[]

Tests

Added 4 test files/scenarios (TDD, written before the fix, confirmed failing, then passing):

  • test/cli.test.ts: end-to-end CLI regression with an 80-doc noisy collection vs 2 small target collections
  • test/store.test.ts: searchFTS with multi-collection array filter
  • test/query-collection-routing.test.ts: verifies hybridQuery/vectorSearchQuery pass array filters through
  • test/searchvec-multi-collection.test.ts: vector search under top-K domination with collection IN filter

The full test suite passes apart from 5 pre-existing LLM timeout failures, which are unrelated to this change.

1kuna force-pushed the fix/multi-collection-filter branch from 82b6597 to e237ae7 on February 18, 2026 21:51
Co-authored-by: Alyx <kunaclawd@gmail.com>
1kuna force-pushed the fix/multi-collection-filter branch from e237ae7 to 78587ec on February 18, 2026 21:52
2 participants