
feat(llm): add API backend support with backend-aware session routing#204

Draft
marcinbogdanski wants to merge 20 commits into tobi:main from marcinbogdanski:feat/api-llm-provider-backends

Conversation


@marcinbogdanski marcinbogdanski commented Feb 17, 2026

Hello, looks like the repo is exploding!

This PR introduces an API backend for LLM support so qmd can run without local models.

Draft PR for now, feedback welcome, happy to update to maintainer requirements.

Covered:

  • API backend covering:
    • embeddings via /v1/embeddings endpoint
    • query expansion via /v1/chat/completions
    • rerank via /v1/rerank
  • backend selection via the QMD_LLM_BACKEND="api" env var; current behavior is unchanged if unset
  • search/vsearch/query session paths updated to be backend-aware (avoids Llama init/download)
  • chunking: API mode skips tokenization and falls back to character-based chunking
  • tests for the new API behavior (skipped unless provider API keys are set)
  • some backward-compatibility tests
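
The backend switch above can be sketched roughly as follows. Identifiers here are my illustration, not necessarily the PR's actual names; a later commit in this PR makes the env-var validation strict, which is what the sketch assumes:

```typescript
// Illustrative sketch only: strict selection of the LLM backend from the
// environment. "local" as the default backend name is an assumption; the PR
// only guarantees behavior is unchanged when QMD_LLM_BACKEND is unset.
type Backend = "local" | "api";

function resolveBackend(env: Record<string, string | undefined>): Backend {
  const raw = env["QMD_LLM_BACKEND"];
  if (raw === undefined || raw === "") return "local"; // unset: keep current behavior
  if (raw === "api" || raw === "local") return raw;
  throw new Error(`invalid QMD_LLM_BACKEND "${raw}" (expected "api" or "local")`);
}
```

With strict validation, a typo like `QMD_LLM_BACKEND=apii` fails fast instead of silently falling back to local models.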

Not covered:

  • per-backend/model embedding isolation in the index is not supported; this is likely required to allow switching API providers
  • I have a separate fix in the works on my end to derive the index name from the model provider/name via a short hash, and to always use it instead of the default

Related work / other PRs

Query Expansion - details

  • currently it simply prompts the completions endpoint to return lex/vec/hyde lines
  • it does not use constrained generation
  • if the result parser finds no parseable lines, it returns an empty array
  • the context and includeLexical params are supported
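
A minimal sketch of such a line parser (the exact line format and names are my assumption; the PR only states that lex/vec/hyde lines are extracted and that an empty array is returned when nothing parses):

```typescript
// Illustrative parser: scan completion text for "lex:", "vec:" or "hyde:"
// prefixed lines. The real qmd output format may differ; this only mirrors
// the described behavior of returning [] when nothing parseable is found.
interface Expansion {
  kind: "lex" | "vec" | "hyde";
  text: string;
}

function parseExpansions(completion: string): Expansion[] {
  const out: Expansion[] = [];
  for (const line of completion.split("\n")) {
    const m = line.match(/^(lex|vec|hyde):\s*(.+)$/i);
    if (m) {
      out.push({ kind: m[1].toLowerCase() as Expansion["kind"], text: m[2].trim() });
    }
  }
  return out; // no parseable lines -> empty array, matching the PR's contract
}
```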

Error handling

  • Query expansion
    • missing API key -> throws
    • network/http/request failure -> throws
    • invalid result shape -> throws
    • valid result but no parseable lex|vec|hyde lines -> returns empty array
  • Embeddings
    • model override (options param) is ignored
    • missing API key -> throws
    • network/http/request failure -> returns null (consistent with the local path)
  • Rerank
    • model override (options param) is ignored
    • missing API key -> throws
    • network/http/request failure -> throws

Hope this is helpful.

PS. I've got some free time and am happy to update to spec or take on other work if you want to dump anything on me, now that I'm somewhat familiar with the repo.

@marcinbogdanski (Author)

Added a simple guard to protect existing embeddings in the database in case the user changes the embedding backend/provider.

Problem:

  • the user starts using qmd via the API with model A -> the db is populated with embeddings from model A
  • the user changes the API embedding model to model B (or switches back to a local model); old model A embeddings still exist in the db
  • new embeddings now come from model B, but the database still holds old, incompatible embeddings from model A

How new guard works:

  • if qmd is used only with built-in local models (API never enabled) -> nothing changes; I didn't touch that path
  • at any point, if embeddings are generated with an API model, that information is written to a small new table in the database
  • on all future embed/vsearch/query commands, qmd first checks whether the current model/provider setting matches what's saved in the database. If not, an error message tells the user to either:
    • A) change env var config to match original embeddings and re-run the command
    • B) use different index with --index parameter
    • C) manually run qmd embed -f to force clear old embeddings and rebuild with new model
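
The guard can be sketched as a simple compatibility check (table and field names here are hypothetical, not the PR's actual schema):

```typescript
// Hypothetical sketch of the provenance guard: before embed/vsearch/query,
// compare the backend/model recorded in the index against the current config.
interface EmbeddingMeta {
  backend: string; // e.g. "api" or "local" (illustrative values)
  model: string;
}

function checkEmbeddingCompat(saved: EmbeddingMeta | null, current: EmbeddingMeta): void {
  if (saved === null) return; // nothing recorded yet; the first API embed records it
  if (saved.backend !== current.backend || saved.model !== current.model) {
    throw new Error(
      `index embedded with ${saved.backend}/${saved.model}, current config is ` +
      `${current.backend}/${current.model}: restore the original config, pass ` +
      `--index for a separate index, or run \`qmd embed -f\` to rebuild`
    );
  }
}
```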

I investigated adding model/provider scoping to the relevant tables, but it would be too intrusive IMHO for this PR. I'm happy to look into this again (to provide proper multi-model support in single index file), if such functionality is required.

For now if you want to switch model, you need to re-embed or use separate index.

marcinbogdanski force-pushed the feat/api-llm-provider-backends branch from 97929dc to 05203b3 on February 19, 2026 12:02
marcinbogdanski and others added 20 commits February 19, 2026 14:33
… handling, and make QMD_LLM_BACKEND validation strict
marcinbogdanski force-pushed the feat/api-llm-provider-backends branch from 05203b3 to 59aa0c2 on February 19, 2026 13:33