feat(llm): add API backend support with backend-aware session routing #204
Draft
marcinbogdanski wants to merge 20 commits into tobi:main
Conversation
Added a simple guard to protect existing embeddings in the database in case the user changes the embedding backend/provider. The problem: vectors produced by different embedding models/providers are not comparable, so mixing them in one index would silently degrade search results. The guard detects the mismatch and stops before new embeddings are written.
I investigated adding model/provider scoping to the relevant tables, but it would be too intrusive IMHO for this PR. I'm happy to look into this again (to provide proper multi-model support in a single index file) if such functionality is required. For now, if you want to switch models, you need to re-embed or use a separate index.
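For illustration, a minimal sketch of what such a guard can look like, assuming a Bun/SQLite setup; the meta table, the embedding_model key, and the function name are my assumptions, not taken from this PR:

```typescript
import { Database } from "bun:sqlite";

// Hypothetical guard: remember which embedding model built the index and
// refuse to mix in vectors produced by a different model.
function assertEmbeddingModelUnchanged(db: Database, currentModel: string): void {
  db.run("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)");
  const row = db
    .query("SELECT value FROM meta WHERE key = 'embedding_model'")
    .get() as { value: string } | null;
  if (row === null) {
    // First embedding run: record the model that produced the vectors.
    db.query("INSERT INTO meta (key, value) VALUES ('embedding_model', ?)")
      .run(currentModel);
    return;
  }
  if (row.value !== currentModel) {
    throw new Error(
      `Index was embedded with '${row.value}' but the configured model is ` +
        `'${currentModel}'. Re-embed the corpus or use a separate index file.`,
    );
  }
}
```

Called once before any write to the embeddings table, a check like this fails fast instead of letting incompatible vectors coexist in the same index.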
Commits (titles truncated as rendered by GitHub; a configuration sketch follows the list):
- …nv vars to QMD_{EMBED|CHAT|RERANK}_*
- …bility; keep rerank Cohere-only
- …ract/live provider tests
- … handling, and make QMD_LLM_BACKEND validation strict
- …s use configured API models
- …tDefaultLLM for embeddings
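Reading the commit titles together: backend selection hangs off QMD_LLM_BACKEND with strict validation, and per-capability settings are grouped under QMD_EMBED_* / QMD_CHAT_* / QMD_RERANK_*. A rough sketch of what that routing could look like; the _MODEL and _BASE_URL suffixes and the function names are assumptions, not taken from the diff:

```typescript
type LLMBackend = "local" | "api";

// Strict validation: unknown values are an error rather than a silent
// fallback; leaving QMD_LLM_BACKEND unset keeps the current local behavior.
function resolveBackend(
  env: Record<string, string | undefined> = process.env,
): LLMBackend {
  const raw = env.QMD_LLM_BACKEND;
  if (raw === undefined || raw === "") return "local";
  if (raw === "local" || raw === "api") return raw;
  throw new Error(`Invalid QMD_LLM_BACKEND='${raw}' (expected 'local' or 'api')`);
}

// Per-capability config under QMD_{EMBED|CHAT|RERANK}_*; the exact
// suffixes are illustrative, not confirmed by the PR.
function apiConfigFor(
  capability: "EMBED" | "CHAT" | "RERANK",
  env: Record<string, string | undefined> = process.env,
) {
  return {
    model: env[`QMD_${capability}_MODEL`],
    baseUrl: env[`QMD_${capability}_BASE_URL`],
  };
}
```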
Hello, looks like the repo is exploding!
Introduce an API backend for LLM support so qmd can be run without local models.
Draft PR for now; feedback welcome, happy to update to maintainer requirements.
Covered:
- /v1/embeddings endpoint (a client sketch follows this list)
- /v1/chat/completions
- /v1/rerank
- QMD_LLM_BACKEND="api" env var; current behavior unchanged if unset

Not covered:
- … via short hash and always use it instead of default
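For context, a hedged sketch of a client for the first of the three covered endpoints (the /v1/embeddings path and response shape follow the OpenAI convention; the QMD_EMBED_* variable names are assumptions carried over from the commit titles):

```typescript
// Hypothetical client for the covered /v1/embeddings endpoint.
// QMD_EMBED_BASE_URL / QMD_EMBED_API_KEY / QMD_EMBED_MODEL are assumed names.
async function embed(texts: string[]): Promise<number[][]> {
  const baseUrl = process.env.QMD_EMBED_BASE_URL ?? "https://api.openai.com";
  const res = await fetch(`${baseUrl}/v1/embeddings`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.QMD_EMBED_API_KEY}`,
    },
    body: JSON.stringify({ model: process.env.QMD_EMBED_MODEL, input: texts }),
  });
  if (!res.ok) {
    throw new Error(`/v1/embeddings failed: ${res.status} ${await res.text()}`);
  }
  const body = (await res.json()) as { data: { embedding: number[] }[] };
  return body.data.map((d) => d.embedding);
}
```

/v1/chat/completions would follow the same pattern with the OpenAI chat payload, while per the commit titles /v1/rerank stays Cohere-only and takes the Cohere-style query/documents payload instead.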
Related work / other PRs
Query Expansion (details): context and includeLexical are supported.
Error handling: … (options param) is ignored.
Hope this is helpful.
PS: I've got some free time and am happy to update to spec or do other work if you want to dump anything on me, now that I'm somewhat familiar with the repo.