feat(llm): support QMD_GEN_MODEL env var for query expansion model override #226
Open
OmerFarukOruc wants to merge 1 commit into tobi:main
Summary

Add environment variable support (`QMD_GEN_MODEL`) to override the query expansion model at runtime without code changes. This is a one-line change in the `LlamaCpp` constructor (`src/llm.ts`).

Motivation
v1.0.7 added LiquidAI LFM2-1.2B as an alternative base model for query expansion fine-tuning, and exports `LFM2_GENERATE_MODEL` / `LFM2_INSTRUCT_MODEL` constants from `src/llm.ts`. The `finetune/` directory provides a complete SFT pipeline with `configs/sft_lfm2.yaml` and `jobs/sft_lfm2.py`.

However, there is currently no way for users to actually use a different model: `generateModelUri` only reads from `LlamaCppConfig` or falls back to the hardcoded `DEFAULT_GENERATE_MODEL` (Qwen3-1.7B). Users who fine-tune their own model cannot point qmd at the result without modifying source code.

Precedence
`QMD_GEN_MODEL` env var → `config.generateModel` → `DEFAULT_GENERATE_MODEL`

The env var is checked first so users can override without touching config, while a programmatic `config.generateModel` still works as a fallback. When `QMD_GEN_MODEL` is not set, behavior is identical to before this change.

Usage
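A minimal sketch of the resolution order, assuming the constructor reads `process.env` directly. The default value and the model URIs below are placeholders for illustration, not values taken from this repo:

```typescript
// Hedged sketch of the QMD_GEN_MODEL precedence chain described above.
// DEFAULT_GENERATE_MODEL's value and the URI format are assumptions.
const DEFAULT_GENERATE_MODEL = "hf:Qwen/Qwen3-1.7B-GGUF:Q8_0"; // placeholder value

interface LlamaCppConfig {
  generateModel?: string;
}

function resolveGenerateModelUri(config: LlamaCppConfig): string {
  // Env var wins, then programmatic config, then the hardcoded default.
  return process.env.QMD_GEN_MODEL ?? config.generateModel ?? DEFAULT_GENERATE_MODEL;
}

// Exporting QMD_GEN_MODEL before launching qmd swaps the model with no code change.
process.env.QMD_GEN_MODEL = "hf:example-user/my-finetuned-lfm2:Q8_0"; // hypothetical URI
console.log(resolveGenerateModelUri({ generateModel: "hf:some/other-model" }));
```

When `QMD_GEN_MODEL` is unset, the same chain falls through to `config.generateModel` and then the default, matching the precedence listed above.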
Fine-tuned Models (free to use)
I fine-tuned LFM2-1.2B using this repo's `finetune/` pipeline and published the results (Apache 2.0): trained on `tobil/qmd-query-expansion-train` (5,157 examples), 5 epochs of SFT with LoRA rank 16.

Testing
Verified locally by patching `~/.bun/install/global/node_modules/qmd/src/llm.ts`:

- `QMD_GEN_MODEL` set to a HuggingFace GGUF URI → downloads and uses the specified model ✅
- Model cached under `~/.cache/qmd/models/` with the correct filename (`hf_OrcsRise_qmd-query-expansion-lfm2-q8_0.gguf`) ✅

Before (base LFM2, no fine-tuning): 33 repetitive queries. After (fine-tuned LFM2 via `QMD_GEN_MODEL`): 5 focused queries.
Context
- `LFM2_GENERATE_MODEL` and `LFM2_INSTRUCT_MODEL` constants already exported from `src/llm.ts`
- `finetune/configs/sft_lfm2.yaml` and `finetune/jobs/sft_lfm2.py` provide the training pipeline
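The exported constants compose naturally with the new env var; a hedged sketch of how a wrapper script could route the LFM2 model through the override (the constant's value here is a placeholder, not the repo's actual URI):

```typescript
// Hypothetical value standing in for the LFM2_GENERATE_MODEL constant
// exported from src/llm.ts.
const LFM2_GENERATE_MODEL = "hf:LiquidAI/LFM2-1.2B-GGUF:Q8_0"; // placeholder

// Set the env var only if the user has not already chosen a model,
// so an explicit QMD_GEN_MODEL from the environment still wins.
process.env.QMD_GEN_MODEL ??= LFM2_GENERATE_MODEL;
console.log(process.env.QMD_GEN_MODEL);
```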