feat: Apple Silicon (MLX) support for query expansion fine-tuning #77
Open
Conversation
force-pushed a5ab100 to 4c76472
Owner
that is pretty damn cool
Contributor (Author)
Thank you 😊!! Just re-verified the training after syncing reward.py with upstream: 3000 iters, ~15 min on M3 Max, final loss 0.13. On the open questions: I matched your config roughly (Qwen3-1.7B, 3000 iters, 2e-4 LR, LoRA rank 16, and matching batch size) and kept finetune-mlx/ alongside finetune/. Happy to publish the GGUF weights if useful. I'm also happy to keep this as a separate repo, or to maintain it here if needed.
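For anyone curious what "LoRA rank 16" amounts to on the MLX side, here is a minimal sketch of a LoRA-wrapped linear layer. This illustrates the technique only; it is not the adapter code mlx_lm actually injects, and the class name, init scales, and alpha are assumptions.

```python
import mlx.core as mx
import mlx.nn as nn


class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear with a trainable low-rank update:
    y = W x + (alpha / r) * (x A) B, where A is (in, r) and B is (r, out)."""

    def __init__(self, base: nn.Linear, rank: int = 16, alpha: float = 16.0):
        super().__init__()
        out_dims, in_dims = base.weight.shape
        base.freeze()                      # only the low-rank factors are trained
        self.base = base
        self.scale = alpha / rank
        # A gets a small random init; B starts at zero so training begins
        # exactly at the base model's behavior.
        self.lora_a = mx.random.normal((in_dims, rank)) * 0.01
        self.lora_b = mx.zeros((rank, out_dims))

    def __call__(self, x: mx.array) -> mx.array:
        return self.base(x) + self.scale * ((x @ self.lora_a) @ self.lora_b)
```

With rank 16, each wrapped projection adds only 16 * (in_dims + out_dims) trainable parameters, a tiny fraction of the full weight matrix, which is what keeps a 1.7B-parameter run in the ~15-minute range on an M3 Max.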
force-pushed 4c76472 to dfbe35b
Port of the query expansion fine-tuning pipeline to Apple Silicon using MLX.

- SFT training with LoRA on Qwen3-1.7B
- GRPO training (reinforcement learning refinement)
- Full GGUF export pipeline (MLX -> GGUF -> Ollama)
- Evaluation harness with reward scoring
- 100% local - no cloud GPU needed, works on M1/M2/M3/M4

Includes: finetune-mlx/ directory with training scripts, configs, evaluation tools, and scripts/mlx_expand.py standalone sidecar.

Runtime integration (src/llm.ts sidecar) omitted - upstream removed the MLX sidecar code in the v1.0.8 refactor. The training pipeline is independent.

Co-authored-by: sujito00 <sujito00@users.noreply.github.com>
Co-authored-by: David Gil <dgilperez@gmail.com>
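For context on the "MLX -> GGUF -> Ollama" step, a rough sketch of what that export chain typically looks like: mlx_lm.fuse merges the LoRA adapters into the base weights, llama.cpp's converter produces the GGUF, and ollama create registers it. The paths, model ids, and output names below are illustrative assumptions, not the repo's actual scripts.

```python
import subprocess
from pathlib import Path

BASE_MODEL = "Qwen/Qwen3-1.7B"            # assumed base model id
ADAPTERS = Path("adapters")               # LoRA adapters from training (illustrative path)
FUSED = Path("models/fused")              # fused full-precision model
GGUF = Path("exports/query-expand.gguf")  # final GGUF file (illustrative name)

# 1. Fuse the LoRA adapters back into the base weights with mlx_lm.
subprocess.run(
    ["python", "-m", "mlx_lm.fuse",
     "--model", BASE_MODEL,
     "--adapter-path", str(ADAPTERS),
     "--save-path", str(FUSED)],
    check=True,
)

# 2. Convert the fused Hugging Face-format model to GGUF via llama.cpp's
#    converter (the script path depends on your llama.cpp checkout).
subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", str(FUSED),
     "--outfile", str(GGUF), "--outtype", "f16"],
    check=True,
)

# 3. Register the GGUF with Ollama through a minimal Modelfile.
Path("Modelfile").write_text(f"FROM ./{GGUF}\n")
subprocess.run(["ollama", "create", "query-expand-mlx", "-f", "Modelfile"], check=True)
```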
force-pushed bf6e46b to d314c5e
Summary
Port of the query expansion fine-tuning pipeline to Apple Silicon using MLX.
Observation
When testing, we noticed the current published model outputs placeholder text for hyde (see #75):
Our retrained model (same dataset, same base model) produces actual contextual content:
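To reproduce the comparison locally, the fused model can be queried directly through mlx_lm's Python API. A minimal sketch, where the model path, prompt format, and generation settings are illustrative assumptions rather than the project's actual harness:

```python
from mlx_lm import load, generate

# Illustrative path to the fused (adapter-merged) model directory.
model, tokenizer = load("models/fused")

query = "how do I rotate API keys without downtime?"
prompt = (
    "Write a short hypothetical document (HyDE) that would answer the query.\n"
    f"Query: {query}\n"
    "Document:"
)

# generate() returns the decoded completion as a string.
expansion = generate(model, tokenizer, prompt=prompt, max_tokens=200)
print(expansion)
```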
Questions before proceeding
We wanted to raise these questions rather than assume:
Should finetune-mlx/ live alongside finetune/ in this repo, or would you prefer a separate repository?
What's included
Training artifacts (adapters/, models/, exports/) are gitignored.
Contributors