Add QLoRA training script for GSM8K fine-tuning #11
Open
noor05-creator wants to merge 3 commits into AshChadha-iitg:main from
Conversation
- Implement complete training pipeline with 4-bit quantization
- Support QLoRA fine-tuning on the Qwen2.5-Math-1.5B model
- Add 13 configurable CLI arguments for flexibility
- Match adapter_config.json hyperparameters (rank=16, alpha=32)
- Include loss masking to train only on answer portions
- Optimize memory usage for the free Colab T4 GPU (~11GB VRAM)
- Use BitsAndBytesConfig for NF4 quantization
- Implement gradient checkpointing and a paged optimizer
- Enable reproducibility of the 41% GSM8K accuracy results
- Add a Training section with setup and usage guide
- Document all CLI arguments in table format
- Include installation instructions for dependencies
- Provide example commands for different training scenarios
- Specify hardware requirements (12GB+ VRAM, T4 tested)
- Add training time estimates for different configurations
- Enable users to reproduce and extend the original results
Owner
@noor05-creator Thanks for the detailed contribution. Before merging, I need to ensure reproducibility with the original OpenMath results (41% on 100-question GSM8K subset). Right now, the training pipeline differs from the original in a few key ways:
Could you please:

- Switch the prompt format to Instruction / Problem / Solution
- Mask the loss for all tokens before "### Solution:" to match the original training
- Save only the LoRA adapters (adapter_model + adapter_config) for repo compatibility

Once these are aligned, I'm happy to merge. Thanks again for your work!
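The requested loss masking can be sketched as below. This is a minimal illustration, not the PR's actual code: -100 is the index PyTorch's cross-entropy loss ignores, and the marker-finding step is simplified (a real implementation would locate "### Solution:" in the tokenized sequence, e.g. via the length of the tokenized prompt prefix).

```python
# Answer-only loss masking: labels before the "### Solution:" marker are
# set to -100, which cross-entropy ignores, so the loss covers only the
# solution tokens.
IGNORE_INDEX = -100

def mask_prompt_labels(input_ids, solution_start_idx):
    """Copy input_ids into labels, masking everything before the solution."""
    labels = list(input_ids)
    for i in range(min(solution_start_idx, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Example: a 10-token sequence whose solution begins at position 6.
labels = mask_prompt_labels(list(range(10)), 6)
# labels -> [-100, -100, -100, -100, -100, -100, 6, 7, 8, 9]
```

In practice `solution_start_idx` would come from the tokenizer, roughly `len(tokenizer(prompt_prefix).input_ids)` for the text up to and including "### Solution:".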
Author
@AshChadha-iitg I have made the requested changes. Please review.
Summary
This PR adds a complete training script for fine-tuning Qwen2.5-Math-1.5B on GSM8K using QLoRA (4-bit quantization).
Closes #3
Changes
- `train.py` with QLoRA 4-bit quantization
- Hyperparameters matched to the existing `adapter_config.json`

Features
Configuration Match
All hyperparameters match the existing `adapter_config.json`:

- LoRA rank (`r`): 16 ✅
- LoRA alpha: 32 ✅
- Target modules: `["q_proj", "k_proj", "v_proj", "o_proj"]` ✅
- Task type: `CAUSAL_LM` ✅

Usage
Testing
- Output saved to `./checkpoints/`

Training Performance
Implementation Details
- `BitsAndBytesConfig` for NF4 quantization

This makes the project fully reproducible as requested in the issue. Users can now train their own models and experiment with different configurations.
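One way to sanity-check reproducibility is to compare a saved adapter's config against the expected hyperparameters. The helper below is a hedged sketch, not code from this PR; the path argument and the set of keys checked are assumptions based on the values listed above.

```python
# Compare a saved adapter_config.json against the expected QLoRA
# hyperparameters; returns {key: (found, expected)} for any mismatches.
import json

EXPECTED = {"r": 16, "lora_alpha": 32, "task_type": "CAUSAL_LM"}

def check_adapter_config(path):
    """Return a dict of mismatched keys; an empty dict means a match."""
    with open(path) as f:
        cfg = json.load(f)
    return {k: (cfg.get(k), v) for k, v in EXPECTED.items() if cfg.get(k) != v}
```

`target_modules` could be checked the same way by adding it to `EXPECTED`.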
Contributing to OScG'26
This contribution is part of OScG'26