Parameter-efficient enhancement for Mistral-3-14B-Reasoning
Adds 6 domain experts (~12M parameters) with dynamic per-token routing to improve:
- Advanced Math
- Formal Logic
- Algorithm Design
- Scientific Reasoning
- Multi-step Planning
- Abstract/Symbolic Reasoning
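Each expert is a small delta over the base model's hidden states, and a learned gate picks `top_k` of them per token. A minimal sketch of that idea (the layer sizes, names, and low-rank residual-delta formulation below are assumptions; the actual implementation lives in srde.py):

```python
# Minimal sketch: top-k routing over low-rank domain-expert deltas.
# All dimensions and names here are illustrative, not the real srde.py API.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DomainExpertLayer(nn.Module):
    def __init__(self, hidden_size=5120, num_experts=6, top_k=2, rank=16):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)
        # Each expert is a low-rank (down, up) pair, keeping 6 experts within a ~12M-param budget.
        self.down = nn.Parameter(torch.randn(num_experts, hidden_size, rank) * 0.02)
        self.up = nn.Parameter(torch.zeros(num_experts, rank, hidden_size))

    def forward(self, x):                                    # x: (batch, seq, hidden)
        weights, idx = self.gate(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)                 # (batch, seq, top_k)
        delta = torch.zeros_like(x)
        for k in range(self.top_k):
            down = self.down[idx[..., k]]                    # (batch, seq, hidden, rank)
            up = self.up[idx[..., k]]                        # (batch, seq, rank, hidden)
            h = torch.einsum("bsh,bshr->bsr", x, down)
            delta = delta + weights[..., k:k + 1] * torch.einsum("bsr,bsrh->bsh", h, up)
        return x + delta                                     # base hidden state plus routed expert delta
```

In this sketch, initializing `up` to zero keeps every delta at zero at the start of training, so the base Mistral model's behavior is unchanged until the experts learn something useful.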
```bash
export GITHUB_REPO="https://github.com/YOUR_USERNAME/srde-mistral"
export HF_TOKEN="hf_your_token"
export EXAMPLES_PER_DOMAIN=50000  # 300K total
export MAX_STEPS=10000
```

Use vastai_startup.sh as your startup script. It will:
- Clone this repo
- Download the datasets (GSM8K, MATH, CodeContests, etc.)
- Pre-tokenize them for Mistral (see the sketch below)
- Train with Flash Attention and the Muon optimizer
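The pre-tokenization step might look roughly like this (the base-model Hub ID, example fields, and output layout are assumptions, with GSM8K as the sample source):

```python
# Sketch of pre-tokenizing one source dataset for Mistral.
# The base-model Hub ID and the ./data layout are assumptions, not the repo's actual values.
from datasets import load_dataset
from transformers import AutoTokenizer

BASE_MODEL = "mistralai/Mistral-3-14B-Reasoning"   # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

def tokenize(example):
    # GSM8K examples carry "question" and "answer" fields.
    text = example["question"] + "\n" + example["answer"]
    return tokenizer(text, truncation=True, max_length=4096)

gsm8k = load_dataset("gsm8k", "main", split="train")
gsm8k = gsm8k.map(tokenize, remove_columns=gsm8k.column_names)
gsm8k.save_to_disk("./data/math")                   # later consumed via --pretokenized_dir ./data
```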
| Metric | Value |
|---|---|
| Data | 300K examples |
| Time | ~12 hours |
| Cost | ~£28 (1× H200) |
```bash
# Install
pip install -r requirements.txt

# Build dataset
python build_and_upload_dataset.py \
    --repo_name YOUR_USER/srde-dataset \
    --examples_per_domain 50000

# Train
python train.py \
    --pretokenized_dir ./data \
    --flash_attention \
    --use_muon \
    --compile
```

| Parameter | Default | Description |
|---|---|---|
| `num_experts` | 6 | Domain expert count |
| `top_k` | 2 | Experts per token |
| `target_sparsity` | 1% | Final delta sparsity |
| `max_steps` | 10000 | Training steps |
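One plausible way these defaults are expressed in config.py (a sketch with assumed field names, mirroring the table above):

```python
# Sketch of the configuration defaults listed above; field names are assumed.
from dataclasses import dataclass

@dataclass
class SRDEConfig:
    num_experts: int = 6           # domain expert count
    top_k: int = 2                 # experts activated per token
    target_sparsity: float = 0.01  # final delta sparsity (1%)
    max_steps: int = 10_000        # training steps
```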
| File | Purpose |
|---|---|
| `train.py` | Main training script |
| `srde.py` | Core architecture |
| `config.py` | Configuration |
| `build_and_upload_dataset.py` | Dataset pipeline |
| `vastai_startup.sh` | Cloud startup script |
| `muon.py` | Muon optimizer |
| ID | Domain | Datasets |
|---|---|---|
| 0 | Math | GSM8K, MATH, MetaMathQA |
| 1 | Logic | LogiQA, ReClor |
| 2 | Code | CodeContests, APPS |
| 3 | Science | SciQ, ARC |
| 4 | Planning | StrategyQA, HotpotQA |
| 5 | Abstract | BIG-Bench Hard, AQuA-RAT |
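In code, this mapping reduces to a small lookup table like the following (a sketch; the actual dataset loaders live in build_and_upload_dataset.py):

```python
# Expert-ID → source-dataset mapping, as listed in the table above.
DOMAIN_DATASETS = {
    0: ["GSM8K", "MATH", "MetaMathQA"],      # Math
    1: ["LogiQA", "ReClor"],                 # Logic
    2: ["CodeContests", "APPS"],             # Code
    3: ["SciQ", "ARC"],                      # Science
    4: ["StrategyQA", "HotpotQA"],           # Planning
    5: ["BIG-Bench Hard", "AQuA-RAT"],       # Abstract
}
```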
Apache License 2.0