Our plan of action, based on the accessibility of resources.
Plan A: The "Locally Run" Plan (Best for SFT & Learning)
- Resource Used: Local MacBook Pro M2 (16GB) + Hugging Face Pro
- Goal: Master the Data and Post-Training (SFT) phases without spending cloud credits.
- Limitation: You cannot use Nanotron (CUDA-only). You will use MLX or TRL instead.
- Context: The M2 chip is capable of training a 135M model, but it lacks the throughput for pre-training from scratch. However, it is excellent for the Supervised Fine-Tuning (SFT) chapter of the playbook.
- Hardware Setup:
  - RAM: 16GB is sufficient for a 135M model (the weights need well under 1GB of memory).
  - Framework: MLX (Apple's array framework) or `transformers` with the `mps` (Metal Performance Shaders) backend, as sketched below.
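If you take the `transformers` route, a quick sanity check along these lines confirms the model loads and generates on the Metal backend. This is a sketch, not part of the playbook; it assumes `torch` was built with MPS support, and the prompt is arbitrary:

```python
# Minimal sanity check: load SmolLM2-135M on the Apple GPU via the "mps" backend.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "mps" if torch.backends.mps.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M")
model = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceTB/SmolLM2-135M",
    torch_dtype=torch.float16,  # half precision keeps the weights well under 1GB
).to(device)

inputs = tokenizer("Gravity is", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```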
- Step-by-Step Execution:
  - Base Model: Download the pre-trained `HuggingFaceTB/SmolLM2-135M` from Hugging Face (Pro gives faster downloads).
  - Dataset: Download `HuggingFaceTB/smol-smoltalk` (see the data-prep sketch after this list).
  - Training (SFT): Use MLX-LM. Install it with `pip install mlx-lm`, then run:
    `mlx_lm.lora --model HuggingFaceTB/SmolLM2-135M --train --data data/ --batch-size 4 --iters 1000`
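The `--data data/` flag points at JSONL files on disk. Below is a minimal prep sketch, assuming mlx-lm's `{train,valid}.jsonl` layout and its `messages` chat format (which matches smol-smoltalk's schema); double-check both against the mlx-lm version you install:

```python
# Sketch: write data/train.jsonl and data/valid.jsonl in the "messages" chat
# format that recent mlx-lm versions accept for `mlx_lm.lora --data data/`.
import json
from pathlib import Path

from datasets import load_dataset

out = Path("data")
out.mkdir(exist_ok=True)

# smol-smoltalk rows carry a "messages" list of {"role", "content"} turns.
dataset = load_dataset("HuggingFaceTB/smol-smoltalk", split="train")
splits = dataset.train_test_split(test_size=0.01, seed=0)  # small held-out validation set

for split_name, filename in [("train", "train.jsonl"), ("test", "valid.jsonl")]:
    with open(out / filename, "w") as f:
        for row in splits[split_name]:
            f.write(json.dumps({"messages": row["messages"]}) + "\n")
```

After this runs, `data/` contains the two files that `mlx_lm.lora` reads.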
- Note: While the Playbook uses `alignment-handbook` (PyTorch), MLX is roughly 10x more efficient on your Mac. The curriculum logic remains the same.
- Evaluation: Run inference locally to compare the base model's "general knowledge" behavior against the fine-tuned model's instruction following (see the sketch below).
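One rough way to do this comparison, sketched with mlx-lm's Python API (`load`/`generate`) and assuming `mlx_lm.lora` wrote its LoRA weights to `./adapters` (its usual default; argument names can shift between releases):

```python
# Compare the raw base model against the SFT-adapted model on the same prompt.
from mlx_lm import load, generate

prompt = "Explain in one sentence why the sky is blue."

# Base model: plain next-token continuation, no instruction tuning.
base_model, base_tokenizer = load("HuggingFaceTB/SmolLM2-135M")
print("base:", generate(base_model, base_tokenizer, prompt=prompt, max_tokens=64))

# SFT model: same weights plus the LoRA adapters trained above.
sft_model, sft_tokenizer = load("HuggingFaceTB/SmolLM2-135M", adapter_path="adapters")

# If the tokenizer ships a chat template, format the prompt the way it was
# formatted during SFT; otherwise reuse whatever prompt format you trained with.
chat_prompt = sft_tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    tokenize=False,
)
print("sft:", generate(sft_model, sft_tokenizer, prompt=chat_prompt, max_tokens=64))
```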
- Hosting: Use your Hugging Face Pro account to host the resulting model on a ZeroGPU Space for a free, shareable demo (an `app.py` sketch follows).
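A ZeroGPU Space boils down to a small Gradio `app.py`. The sketch below assumes you have fused the adapters and pushed the result to a Hub repo in a format `transformers` can load; `your-username/SmolLM2-135M-sft` is a placeholder, and `@spaces.GPU` is the decorator ZeroGPU uses to attach a GPU per request:

```python
# app.py -- minimal ZeroGPU Space serving the fine-tuned model with Gradio.
import spaces  # imported early so ZeroGPU can manage CUDA initialization

import gradio as gr
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-username/SmolLM2-135M-sft"  # placeholder: your pushed SFT model

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float16).to("cuda")

@spaces.GPU  # a GPU is attached only while this function runs
def respond(message: str) -> str:
    prompt = tokenizer.apply_chat_template(
        [{"role": "user", "content": message}],
        add_generation_prompt=True,
        tokenize=False,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=128)
    # Return only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

demo = gr.Interface(fn=respond, inputs="text", outputs="text", title="SmolLM2-135M SFT demo")
demo.launch()
```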