
Smol training replication with nano LLM models from scratch #34

@udapy


Our plan of action, based on the accessibility of resources:

Plan A: The "Locally Run" Plan (Best for SFT & Learning)

  • Resource Used: Local MacBook Pro M2 (16GB) + Hugging Face Pro.

  • Goal: Master the Data and Post-Training (SFT) phases without spending cloud credits.

  • Limitation: You cannot use Nanotron (CUDA-only); you will use MLX or TRL instead.

  • Context: The M2 chip is capable of training a 135M model, but it lacks the throughput for pre-training from scratch. However, it is excellent for the Supervised Fine-Tuning (SFT) chapter of the playbook.

  • Hardware Setup:

    • RAM: 16GB of unified memory is sufficient for a 135M-parameter model (the weights need well under 1GB).
    • Framework: MLX (Apple's array framework) or transformers with the mps (Metal Performance Shaders) backend.
  • Step-by-Step Execution:

    • Base Model: Download the pre-trained HuggingFaceTB/SmolLM2-135M from the Hugging Face Hub.
    • Dataset: Download HuggingFaceTB/smol-smoltalk.
    • Training (SFT): Use MLX-LM (see the data-preparation sketch after this list):
    • pip install mlx-lm
    • mlx_lm.lora --model HuggingFaceTB/SmolLM2-135M --train --data data/ --batch-size 4 --iters 1000
  • Note: While the Playbook uses alignment-handbook (PyTorch), MLX is 10x more efficient on your Mac. The curriculum logic remains the same.

  • Evaluation: Run inference locally to compare "General Knowledge" vs. "Instruction Following" behavior (see the evaluation sketch below).

  • Hosting: Use your Hugging Face Pro account to host the resulting model on a ZeroGPU Space for a free, shareable demo (a minimal Space app sketch follows below).
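
Below is a minimal data-preparation sketch for the Training (SFT) step. It assumes the smol-smoltalk rows expose a messages column of chat turns and that mlx_lm.lora accepts the {"messages": [...]} chat format in train.jsonl / valid.jsonl files inside the --data directory; the subsample sizes are arbitrary and only meant to keep a run on a 16GB M2 short.

```python
# prepare_sft_data.py - write smol-smoltalk into the data/ layout used by mlx_lm.lora
# Assumptions (verify against your installed mlx-lm version and the dataset card):
#   * each row has a "messages" list of {"role", "content"} turns
#   * mlx_lm.lora reads train.jsonl / valid.jsonl from the --data directory
import json
from pathlib import Path

from datasets import load_dataset

out_dir = Path("data")
out_dir.mkdir(exist_ok=True)

ds = load_dataset("HuggingFaceTB/smol-smoltalk", split="train").shuffle(seed=0)

# Small subsample so an SFT pass finishes quickly on a 16GB M2.
n_train, n_valid = 10_000, 500
splits = {
    "train": ds.select(range(n_train)),
    "valid": ds.select(range(n_train, n_train + n_valid)),
}

for name, subset in splits.items():
    with open(out_dir / f"{name}.jsonl", "w") as f:
        for row in subset:
            f.write(json.dumps({"messages": row["messages"]}) + "\n")

print({name: len(subset) for name, subset in splits.items()})
```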
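
For the evaluation step, here is a rough comparison script. It assumes the mlx_lm Python API (load / generate), that mlx_lm.lora wrote its LoRA weights to ./adapters (its default output path), and that the tokenizer wrapper exposes the underlying chat_template / apply_chat_template; treat it as a qualitative sketch, not a benchmark.

```python
# compare_models.py - qualitative check: base SmolLM2-135M vs. the SFT adapters
# Assumptions: adapters live in ./adapters (mlx_lm.lora default) and mlx_lm
# exposes load()/generate() with the keyword arguments used below.
from mlx_lm import load, generate

PROMPTS = [
    "What is the capital of France?",                       # general knowledge
    "Rewrite this sentence politely: give me the report.",  # instruction following
]

def answer(model, tokenizer, prompt):
    # Apply the chat template when the tokenizer defines one; the raw base
    # model may not have a template, in which case the plain prompt is used.
    if getattr(tokenizer, "chat_template", None):
        prompt = tokenizer.apply_chat_template(
            [{"role": "user", "content": prompt}],
            tokenize=False,
            add_generation_prompt=True,
        )
    return generate(model, tokenizer, prompt=prompt, max_tokens=128)

base_model, base_tok = load("HuggingFaceTB/SmolLM2-135M")
sft_model, sft_tok = load("HuggingFaceTB/SmolLM2-135M", adapter_path="adapters")

for p in PROMPTS:
    print(f"\n=== {p}")
    print("base:", answer(base_model, base_tok, p))
    print("sft: ", answer(sft_model, sft_tok, p))
```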
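
For hosting, a minimal app.py sketch for a Gradio Space. It assumes the fine-tuned weights have been exported to a transformers-compatible checkpoint and pushed to the Hub (the repo id below is a placeholder), and a recent gradio/transformers where ChatInterface(type="messages") and chat-style pipeline inputs are available. A 135M model also runs comfortably on the free CPU tier, so the ZeroGPU-specific decorator is omitted here.

```python
# app.py - minimal Gradio demo for a Hugging Face Space
# Assumption: the SFT'd model was exported to a transformers-compatible
# checkpoint and pushed to the Hub; the repo id below is a placeholder.
import gradio as gr
from transformers import pipeline

MODEL_ID = "your-username/SmolLM2-135M-smoltalk-sft"  # placeholder repo id

generator = pipeline("text-generation", model=MODEL_ID)

def chat(message, history):
    # Single-turn for simplicity: ignore history and answer the latest message.
    messages = [{"role": "user", "content": message}]
    out = generator(messages, max_new_tokens=256, do_sample=True, temperature=0.7)
    # The pipeline returns the full conversation; the last message is the reply.
    return out[0]["generated_text"][-1]["content"]

gr.ChatInterface(chat, type="messages", title="SmolLM2-135M SFT demo").launch()
```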
