📦 Module 5: Assembling & Pretraining Our GPT
This module combines all components into a complete GPT model and implements the pretraining process.
Resources:
- GPT-1, GPT-2, GPT-3 papers
- AdamW optimizer paper
- Learning rate scheduling strategies
- WikiText dataset for pretraining
Tasks to Complete:
Lesson 5.1 – Stacking Decoder Blocks & Output Head
Lesson 5.2 – Objective: Next Token Prediction & Loss Function
Lesson 5.3 – Optimizer Setup & Learning Rate Scheduler
Lesson 5.4 – Pretraining Loop Pt. 1: Forward + Backward
Lesson 5.5 – Pretraining Loop Pt. 2: Gradient Clipping & Logging
Lesson 5.6 – Running Pretraining & Monitoring Loss Curves
Lesson 5.7 – Inference with Your Trained Model
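Lesson 5.1's assembly step can be sketched without a framework: a stack of residual blocks followed by a final normalization and a projection onto the vocabulary. Below is a minimal NumPy sketch; the toy `Block` stands in for a full attention-plus-MLP decoder block, and all sizes (`d_model`, `vocab_size`, `n_blocks`) are made-up illustration values, not the course's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab_size, n_blocks, seq_len = 16, 50, 4, 8  # toy sizes

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

class Block:
    """Stand-in for one decoder block: pre-norm + residual around a toy mixing layer."""
    def __init__(self):
        self.w = rng.normal(0, 0.02, (d_model, d_model))
    def __call__(self, x):
        return x + layer_norm(x) @ self.w  # residual connection

blocks = [Block() for _ in range(n_blocks)]          # the stacked decoder blocks
w_out = rng.normal(0, 0.02, (d_model, vocab_size))   # output head (unembedding)

x = rng.normal(size=(seq_len, d_model))  # embeddings for one sequence
for block in blocks:
    x = block(x)
logits = layer_norm(x) @ w_out  # final norm, then project to vocabulary logits
```

The point is the shape of the composition: every block maps `(seq_len, d_model)` to `(seq_len, d_model)`, so blocks can be stacked freely, and only the head changes the last dimension to `vocab_size`.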
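Lesson 5.2's objective, next-token prediction, is cross-entropy in which the logits at position t are scored against the token at position t+1. A minimal NumPy sketch (shapes and variable names here are illustrative assumptions):

```python
import numpy as np

def next_token_loss(logits, tokens):
    """Mean cross-entropy where position t predicts token t+1."""
    # Shift: logits for positions 0..T-2 score targets 1..T-1.
    logits, targets = logits[:-1], tokens[1:]
    logits = logits - logits.max(-1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 50))        # (seq_len, vocab_size)
tokens = rng.integers(0, 50, size=8)     # token ids for the same sequence
loss = next_token_loss(logits, tokens)
```

A useful sanity check: with all-zero (uniform) logits the loss equals ln(vocab_size), so an untrained model over a 50-token vocabulary should start near ln 50 ≈ 3.9.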
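Lesson 5.3 pairs the optimizer with a learning-rate schedule. A common choice for GPT-style pretraining is linear warmup followed by cosine decay to a floor; the specific shape and all hyperparameter values below are assumptions for illustration, not taken from the course materials.

```python
import math

def lr_at(step, max_lr=3e-4, min_lr=3e-5, warmup=100, total=1000):
    """Learning rate at a given step: linear warmup, then cosine decay to min_lr."""
    if step < warmup:                      # linear warmup from ~0 to max_lr
        return max_lr * (step + 1) / warmup
    if step >= total:                      # after decay, hold at the floor
        return min_lr
    progress = (step - warmup) / (total - warmup)
    coeff = 0.5 * (1 + math.cos(math.pi * progress))  # cosine decay, 1 -> 0
    return min_lr + coeff * (max_lr - min_lr)
```

In a real loop this function would set the optimizer's learning rate once per step (e.g. by assigning it into each parameter group before the update).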
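Lessons 5.4 and 5.5 build the loop itself: forward pass, loss, backward pass, gradient clipping, parameter update, and periodic logging. A full GPT is too large to inline here, so this sketch runs the same loop shape on a tiny linear model with a hand-derived gradient; the clipping threshold, learning rate, and step counts are made-up illustration values.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
y = X @ np.array([2.0, -1.0, 0.5])  # synthetic targets from known true weights
w = np.zeros(3)                     # parameters to learn

for step in range(200):
    pred = X @ w                       # forward pass
    err = pred - y
    loss = (err ** 2).mean()           # training loss
    grad = 2 * X.T @ err / len(y)      # backward pass (analytic gradient)
    norm = np.linalg.norm(grad)
    if norm > 1.0:                     # clip the global gradient norm to 1.0
        grad = grad / norm
    w -= 0.1 * grad                    # SGD update
    if step % 50 == 0:                 # periodic logging
        print(f"step {step:3d}  loss {loss:.4f}  grad_norm {norm:.2f}")
```

The GPT version has the same skeleton; only the forward/backward computation (done by the framework) and the optimizer step differ, and clipping is applied to the concatenated gradients of all parameters rather than one vector.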
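Lesson 5.7's generation step repeatedly samples the next token from the model's output logits; temperature scaling and top-k filtering are standard sampling knobs. A NumPy sketch (the function name and parameter values are assumptions, not the course's actual API):

```python
import numpy as np

def sample_next(logits, temperature=1.0, top_k=10, rng=None):
    """Sample one token id from a logit vector with temperature and top-k filtering."""
    if rng is None:
        rng = np.random.default_rng()
    logits = logits / temperature            # <1 sharpens, >1 flattens
    if top_k is not None:
        cutoff = np.sort(logits)[-top_k]     # k-th largest logit
        logits = np.where(logits < cutoff, -np.inf, logits)  # drop the rest
    probs = np.exp(logits - logits.max())    # softmax over surviving logits
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

rng = np.random.default_rng(0)
token = sample_next(rng.normal(size=50), temperature=0.8, top_k=10, rng=rng)
```

Generation then appends the sampled token to the context and feeds it back through the model; with `top_k=1` this degenerates to greedy decoding.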
Deliverables:
Key Implementation Files:
- gpt_model.py – Complete GPT architecture
- training_loop.py – Pretraining implementation
- optimizer_config.py – Optimizer and scheduler setup
- inference.py – Text generation and sampling
- monitoring.py – Training metrics and logging

Training Components: