Repository for the course with all material.
The slides contain additional background and theoretical information.
If possible, work with uv. Clone the repository and run uv sync.
However, there are some challenges:
- Most of the notebooks work with pyproject.toml.
- Currently, the latest transformers is not compatible with both vllm and unsloth. I recommend using a different kernel for that.
- Some notebooks are specifically suited for MacOS using the mlx-lm package. It is only useful to install that on a Mac.
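To see which of these environments a given kernel belongs to, it can help to list what is installed in it. The following is a minimal sketch using only the standard library; the helper name installed_versions is hypothetical and not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    # Packages named in the notes above; run this inside the kernel you want to inspect.
    for name, ver in installed_versions(["transformers", "vllm", "unsloth", "mlx-lm"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running it in a notebook cell shows quickly whether the active kernel is the general one, the unsloth/vllm one, or the MacOS one.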
Create a venv or conda environment and install the following packages for the normal notebooks:
- accelerate
- datasets
- flash-attn
- ipython
- jupyter
- kernels
- liger-kernel
- peft
- transformers
- triton
- trl
flash-attn should be installed with the option --no-build-isolation.
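The installation order can be sketched as two pip invocations: the regular packages first, then flash-attn on its own with --no-build-isolation. This is only a sketch that builds and prints the commands; the function name pip_commands is hypothetical, and installing flash-attn last is an assumption (without build isolation, its build is expected to need packages such as torch already present in the environment).

```python
import shlex
import sys

# Package list for the normal notebooks, taken from the list above (flash-attn handled separately).
NORMAL_PACKAGES = [
    "accelerate", "datasets", "ipython", "jupyter", "kernels",
    "liger-kernel", "peft", "transformers", "triton", "trl",
]


def pip_commands(packages, python=sys.executable):
    """Return two pip commands: regular packages first, then flash-attn without build isolation."""
    base = [python, "-m", "pip", "install"]
    return [
        base + list(packages),
        base + ["flash-attn", "--no-build-isolation"],
    ]


if __name__ == "__main__":
    # Only prints the commands; copy them into a shell (or pass them to subprocess.run) to install.
    for cmd in pip_commands(NORMAL_PACKAGES):
        print(shlex.join(cmd))
```

Printing instead of executing keeps the sketch side-effect free; run the printed commands inside the activated venv or conda environment.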
For unsloth and vllm, you can use:
- datasets
- ipykernel
- jupyter
- transformers
- trl
- unsloth
- vllm
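Since this second environment is meant to be used as a separate kernel, it has to be registered with Jupyter once. A minimal sketch, assuming ipykernel is installed in that environment (it is in the list above); the kernel name unsloth-vllm and the display name are hypothetical choices, not names used by the notebooks.

```python
import shlex
import sys


def kernel_install_command(name, display_name, python=sys.executable):
    """Build the ipykernel command that registers the given interpreter as a Jupyter kernel."""
    return [
        python, "-m", "ipykernel", "install",
        "--user",
        "--name", name,
        "--display-name", display_name,
    ]


if __name__ == "__main__":
    # Run the printed command with the interpreter of the unsloth/vllm environment.
    print(shlex.join(kernel_install_command("unsloth-vllm", "Python (unsloth + vllm)")))
```

After registration, the kernel can be selected from the kernel menu in Jupyter for the vLLM and unsloth notebooks.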
For MacOS notebooks, the following packages are recommended:
- jupyter
- mlx-lm
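Before setting up this environment, a quick platform check avoids installing mlx-lm where it is of no use. The sketch below only tests for a Mac, as stated above; the function name is_mac is hypothetical.

```python
import platform


def is_mac(system=None):
    """Return True when running on MacOS (platform.system() reports 'Darwin' there)."""
    if system is None:
        system = platform.system()
    return system == "Darwin"


if __name__ == "__main__":
    if is_mac():
        print("MacOS detected: the mlx-lm notebooks are an option.")
    else:
        print("Not a Mac: skip the mlx-lm notebooks.")
```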
I have not provided a requirements.txt, as dependencies tend to get outdated faster than I can update them.
You can also use runpod. uv is already preinstalled there. Of course, the MacOS notebooks won't work there.
You can either run the notebooks directly, or follow how I run them and use them as documentation (or run them later).
- 11-deepseek-distill-qwen3-8.ipynb: Use a distilled DeepSeek model
- 12-qwen3-8.ipynb: Use a hybrid Qwen3 model with 8B parameters
- 13-nanbeige-3.ipynb: Use a new, very small and powerful model
- 14-nanbeige-3-tool.ipynb: Same as before, but with reasoning (e.g. agentic usage)
- 15-mimo-7.ipynb: Use MiMo model from Xiaomi
- 16-gpt-oss-20.ipynb: Use GPT-OSS from OpenAI with efficient MXFP4 datatype
- 17-qwen3-0.6.ipynb: Use a small Qwen3 model
- 21-qwen3-8-awq-vllm.ipynb: Use a quantized model with vLLM for optimized generation
- 31-mlx-deepseek-qwen3-8b.ipynb: Run the Qwen3 8B model on MacOS
- 32-mlx-qwen-30-3.ipynb: Run a larger Qwen3 30B model with 3B active parameters on MacOS
- 33-openrouter.ipynb: Use OpenRouter to run other reasoning models
- 41-finetune-numinamath-grpo-trl-qwen.ipynb: Use Hugging Face trl to train a model
- 41-finetune-numinamath-grpo-trl-qwen-complete.ipynb: Same as before, but with output
- 42-unsloth-qwen3-4-base.ipynb: Use a combination of unsloth and vllm to train a model
- 42-unsloth-qwen3-4-base-complete.ipynb: Same as before, but with output