Repository for the course with all material.
The slides contain additional background and theoretical information.
If possible, work with uv: clone the repository and run `uv sync`.
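A minimal sketch of the uv-based setup (the repository URL is a placeholder, not the real one):

```shell
# Clone the repository (replace the placeholder with the actual URL)
git clone <repository-url>
cd <repository-directory>

# Install all dependencies into a project-local environment
uv sync
```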
Alternatively, create a venv or conda environment and install the following packages:
- ipykernel
- ipython
- ipywidgets
- jupyter
- tqdm
- transformers
- sentence-transformers
- bitsandbytes
- datasets
- flash-attn
- liger-kernel
- peft
- trl
- unsloth
flash-attn should be installed with the option `--no-build-isolation`.
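Put together, the manual installation could look like this (versions are left unpinned; adjust to your CUDA/PyTorch setup):

```shell
# Install the notebook and training dependencies
pip install ipykernel ipython ipywidgets jupyter tqdm \
    transformers sentence-transformers bitsandbytes datasets \
    liger-kernel peft trl unsloth

# flash-attn builds against the torch already present in the environment,
# so it must be installed without build isolation
pip install flash-attn --no-build-isolation
```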
Of course, you can also use the supplied requirements.txt, but some dependencies may be outdated.
You can also use runpod. uv is already preinstalled there.
You can either run the notebooks directly, or follow along as I run them and treat them as documentation (running them yourself later).
- 10-prepare-dataset-finetune.ipynb: Prepares the dataset for finetuning the classification model (uses data from Amazon reviews)
- 11-bert-finetune-classification.ipynb: Finetune a BERT-like model for classification
- 12-alternative-zeroshot.ipynb: Alternative approach using a zero-shot (NLI) classification model
- 21-sbert-finetune.ipynb: Finetune a sentence BERT (similarity) model
- 22-create-sbert-data-qwen-reranker.ipynb: Optimize the dataset used for finetuning by using a reranker
- 23-sbert-finetune-qwen-reranker.ipynb: Finetune the similarity model again using the optimized dataset
- 31a-qwen3-07-full-finetune.ipynb: Training notebook
- 31b-qwen3-07.ipynb: Companion notebook for evaluation
- 32a-llama32-1-huggingface.ipynb: Training notebook
- 32b-llama32-1.ipynb: Companion notebook for evaluation