This repository contains the lab exercises for the Information Retrieval lecture at the Dresden University of Technology (TUD) for the Winter term 25/26.
The labs are designed to gradually introduce practical tools and concepts commonly used in Information Retrieval and Natural Language Processing:
- Lab 01 – Intro to pandas: A hands-on introduction to data handling and analysis with pandas. In this lab, we explore the basics of loading, cleaning, and manipulating structured data. Participants learn the essentials of working with text collections and how to approach their analysis.
- Lab 02 – spaCy: Building on the pandas basics, this lab introduces spaCy for natural language processing. Participants apply their data analysis knowledge to real text data, covering tasks such as tokenization, linguistic annotation, and basic text analysis.
- Lab 03 – Intro to PyTorch: An introduction to PyTorch that focuses on the fundamentals of machine learning. This lab provides the groundwork for understanding and implementing simple neural models.
We recommend using uv for this project; nonetheless, we also provide a requirements.txt file.
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Update uv
uv self update
uv can also be installed via pip. We do not provide further setup details for this route, but if you installed uv this way, running e.g. python3 -m uv sync achieves similar results:
pip install uv
- First, clone the repository
git clone git@github.com:jembie/ir-exercises.git
cd ir-exercises/
- Then install the required dependencies
# Recommended through uv
uv sync
# Alternatively via pip
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
All relevant Jupyter Notebooks are located in the exercises/ directory. Once the required dependencies are installed, each notebook should be ready to use.
exercises
├── lab01-intro-to-pandas
├── lab02-spaCy
└── lab03-intro-to-pytorch
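To confirm that the installation step worked before opening a notebook, a quick import check can help. The following is a minimal sketch, not part of the repository: the helper name check_packages is hypothetical, and the package names pandas, spacy, and torch are assumed from the lab titles (requirements.txt remains the authoritative list).

```python
from importlib.util import find_spec

# Assumed top-level import names for the three labs (inferred from the lab
# titles); the authoritative dependency list is requirements.txt.
LAB_PACKAGES = ("pandas", "spacy", "torch")


def check_packages(names=LAB_PACKAGES):
    """Map each top-level package name to True if it can be located, else False."""
    # find_spec only locates the package; it does not import it, so the check is fast.
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in check_packages().items():
        print(f"{name}: {'ok' if available else 'MISSING'}")
```

Run it with the interpreter of the environment you installed into (for example, from the activated .venv). Any package reported as MISSING suggests the dependency installation did not complete in that environment.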