- Session #1: Thursday, October 2, 9:30 AM - 11:00 AM @ Room M1 (CEMFI)
- Session #2: Tuesday, October 27, 3:00 PM - 4:30 PM @ Room M1 (CEMFI)
- Session #3: Thursday, November 19, 9:30 AM - 11:00 AM @ Room M1 (CEMFI)
Please bring your laptop to class with VSCode and uv installed and ready to run Python code.
.
├── data/ # Data files for exercises and examples
│   ├── all_ECB_speeches.csv
│   ├── death-rate-who.csv
│   ├── gdp-per-capita-worldbank.csv
│   └── population-unwpp.csv
├── Session1/ # Session 1 notebooks
│   ├── 1_1.ipynb # Introduction to Python (Part I)
│   └── 1_2.ipynb # Introduction to Python (Part II)
├── Session2/ # Session 2 materials
│   ├── 2_1_Scraping_Intro.ipynb # Web scraping introduction
│   ├── 2_2_BIS_Scraper_I.ipynb # BIS scraper (Part I)
│   ├── 2_3_BIS_Scraper_II.ipynb # BIS scraper (Part II)
│   ├── 2_4_Forecasting_Professions.ipynb # Forecasting professions exercise
│   └── Extra_BCU_Scraper.ipynb # BCU scraper (extra)
├── Session3/ # Session 3 materials
│   ├── 3_1_LLM_Intro.ipynb # Introduction to Large Language Models
│   ├── 3_2_LLM_Fine_Tuning.ipynb # Fine-tuning LLMs
│   └── 3_3_LLM_Function_Calling.ipynb # LLM function calling
├── Installation_Guide_VSCode_uv.pdf # Installation guide
├── main.py # Main Python script
├── pyproject.toml # Project dependencies (uv)
├── uv.lock # Locked dependencies for reproducibility
└── README.md # This file

- Overview of Python and setup with UV package manager
- Jupyter Notebooks fundamentals
- Python syntax basics:
  - Variables and naming conventions
  - Primitive data types (int, float, string, bool, None)
  - Container data types (tuples, lists, dictionaries)
  - Functions and control flow (if/elif/else, for/while loops)
- Introduction to NumPy:
  - Creating and managing arrays
  - Array operations (element-wise and matrix multiplication)
  - Indexing and slicing
- Introduction to Pandas:
  - Series: creation, operations, and slicing
  - DataFrames: creation, indexing, and operations
  - Data manipulation (adding/removing columns, sorting, merging)
  - Applying functions
- Data visualization:
  - Matplotlib basics
  - Seaborn for statistical graphics
  - Creating plots (scatter, line, bar, histograms, heatmaps)
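
The NumPy topics above distinguish element-wise from matrix multiplication, which is easy to mix up at first. The following is a minimal sketch of the difference, together with basic indexing and slicing; the arrays `a` and `b` are made-up values for illustration and are not taken from the course notebooks.

```python
import numpy as np

# Two small 2x2 arrays (illustrative values only)
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Element-wise product: multiplies entries in matching positions
elementwise = a * b

# Matrix product: rows of `a` combined with columns of `b`
matrix_product = a @ b

# Indexing and slicing: first row and second column of `a`
first_row = a[0, :]
second_col = a[:, 1]

print(elementwise)
print(matrix_product)
print(first_row, second_col)
```

Here `a * b` gives `[[5, 12], [21, 32]]`, while `a @ b` gives `[[19, 22], [43, 50]]`.
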
- Introduction to web scraping:
  - Definition and use cases
  - Ethical considerations and legal aspects
  - HTTP status codes and requests
- Tools and libraries:
  - Using `requests` for HTTP requests
  - Parsing HTML with `BeautifulSoup`
  - Basic scraping patterns
- Understanding website structure and dynamic content
- Handling pagination in web scraping
- Building a scraper for BIS central bank speeches:
  - Extracting speech metadata
  - Downloading PDF documents
  - Organizing scraped data
- Advanced scraping techniques:
  - Using Selenium for dynamic web pages
  - PDF text extraction with PyPDF2
  - Handling file downloads and organization
- Setting up project structure:
  - Directory management
  - Data organization and storage
- Combining web scraping with machine learning:
  - Image retrieval and processing
  - Profession classification using ML models
  - Forecasting applications
- Additional practice with web scraping
- Scraping from Banco Central del Uruguay (BCU) website
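
Pagination handling, listed among the Session 2 topics, often reduces to the same loop: request page 1, 2, 3, ... and stop when a page returns nothing. The sketch below shows that pattern under stated assumptions. `iter_pages`, `fake_fetch`, and `FAKE_SITE` are hypothetical names, and the fetch function is an offline stand-in so the example runs without network access; in a real scraper it would typically wrap an HTTP request and HTML parsing step.

```python
from typing import Callable, Iterator


def iter_pages(
    fetch_page: Callable[[int], list[str]], max_pages: int = 100
) -> Iterator[str]:
    """Yield items page by page until a page comes back empty.

    `max_pages` is a safety cap so a misbehaving site cannot cause
    an endless loop.
    """
    for page in range(1, max_pages + 1):
        items = fetch_page(page)
        if not items:
            break  # an empty page is treated as the end of the listing
        yield from items


# Offline stand-in for a paginated site (hypothetical data)
FAKE_SITE = {1: ["speech-a", "speech-b"], 2: ["speech-c"]}


def fake_fetch(page: int) -> list[str]:
    """Return the items on `page`, or an empty list past the last page."""
    return FAKE_SITE.get(page, [])


all_items = list(iter_pages(fake_fetch))
print(all_items)
```

Passing the fetch step in as a function keeps the loop independent of any particular website, so the same generator can be reused once the stand-in is replaced.
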
- Fundamentals of LLMs:
  - What are Large Language Models?
  - Transformer architecture overview
  - Pre-trained models and their applications
- Working with LLMs:
  - Using the `transformers` library
  - Text generation and completion
  - Model visualization with `bertviz`
- Fine-tuning techniques:
  - Parameter-Efficient Fine-Tuning (PEFT)
  - LoRA (Low-Rank Adaptation) method
  - Training custom models on domain-specific data
- Practical implementation:
  - Preparing datasets for fine-tuning
  - Training configuration and hyperparameters
  - Model evaluation and checkpointing
- Applications:
  - Text classification tasks
  - Domain adaptation for economic/financial texts
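
To give a sense of why LoRA counts as parameter-efficient, the sketch below compares trainable parameter counts for a single weight matrix. LoRA keeps the original `d_out x d_in` matrix frozen and trains two low-rank factors of shapes `d_out x r` and `r x d_in` instead. The dimensions and rank used here are illustrative assumptions, not values from the course notebook, and `lora_trainable_params` is a hypothetical helper.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters of a rank-`rank` LoRA update for one matrix.

    The update is the product of a (d_out x rank) and a (rank x d_in)
    factor, so it has rank * (d_in + d_out) trainable entries.
    """
    return rank * (d_in + d_out)


# Illustrative sizes (assumed for this example)
d_in = d_out = 768
rank = 8

full = d_in * d_out                              # fine-tuning the whole matrix
lora = lora_trainable_params(d_in, d_out, rank)  # training only the factors
ratio = lora / full

print(f"full: {full}, LoRA: {lora}, share trained: {ratio:.2%}")
```

With these numbers the full matrix has 589,824 entries, while the LoRA factors have 12,288, roughly 2% of the total.
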
- Function calling and structured output:
  - Defining tool schemas for LLM function calls
  - Implementing function calling patterns
  - Validating and executing LLM-requested functions
- Practical applications:
  - News → firm-level shocks: extracting affected firms and classifying shock types
  - Central bank speeches → policy stance classification and economic indicator extraction
  - Structured data extraction from unstructured text
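
The validate-then-execute step in function calling can be sketched without any LLM library. In the example below, the model's reply is assumed to be a JSON string naming a tool and its arguments; the tool `classify_shock`, the `TOOLS` registry layout, and the sample output are all hypothetical and are not the schema format of any particular provider.

```python
import json


def classify_shock(firm: str, shock_type: str) -> dict:
    """Hypothetical tool: record a firm-level shock as a structured row."""
    return {"firm": firm, "shock_type": shock_type}


# Registry mapping tool names to the callable and its required arguments
TOOLS = {
    "classify_shock": {
        "function": classify_shock,
        "required": {"firm": str, "shock_type": str},
    }
}


def execute_tool_call(raw_call: str) -> dict:
    """Parse, validate, and run a tool call emitted by a model."""
    call = json.loads(raw_call)
    name = call.get("name")
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name!r}")

    spec = TOOLS[name]
    args = call.get("arguments", {})

    # Reject missing or wrongly typed arguments before executing anything
    for key, expected_type in spec["required"].items():
        if not isinstance(args.get(key), expected_type):
            raise ValueError(f"Argument {key!r} must be {expected_type.__name__}")

    # Reject arguments the tool does not declare
    unexpected = set(args) - set(spec["required"])
    if unexpected:
        raise ValueError(f"Unexpected arguments: {sorted(unexpected)}")

    return spec["function"](**args)


# Stand-in for a model reply (made-up content)
model_output = (
    '{"name": "classify_shock", '
    '"arguments": {"firm": "Acme Corp", "shock_type": "supply"}}'
)
result = execute_tool_call(model_output)
print(result)
```

Validating before executing matters because model output is untrusted text: a call to an unknown tool or with malformed arguments raises an error instead of reaching the function.
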
- Python 3.11+
- VSCode
- UV package manager installed with Python
- Clone this repository
- Navigate to the project directory
- Install dependencies: `uv sync`
- Open notebooks in VSCode and select the UV environment