This repository contains a collection of hands-on assignments completed as part of the Machine Learning course at Saint Petersburg State University (SPbU).
The goal of the course was to develop a solid understanding of ML algorithms through implementation from scratch and practical use of real-world tools. Each task involved working with real or synthetic datasets and focused on solving core ML problems such as regression, classification, clustering, dimensionality reduction, and text analysis.
Assignments combine algorithmic work with data preprocessing, visualization, metric analysis, and application of modern Python tools such as scikit-learn, CatBoost, cvxopt, nltk, NumPy, and others.
| # | Title | Summary | Links |
|---|---|---|---|
| 2 | KD-Tree | Manual implementation of KD-Tree for fast k-NN search in multidimensional space. | 📁 homework_2 · PR #2 |
| 3 | Linear Regression | Predict house prices from the Kaggle dataset. Includes pipelines, preprocessing, regularization (L1/L2), feature and target transformations. | 📁 homework_3 |
| 4 | Gradient Descent | Custom implementation of gradient descent for linear regression with different loss functions (MSE, MAE). Convergence analysis. | 📁 homework_4 |
| 5 | Support Vector Machine | Solving SVM using cvxopt: both linear and kernelized versions. Tested on synthetic data, with an analysis of kernel influence. | 📁 homework_5 |
| 6 | Ensembles: Random Forest & CatBoost | Random Forest implementation + CatBoost usage on real VK social network data. Task: predict user gender and age. | 📁 homework_6 |
| 7 | Clustering | Manual implementation of K-Means, DBSCAN, and Agglomerative Clustering. Includes image color quantization with clustering. | 📁 homework_7 |
| 8 | Text Classification | Spam classifier using Bag of Words, TF-IDF, Snowball stemmer, Naive Bayes. End-to-end NLP pipeline. | 📁 homework_8 |
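
To give a flavor of the "from scratch" style used in the assignments, here is a minimal sketch of batch gradient descent for one-dimensional linear regression with either MSE or MAE loss, in the spirit of assignment 4. It is not the code from `homework_4`: the function name `gradient_descent`, its parameters, and the toy data are all hypothetical and chosen only for illustration.

```python
def gradient_descent(xs, ys, loss="mse", lr=0.01, n_iters=5000):
    """Fit y ~ w * x + b by batch gradient descent (illustrative sketch only)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iters):
        grad_w, grad_b = 0.0, 0.0
        for x, y in zip(xs, ys):
            residual = (w * x + b) - y
            if loss == "mse":
                # Derivative of residual**2 with respect to the prediction.
                g = 2.0 * residual
            elif loss == "mae":
                # Subgradient of |residual|: its sign (0 when residual is 0).
                g = (residual > 0) - (residual < 0)
            else:
                raise ValueError(f"unknown loss: {loss}")
            grad_w += g * x
            grad_b += g
        # Average the per-sample gradients and take one step downhill.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Hypothetical toy data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w_mse, b_mse = gradient_descent(xs, ys, loss="mse", lr=0.05)
w_mae, b_mae = gradient_descent(xs, ys, loss="mae", lr=0.01)
```

With a constant step size, the MSE run settles on the exact line, while the MAE run only hovers in a small neighborhood of it, because the subgradient does not shrink near the optimum. This is one of the effects a convergence analysis can make visible.
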
- Clone the repository: `git clone https://github.com/irinszn/SPbU_ML.git`
- Navigate to the homework folder you want to explore: `cd SPbU_ML/src/homeworks/homework_3`
- Launch the Jupyter Notebook interface: `jupyter notebook`
- Open the corresponding `.ipynb` file (e.g., `linreg.ipynb`) in your browser and run the code interactively.