nicoloceneda/machine-learning

Machine Learning Models

Status: Actively developed and maintained

Author: Nicolo Ceneda
Contact: n.ceneda20@imperial.ac.uk
Website: nicoloceneda.github.io
Institution: Imperial College London
Course: PhD in Finance

Description

This repository is a collection of ready-to-run machine learning and deep learning examples. Each script focuses on a specific model or method. Some are implemented from scratch; others use scikit-learn or PyTorch workflows.

Installation

Create and activate a virtual environment:

conda create -n envML python=3.12 -y
conda activate envML

Install dependencies from requirements.txt:

pip install --upgrade pip
pip install -r requirements.txt
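
To confirm that the environment resolved correctly, a quick import check can help. The sketch below is not part of the repository. The package names in `ASSUMED_PACKAGES` are assumptions inferred from the libraries named in this README; requirements.txt remains the authoritative list.

```python
from importlib.util import find_spec

# Assumed top-level import names, inferred from the libraries this README mentions.
# requirements.txt remains the authoritative list.
ASSUMED_PACKAGES = ["sklearn", "torch", "xgboost"]


def missing_packages(names):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

If anything is reported as missing, re-running the pip command above inside the activated envML environment is the first thing to try.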

Index

Expand each dropdown for details.

Foundations

01_perceptron.py
  • Model: perceptron
  • Implementation: manual
  • Task: binary classification
  • Dataset: Iris
02_perceptron_sl.py
  • Model: perceptron
  • Implementation: scikit-learn
  • Task: multi-class classification
  • Dataset: Iris
03_adaline_gd.py
  • Model: adaline (gradient descent)
  • Implementation: manual
  • Task: binary classification
  • Dataset: Iris
04_adaline_sgd.py
  • Model: adaline (stochastic gradient descent)
  • Implementation: manual
  • Task: binary classification
  • Dataset: Iris
05_logistic_regression.py
  • Model: logistic regression
  • Implementation: manual
  • Task: binary classification
  • Dataset: Iris
06_logistic_regression_sl.py
  • Model: logistic regression
  • Implementation: scikit-learn
  • Task: multi-class classification
  • Dataset: Iris
07_support_vector_linear_sl.py
  • Model: support vector machine
  • Implementation: scikit-learn
  • Task: multi-class classification
  • Dataset: Iris
08_support_vector_kernel_sl.py
  • Model: kernel support vector machine
  • Implementation: scikit-learn
  • Task: multi-class classification
  • Dataset: Iris
09_decision_tree_sl.py
  • Model: decision tree
  • Implementation: scikit-learn
  • Task: multi-class classification
  • Dataset: Iris
10_random_forest_sl.py
  • Model: random forest
  • Implementation: scikit-learn
  • Task: multi-class classification
  • Dataset: Iris
11_k_nearest_sl.py
  • Model: k-nearest neighbors
  • Implementation: scikit-learn
  • Task: multi-class classification
  • Dataset: Iris
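
To give a sense of what a "manual" implementation in this group involves, here is a minimal sketch of the perceptron learning rule in plain Python. It is an illustration only, not the code in 01_perceptron.py: the function names, the 0/1 label encoding, and the four-point toy dataset (used in place of Iris) are all assumptions.

```python
def predict(x, w, b):
    """Threshold the net input: class 1 if w.x + b >= 0, otherwise class 0."""
    net_input = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if net_input >= 0.0 else 0


def train_perceptron(X, y, eta=0.1, epochs=10):
    """Perceptron rule: nudge the weights only when a sample is misclassified."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            update = eta * (target - predict(xi, w, b))  # zero when the prediction is correct
            w = [wj + update * xj for wj, xj in zip(w, xi)]
            b += update
    return w, b


# Hypothetical, linearly separable toy data (stand-in for two Iris classes).
X = [[-2.0, -1.0], [-1.0, -2.0], [1.0, 2.0], [2.0, 1.0]]
y = [0, 0, 1, 1]

w, b = train_perceptron(X, y)
predictions = [predict(xi, w, b) for xi in X]
print(predictions)
```

Because the update is zero for correctly classified samples, the weights stop changing once every training point is on the right side of the decision boundary, which only happens when the classes are linearly separable.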

Dimensionality Reduction

12_pca.py
  • Model: principal component analysis
  • Implementation: manual
  • Task: dimensionality reduction (unsupervised)
  • Dataset: Wine
13_pca_sl.py
  • Model: principal component analysis
  • Implementation: scikit-learn
  • Task: multi-class classification
  • Dataset: Wine
14_lda.py
  • Model: linear discriminant analysis
  • Implementation: manual
  • Task: multi-class classification
  • Dataset: Wine
15_lda_sl.py
  • Model: linear discriminant analysis
  • Implementation: scikit-learn
  • Task: multi-class classification
  • Dataset: Wine
16_tsde_sl.py
  • Model: t-distributed stochastic neighbor embedding
  • Implementation: scikit-learn
  • Task: dimensionality reduction (unsupervised)
  • Dataset: Digits

Pipelines, Validation & Hyperparameter Search

17_pipeline_sl.py
  • Model: pipeline
  • Implementation: scikit-learn
  • Task: binary (or multi-class) classification
  • Dataset: Wdbc
18_cross_val_sl.py
  • Model: cross-validation (method 1)
  • Implementation: scikit-learn
  • Task: binary (or multi-class) classification
  • Dataset: Wdbc
19_cross_val_sl.py
  • Model: cross-validation (method 2)
  • Implementation: scikit-learn
  • Task: binary (or multi-class) classification
  • Dataset: Wdbc
20_learning_curves_sl.py
  • Model: learning curves
  • Implementation: scikit-learn
  • Task: binary (or multi-class) classification
  • Dataset: Wdbc
21_validation_curves_sl.py
  • Model: validation curves
  • Implementation: scikit-learn
  • Task: binary (or multi-class) classification
  • Dataset: Wdbc
22_grid_search_sl.py
  • Model: grid search
  • Implementation: scikit-learn
  • Task: binary (or multi-class) classification
  • Dataset: Wdbc
23_random_search_sl.py
  • Model: random search
  • Implementation: scikit-learn
  • Task: binary (or multi-class) classification
  • Dataset: Wdbc
24_halving_random_search_sl.py
  • Model: halving random search
  • Implementation: scikit-learn
  • Task: binary (or multi-class) classification
  • Dataset: Wdbc
25_nested_cross_val_sl.py
  • Model: nested cross-validation
  • Implementation: scikit-learn
  • Task: binary (or multi-class) classification
  • Dataset: Wdbc
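
The cross-validation scripts in this group rely on splitting the data into folds. As a rough illustration of the idea, the sketch below builds unshuffled, unstratified k-fold index splits in plain Python. The helper `k_fold_indices` is hypothetical; the scripts themselves presumably use scikit-learn's own splitters, which also offer shuffling and stratification.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # Spread the remainder over the first `extra` folds.
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

Each sample lands in exactly one test fold, so averaging a score over the folds uses every observation for evaluation once and for training k - 1 times.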

Ensemble Methods

26_majority_vote_classifier.py
  • Model: majority vote classifier
  • Implementation: manual
  • Task: binary classification
  • Dataset: Iris
27_bagging_sl.py
  • Model: bagging
  • Implementation: scikit-learn
  • Task: binary classification
  • Dataset: Wine
28_ada_boost_sl.py
  • Model: adaboost
  • Implementation: scikit-learn
  • Task: binary classification
  • Dataset: Wine
29_xgboost_sl.py
  • Model: xgboost
  • Implementation: xgboost
  • Task: binary classification
  • Dataset: Wine
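
The core of a majority vote classifier is the rule that combines the base classifiers' predictions. The sketch below shows that rule for a single sample, with optional per-classifier weights. It is a simplified illustration, not the code in 26_majority_vote_classifier.py; the function name and the tie-breaking behaviour are assumptions.

```python
from collections import Counter


def majority_vote(predictions, weights=None):
    """Combine the labels predicted for one sample by (weighted) plurality."""
    if weights is None:
        weights = [1.0] * len(predictions)
    tally = Counter()
    for label, weight in zip(predictions, weights):
        tally[label] += weight
    # The label with the largest total weight wins; ties go to the label seen first.
    return max(tally, key=tally.get)


print(majority_vote([0, 1, 1]))
print(majority_vote([0, 1, 1], weights=[0.6, 0.2, 0.2]))
```

With equal weights, two of the three classifiers outvote the first. With the weights shown, the first classifier carries 0.6 of the total and overrides the other two.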

NLP & Topic Modeling

30_sentiment_analysis_sl.py
  • Application: sentiment analysis
  • Implementation: scikit-learn
  • Dataset: Imdb
31_sentiment_analysis_oocl_sl.py
  • Application: sentiment analysis (out-of-core learning)
  • Implementation: scikit-learn
  • Dataset: Imdb
32_latent_dirichlet_alloc_sl.py
  • Model: latent Dirichlet allocation
  • Implementation: scikit-learn
  • Task: topic modeling (unsupervised)
  • Dataset: Imdb

Regression Models

33_linear_regression_uni_gd.py
  • Model: linear regression (gradient descent)
  • Implementation: manual
  • Task: univariate regression
  • Dataset: Housing
34_linear_regression_uni_sl.py
  • Model: linear regression
  • Implementation: scikit-learn
  • Task: univariate regression
  • Dataset: Housing
35_ransac_regression_uni_sl.py
  • Model: ransac regression
  • Implementation: scikit-learn
  • Task: univariate regression
  • Dataset: Housing
36_linear_regression_mul_sl.py
  • Model: linear regression
  • Implementation: scikit-learn
  • Task: multivariate regression
  • Dataset: Housing
37_poly_regression_uni_sl.py
  • Model: polynomial regression
  • Implementation: scikit-learn
  • Task: univariate regression
  • Dataset: Housing
38_decision_tree_regression_sl.py
  • Model: decision tree regression
  • Implementation: scikit-learn
  • Task: univariate regression
  • Dataset: Housing
39_random_forest_regression_sl.py
  • Model: random forest regression
  • Implementation: scikit-learn
  • Task: univariate regression
  • Dataset: Housing
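
For readers new to the gradient descent variant listed first in this group, the sketch below fits a univariate linear model by batch gradient descent on the mean squared error. It is an illustration under stated assumptions, not the code in 33_linear_regression_uni_gd.py: the function name, the learning rate, and the noise-free toy data (used in place of Housing) are all hypothetical.

```python
def fit_linear_gd(x, y, eta=0.05, epochs=2000):
    """Fit y ~ w * x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        errors = [(w * xi + b) - yi for xi, yi in zip(x, y)]
        grad_w = 2.0 / n * sum(e * xi for e, xi in zip(errors, x))  # dMSE/dw
        grad_b = 2.0 / n * sum(errors)                              # dMSE/db
        w -= eta * grad_w
        b -= eta * grad_b
    return w, b


# Hypothetical toy data generated from y = 2x + 1 with no noise.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_linear_gd(x, y)
print(round(w, 4), round(b, 4))
```

The learning rate has to suit the scale of the feature: on unscaled real data such as house prices, a rate this large would likely diverge, which is one reason features are usually standardized before gradient descent.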

Clustering

40_k_means_clustering_sl.py
  • Model: k-means clustering with k-means++ initialization
  • Implementation: scikit-learn
  • Task: clustering
  • Dataset: Synthetic
41_hierarchical_clustering.py
  • Model: complete linkage agglomerative hierarchical clustering
  • Implementation: manual
  • Task: clustering
  • Dataset: Synthetic
42_hierarchical_clustering_sl.py
  • Model: complete linkage agglomerative hierarchical clustering
  • Implementation: scikit-learn
  • Task: clustering
  • Dataset: Synthetic
43_density_clustering_sl.py
  • Model: dbscan clustering
  • Implementation: scikit-learn
  • Task: clustering
  • Dataset: Synthetic

Neural Networks

44_multilayer_perceptron.py
  • Model: multilayer perceptron
  • Implementation: manual
  • Task: multi-class classification
  • Dataset: Mnist
45_pytorch_basics.py
  • Learning: PyTorch basics
  • Dataset: Cats and Dogs; CelebA; Mnist
46_pytorch_mechanics.py
  • Learning: PyTorch mechanics
47_linear_regression_uni_sgd.py
  • Model: linear regression (stochastic gradient descent)
  • Implementation: manual/pytorch
  • Task: univariate regression
  • Dataset: Synthetic
48_linear_regression_uni_sgd_pt.py
  • Model: linear regression (stochastic gradient descent)
  • Implementation: pytorch
  • Task: univariate regression
  • Dataset: Synthetic
49_multilayer_perceptron_pt.py
  • Model: multilayer perceptron (nn.Module)
  • Implementation: pytorch
  • Task: multi-class classification
  • Dataset: Iris
50_multilayer_perceptron_pt.py
  • Model: multilayer perceptron (nn.Sequential)
  • Implementation: pytorch
  • Task: binary classification
  • Dataset: Synthetic
51_multilayer_perceptron_pt.py
  • Model: multilayer perceptron (nn.Module)
  • Implementation: pytorch
  • Task: binary classification
  • Dataset: Synthetic
52_multilayer_perceptron_pt.py
  • Model: multilayer perceptron (nn.Module with custom layer)
  • Implementation: pytorch
  • Task: binary classification
  • Dataset: Synthetic
53_fuel_efficiency_pt.py
  • Application: predicting fuel efficiency
  • Implementation: pytorch
  • Dataset: Auto MPG
54_handwritten_digits_pt.py
  • Application: classifying handwritten digits
  • Implementation: pytorch
  • Dataset: Mnist
55_handwritten_digits_pl.py
  • Application: classifying handwritten digits
  • Implementation: pytorch lightning
  • Dataset: Mnist

Sources