Author: Nicolo Ceneda
Contact: n.ceneda20@imperial.ac.uk
Website: nicoloceneda.github.io
Institution: Imperial College London
Course: PhD in Finance
This repository is a collection of ready-to-run machine learning and deep learning examples. Each script focuses on a specific model or method, including implementations from scratch, scikit-learn, and PyTorch workflows.
Create and activate a virtual environment:
conda create -n envML python=3.12 -y
conda activate envML
Install dependencies from requirements.txt:
pip install --upgrade pip
pip install -r requirements.txt
Expand the dropdown windows for details.
01_perceptron.py
- Model: perceptron
- Implementation: manual
- Task: binary classification
- Dataset: Iris
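For orientation, a from-scratch perceptron for binary classification can be sketched as below. This is a minimal illustration, not the code in 01_perceptron.py: the class name, the hyperparameters (`eta`, `n_iter`, `seed`), and the synthetic two-cluster data used in place of Iris are all assumptions made for the example.

```python
import numpy as np


class Perceptron:
    """Rosenblatt perceptron for labels in {0, 1} (illustrative sketch)."""

    def __init__(self, eta=0.1, n_iter=20, seed=1):
        self.eta = eta          # learning rate (assumed value)
        self.n_iter = n_iter    # passes over the training set (assumed value)
        self.seed = seed

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        self.w_ = rng.normal(0.0, 0.01, size=X.shape[1])  # small random weights
        self.b_ = 0.0
        self.errors_ = []  # number of misclassifications per epoch
        for _ in range(self.n_iter):
            errors = 0
            for xi, target in zip(X, y):
                # The update is non-zero only when the prediction is wrong
                update = self.eta * (target - self.predict(xi))
                self.w_ += update * xi
                self.b_ += update
                errors += int(update != 0.0)
            self.errors_.append(errors)
        return self

    def predict(self, X):
        # Unit step function applied to the net input
        return np.where(np.dot(X, self.w_) + self.b_ >= 0.0, 1, 0)


# Synthetic, linearly separable stand-in for two Iris classes (hypothetical data)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(50, 2)),
               rng.normal(2.0, 0.5, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

model = Perceptron().fit(X, y)
accuracy = float(np.mean(model.predict(X) == y))
print(f"Training accuracy: {accuracy:.2f}")
```

The perceptron rule only converges when the two classes are linearly separable, which is why the sketch uses two well-separated clusters.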
02_perceptron_sl.py
- Model: perceptron
- Implementation: scikit-learn
- Task: multi-class classification
- Dataset: Iris
03_adaline_gd.py
- Model: adaline (gradient descent)
- Implementation: manual
- Task: binary classification
- Dataset: Iris
04_adaline_sgd.py
- Model: adaline (stochastic gradient descent)
- Implementation: manual
- Task: binary classification
- Dataset: Iris
05_logistic_regression.py
- Model: logistic regression
- Implementation: manual
- Task: binary classification
- Dataset: Iris
06_logistic_regression_sl.py
- Model: logistic regression
- Implementation: scikit-learn
- Task: multi-class classification
- Dataset: Iris
07_support_vector_linear_sl.py
- Model: support vector machine
- Implementation: scikit-learn
- Task: multi-class classification
- Dataset: Iris
08_support_vector_kernel_sl.py
- Model: kernel support vector machine
- Implementation: scikit-learn
- Task: multi-class classification
- Dataset: Iris
09_decision_tree_sl.py
- Model: decision tree
- Implementation: scikit-learn
- Task: multi-class classification
- Dataset: Iris
10_random_forest_sl.py
- Model: random forest
- Implementation: scikit-learn
- Task: multi-class classification
- Dataset: Iris
11_k_nearest_sl.py
- Model: k-nearest neighbors
- Implementation: scikit-learn
- Task: multi-class classification
- Dataset: Iris
12_pca.py
- Model: principal component analysis
- Implementation: manual
- Task: dimensionality reduction (unsupervised)
- Dataset: Wine
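As a rough picture of what a manual PCA involves, the sketch below standardizes the features, eigendecomposes the covariance matrix, and projects onto the top two components. It is not the code in 12_pca.py: the synthetic data (used here instead of Wine), the variable names, and the choice of two components are assumptions for illustration.

```python
import numpy as np

# Synthetic stand-in for the Wine features: 200 samples, 5 correlated columns
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 5))

# 1. Standardize each feature to zero mean and unit variance
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Eigendecomposition of the (symmetric) covariance matrix
cov = np.cov(X_std, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

# 3. Sort eigenpairs by decreasing eigenvalue (eigh returns ascending order)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 4. Explained variance ratio of each principal component
explained = eigvals / eigvals.sum()

# 5. Project onto the top two principal components
W = eigvecs[:, :2]
X_pca = X_std @ W

print("Explained variance ratio:", np.round(explained, 3))
print("Projected shape:", X_pca.shape)
```

`np.linalg.eigh` is used rather than `np.linalg.eig` because the covariance matrix is symmetric, so the eigenvalues come back real.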
13_pca_sl.py
- Model: principal component analysis
- Implementation: scikit-learn
- Task: multi-class classification
- Dataset: Wine
14_lda.py
- Model: linear discriminant analysis
- Implementation: manual
- Task: multi-class classification
- Dataset: Wine
15_lda_sl.py
- Model: linear discriminant analysis
- Implementation: scikit-learn
- Task: multi-class classification
- Dataset: Wine
16_tsde_sl.py
- Model: t-distributed stochastic neighbor embedding
- Implementation: scikit-learn
- Task: dimensionality reduction (unsupervised)
- Dataset: Digits
17_pipeline_sl.py
- Model: pipeline
- Implementation: scikit-learn
- Task: binary (or multi-class) classification
- Dataset: Wdbc
18_cross_val_sl.py
- Model: cross-validation (method 1)
- Implementation: scikit-learn
- Task: binary (or multi-class) classification
- Dataset: Wdbc
19_cross_val_sl.py
- Model: cross-validation (method 2)
- Implementation: scikit-learn
- Task: binary (or multi-class) classification
- Dataset: Wdbc
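The pipeline and cross-validation entries above (17 to 19) can be pictured together with the sketch below. It does not reproduce those scripts, and which two cross-validation methods they use is not stated here. The sketch assumes scikit-learn's bundled `load_breast_cancer` data (the Wisconsin Diagnostic Breast Cancer set), a scaler plus logistic regression as the pipeline, and 10 stratified folds scored with `cross_val_score`.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# WDBC features and binary labels as shipped with scikit-learn
X, y = load_breast_cancer(return_X_y=True)

# Chaining the scaler and the classifier means the scaler is refit on the
# training part of every fold, so no test-fold statistics leak into training
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Stratified folds keep the class proportions similar across folds
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
scores = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")

print(f"CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Passing the whole pipeline to `cross_val_score`, rather than pre-scaling `X`, is the design choice that keeps the evaluation honest.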
20_learning_curves_sl.py
- Model: learning curves
- Implementation: scikit-learn
- Task: binary (or multi-class) classification
- Dataset: Wdbc
21_validation_curves_sl.py
- Model: validation curves
- Implementation: scikit-learn
- Task: binary (or multi-class) classification
- Dataset: Wdbc
22_grid_search_sl.py
- Model: grid search
- Implementation: scikit-learn
- Task: binary (or multi-class) classification
- Dataset: Wdbc
23_random_search_sl.py
- Model: random search
- Implementation: scikit-learn
- Task: binary (or multi-class) classification
- Dataset: Wdbc
24_halving_random_search_sl.py
- Model: halving random search
- Implementation: scikit-learn
- Task: binary (or multi-class) classification
- Dataset: Wdbc
25_nested_cross_val_sl.py
- Model: nested cross-validation
- Implementation: scikit-learn
- Task: binary (or multi-class) classification
- Dataset: Wdbc
26_majority_vote_classifier.py
- Model: majority vote classifier
- Implementation: manual
- Task: binary classification
- Dataset: Iris
27_bagging_sl.py
- Model: bagging
- Implementation: scikit-learn
- Task: binary classification
- Dataset: Wine
28_ada_boost_sl.py
- Model: adaboost
- Implementation: scikit-learn
- Task: binary classification
- Dataset: Wine
29_xgboost_sl.py
- Model: xgboost
- Implementation: xgboost
- Task: binary classification
- Dataset: Wine
30_sentiment_analysis_sl.py
- Application: sentiment analysis
- Implementation: scikit-learn
- Dataset: Imdb
31_sentiment_analysis_oocl_sl.py
- Application: sentiment analysis (out-of-core learning)
- Implementation: scikit-learn
- Dataset: Imdb
32_latent_dirichlet_alloc_sl.py
- Model: latent Dirichlet allocation
- Implementation: scikit-learn
- Task: topic modeling (unsupervised)
- Dataset: Imdb
33_linear_regression_uni_gd.py
- Model: linear regression (gradient descent)
- Implementation: manual
- Task: univariate regression
- Dataset: Housing
34_linear_regression_uni_sl.py
- Model: linear regression
- Implementation: scikit-learn
- Task: univariate regression
- Dataset: Housing
35_ransac_regression_uni_sl.py
- Model: ransac regression
- Implementation: scikit-learn
- Task: univariate regression
- Dataset: Housing
36_linear_regression_mul_sl.py
- Model: linear regression
- Implementation: scikit-learn
- Task: multivariate regression
- Dataset: Housing
37_poly_regression_uni_sl.py
- Model: polynomial regression
- Implementation: scikit-learn
- Task: univariate regression
- Dataset: Housing
38_decision_tree_regression_sl.py
- Model: decision tree regression
- Implementation: scikit-learn
- Task: univariate regression
- Dataset: Housing
39_random_forest_regression_sl.py
- Model: random forest regression
- Implementation: scikit-learn
- Task: univariate regression
- Dataset: Housing
40_k_means_clustering_sl.py
- Model: k-means clustering with k-means++ initialization
- Implementation: scikit-learn
- Task: clustering
- Dataset: Synthetic
41_hierarchical_clustering.py
- Model: complete linkage agglomerative hierarchical clustering
- Implementation: manual
- Task: clustering
- Dataset: Synthetic
42_hierarchical_clustering_sl.py
- Model: complete linkage agglomerative hierarchical clustering
- Implementation: scikit-learn
- Task: clustering
- Dataset: Synthetic
43_density_clustering_sl.py
- Model: dbscan clustering
- Implementation: scikit-learn
- Task: clustering
- Dataset: Synthetic
44_multilayer_perceptron.py
- Model: multilayer perceptron
- Implementation: manual
- Task: multi-class classification
- Dataset: Mnist
45_pytorch_basics.py
- Learning: PyTorch basics
- Dataset: Cats and Dogs; CelebA; Mnist
46_pytorch_mechanics.py
- Learning: PyTorch mechanics
47_linear_regression_uni_sgd.py
- Model: linear regression (stochastic gradient descent)
- Implementation: manual/pytorch
- Task: univariate regression
- Dataset: Synthetic
48_linear_regression_uni_sgd_pt.py
- Model: linear regression (stochastic gradient descent)
- Implementation: pytorch
- Task: univariate regression
- Dataset: Synthetic
49_multilayer_perceptron_pt.py
- Model: multilayer perceptron (nn.Module)
- Implementation: pytorch
- Task: multi-class classification
- Dataset: Iris
50_multilayer_perceptron_pt.py
- Model: multilayer perceptron (nn.Sequential)
- Implementation: pytorch
- Task: binary classification
- Dataset: Synthetic
51_multilayer_perceptron_pt.py
- Model: multilayer perceptron (nn.Module)
- Implementation: pytorch
- Task: binary classification
- Dataset: Synthetic
52_multilayer_perceptron_pt.py
- Model: multilayer perceptron (nn.Module with custom layer)
- Implementation: pytorch
- Task: binary classification
- Dataset: Synthetic
53_fuel_efficiency_pt.py
- Application: predicting fuel efficiency
- Implementation: pytorch
- Dataset: Auto MPG
54_handwritten_digits_pt.py
- Application: classifying handwritten digits
- Implementation: pytorch
- Dataset: Mnist
55_handwritten_digits_pl.py
- Application: classifying handwritten digits
- Implementation: pytorch lightning
- Dataset: Mnist
Sources
- Machine Learning with PyTorch and Scikit-Learn, Sebastian Raschka
