Author: Prashanna Raj Pandit
This repo contains all four major labs completed as part of the course "Machine Learning" (STAT-562). Each lab focuses on a different machine learning technique, giving hands-on experience with supervised and unsupervised learning in R. STAT-562 centers on applying classical statistical learning methods to real datasets. Across the four labs, we explore:
- 🔹 Data preprocessing & feature engineering
- 🔹 Classification models (k-NN, LDA, QDA, Naive Bayes)
- 🔹 Unsupervised learning (hierarchical & K-means clustering)
- 🔹 Ensemble methods (Bagging, Boosting, Random Forest)
- 🔹 Model evaluation using accuracy, ROC, confusion matrices, RMSE
- 🔹 Cross-validation & hyperparameter tuning using `caret`
Each project builds practical intuition and technical skills for applying statistical models to real-world data.
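The cross-validation and tuning workflow above can be sketched with `caret`. This is a minimal illustration, not lab code: it tunes k for k-NN on the built-in `iris` dataset as a stand-in for the actual lab data.

```r
# Sketch: 10-fold cross-validation with hyperparameter tuning via caret.
# iris is a stand-in dataset; the lab datasets differ.
library(caret)

set.seed(562)
ctrl <- trainControl(method = "cv", number = 10)

# Tune k over a small grid; caret keeps the k with the best
# cross-validated accuracy.
knn_fit <- train(Species ~ ., data = iris,
                 method = "knn",
                 preProcess = c("center", "scale"),  # k-NN is distance-based
                 trControl = ctrl,
                 tuneGrid = expand.grid(k = seq(3, 15, by = 2)))

knn_fit$bestTune  # the selected k
```

Centering and scaling matter here because k-NN distances are dominated by whichever feature has the largest raw scale.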
This project builds and evaluates multiple machine learning models to predict breast cancer status (Cancer vs. Control) from routine blood-based metabolic biomarkers and anthropometric measures, rather than imaging or genetic tests.
Models compared:
- Naive Bayes
- Linear Discriminant Analysis (LDA)
- k-NN (with tuned k)
- Random Forest
- Gradient Boosting
- Support Vector Machine (SVM)
- Deep Neural Network (DNN)
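A fair comparison of models like those above requires fitting each one on the same resampling folds. The sketch below shows one way to do that with `caret`; it uses `twoClassSim()` to simulate a two-class dataset as a placeholder for the actual biomarker data, and covers only three of the seven models.

```r
# Sketch: compare several models on identical cross-validation folds.
# twoClassSim() generates simulated data standing in for the real dataset.
library(caret)

set.seed(562)
df <- twoClassSim(200)  # outcome column is `Class`

ctrl <- trainControl(method = "cv", number = 5, classProbs = TRUE,
                     summaryFunction = twoClassSummary)

fits <- list(
  nb  = train(Class ~ ., data = df, method = "naive_bayes",
              trControl = ctrl, metric = "ROC"),
  lda = train(Class ~ ., data = df, method = "lda",
              trControl = ctrl, metric = "ROC"),
  rf  = train(Class ~ ., data = df, method = "rf",
              trControl = ctrl, metric = "ROC")
)

# set.seed before the first train() call keeps the folds aligned, so the
# resampled ROC/Sens/Spec distributions are directly comparable.
summary(resamples(fits))
```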
| Model | Accuracy | Sensitivity | Specificity | F1 Score | AUC | TP | TN | FP | FN |
|---|---|---|---|---|---|---|---|---|---|
| Naive Bayes | 0.70 | 0.77 | 0.60 | 0.74 | 0.70 | 10 | 6 | 4 | 3 |
| LDA | 0.78 | 0.85 | 0.70 | 0.82 | 0.80 | 11 | 7 | 3 | 2 |
| k-NN (tuned k) | 0.78 | 0.77 | 0.80 | 0.80 | 0.81 | 10 | 8 | 2 | 3 |
| Random Forest | 0.87 | 0.85 | 0.90 | 0.88 | 0.91 | 11 | 9 | 1 | 2 |
| Gradient Boosting | 0.83 | 0.77 | 0.90 | 0.83 | 0.89 | 10 | 9 | 1 | 3 |
| SVM | 0.78 | 0.69 | 0.90 | 0.78 | 0.85 | 9 | 9 | 1 | 4 |
| Deep NN | 0.74 | 0.77 | 0.70 | 0.77 | 0.75 | 10 | 7 | 3 | 3 |
Table 1. Test-set performance for each model. Accuracy, Sensitivity (TPR), Specificity (TNR), F1, and AUC are shown, along with confusion matrix counts (TP, TN, FP, FN).
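Each metric in Table 1 follows directly from the confusion-matrix counts. As a sanity check, the base-R snippet below recomputes the Random Forest row (TP=11, TN=9, FP=1, FN=2):

```r
# Derive the table's metrics from confusion-matrix counts.
metrics <- function(tp, tn, fp, fn) {
  c(accuracy    = (tp + tn) / (tp + tn + fp + fn),
    sensitivity = tp / (tp + fn),              # TPR
    specificity = tn / (tn + fp),              # TNR
    f1          = 2 * tp / (2 * tp + fp + fn))
}

# Random Forest row: matches Table 1
round(metrics(tp = 11, tn = 9, fp = 1, fn = 2), 2)
# accuracy 0.87, sensitivity 0.85, specificity 0.90, f1 0.88
```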