Breast Cancer Analysis and Prediction

📌 Overview

This project explores the Breast Cancer Wisconsin (Diagnostic) dataset through exploratory data analysis, dimensionality reduction with PCA, and supervised classification models that predict breast cancer diagnosis.

🚩 Project Structure

The project contains three main Jupyter notebooks:

Exploratory Data Analysis (EDA.ipynb)
- Detailed exploratory analysis of dataset features.
- Data visualization, correlation analysis, and feature distribution analysis.
Unsupervised Learning: PCA (unsupervised_learning.ipynb)
- Principal Component Analysis (PCA) applied to reduce dimensionality.
- Analysis of variance explained by principal components.
- Visualization of PCA components.
Supervised Learning: Classification (supervised_learning.ipynb)
- Logistic regression and other classification models for predicting diagnosis.
- Performance evaluation using accuracy, precision, recall, and F1-score.
- Optimization via hyperparameter tuning.
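
As a rough illustration of what the PCA and classification notebooks describe, the sketch below standardizes the features, inspects the variance explained by the first principal components, and tunes a logistic regression with a small grid search. It is not the notebooks' actual code: the use of scikit-learn's bundled copy of the dataset, the 80/20 split, the five components, the `C` grid, and the macro-F1 scoring are all assumptions chosen for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled copy stands in for the Kaggle CSV.
X, y = load_breast_cancer(return_X_y=True)  # y: 0 = malignant, 1 = benign
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# PCA is scale-sensitive, so standardize before fitting it.
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=5).fit(scaler.transform(X_train))
cumulative_variance = pca.explained_variance_ratio_.cumsum()
print("Cumulative explained variance:", cumulative_variance)

# Logistic regression in a pipeline, so scaling is refit inside each CV fold.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000))
param_grid = {"logisticregression__C": [0.01, 0.1, 1, 10]}  # illustrative grid
grid = GridSearchCV(pipe, param_grid, cv=5, scoring="f1_macro")
grid.fit(X_train, y_train)

# Evaluate the tuned model on the held-out test set.
y_pred = grid.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print("Best parameters:", grid.best_params_)
print(classification_report(y_test, y_pred, target_names=["malignant", "benign"]))
```

Keeping the scaler inside the pipeline avoids leaking test-fold statistics into cross-validation, which matters when comparing hyperparameter settings.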

🗃️ Dataset

This project uses the Breast Cancer Wisconsin (Diagnostic) dataset: 569 samples, each with 30 numeric features computed from digitized images of fine needle aspirates of breast masses. The features describe characteristics of the cell nuclei, and each sample is labeled as a malignant or benign tumor.
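
For a quick look at the data without downloading anything, the sketch below loads the copy of this dataset that ships with scikit-learn. This is an assumption for illustration: the notebooks may instead read the Kaggle CSV, whose column names and label encoding differ from the scikit-learn version.

```python
from sklearn.datasets import load_breast_cancer

# Assumption: scikit-learn's bundled copy stands in for the Kaggle CSV.
data = load_breast_cancer(as_frame=True)
X = data.data      # DataFrame with 30 numeric features
y = data.target    # Series encoded as 0 = malignant, 1 = benign

print(X.shape)
class_counts = y.map({0: "malignant", 1: "benign"}).value_counts()
print(class_counts)

# Features ranked by absolute correlation with the diagnosis label.
corr_with_target = X.corrwith(y).abs().sort_values(ascending=False)
print(corr_with_target.head())
```

The class counts show a moderate imbalance toward benign cases, which is one reason to report precision, recall, and F1-score alongside accuracy.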

🛠️ Libraries & Tools

Data manipulation: Pandas, NumPy
Visualization: Matplotlib, Seaborn
Machine Learning: Scikit-learn, Statsmodels
Interactive Environment: Jupyter Notebook

🎯 Project Goals

Understand and visualize breast cancer data.
Perform feature selection and dimensionality reduction.
Develop accurate predictive models for diagnosis classification.
Clearly communicate results through visual and statistical summaries.

🚀 Getting Started

Installation

Clone this repository:

git clone https://github.com/v4nui/breast-cancer-classification.git
cd breast-cancer-classification

Set up the environment

pip install -r requirements.txt

Run Jupyter Notebook

jupyter notebook

📚 Resources

Breast Cancer Wisconsin (Diagnostic) Dataset on Kaggle
Scikit-learn Documentation
PCA Explained
Logistic Regression In-depth Guide
Ironhack learning materials
StatQuest YouTube Channel
OpenAI ChatGPT

📧 Contact

For questions or feedback, please reach out to vanuhi@live.com.
