Heart Attack Risk Prediction

This project predicts the risk of a heart attack with machine learning. It focuses on the k-nearest neighbors (KNN) algorithm, trained on a dataset that has been balanced using the Synthetic Minority Over-sampling Technique (SMOTE).

Introduction

Heart disease is a leading cause of death globally, and early prediction and diagnosis can help reduce the risk of severe outcomes. This project uses machine learning to estimate the likelihood of a heart attack from a set of health metrics.

Dataset

The dataset used in this project contains the following features:

Age
Total Cholesterol (totChol)
Systolic Blood Pressure (sysBP)
Diastolic Blood Pressure (diaBP)
Body Mass Index (BMI)
Heart Rate
Glucose

The target variable is TenYearCHD, a binary label indicating whether the individual is at risk of coronary heart disease (CHD) within ten years.

Feature Selection

Feature selection was performed using the Boruta algorithm to identify the most significant features for prediction. The selected features are:

Age
Total Cholesterol
Systolic Blood Pressure
Diastolic Blood Pressure
Body Mass Index
Heart Rate
Glucose
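
To illustrate the idea behind Boruta, the sketch below compares each feature against shuffled "shadow" copies of the features. It is not the project's code: Boruta uses random forest importances and a statistical test over many iterations, while this sketch substitutes absolute Pearson correlation as the importance score so that it runs without extra dependencies. The function names, the toy data, and the number of rounds are assumptions made for illustration.

```python
import random

def pearson_abs(xs, ys):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5

def shadow_feature_hits(columns, target, n_rounds=50, seed=0):
    """Count, per feature, how often it beats the best shuffled (shadow) copy."""
    rng = random.Random(seed)
    # Importance of the real features (stand-in for random forest importance).
    real_scores = {name: pearson_abs(values, target)
                   for name, values in columns.items()}
    hits = {name: 0 for name in columns}
    for _ in range(n_rounds):
        # Shadow features: each column shuffled to break any link to the target.
        shadow_scores = []
        for values in columns.values():
            shuffled = values[:]
            rng.shuffle(shuffled)
            shadow_scores.append(pearson_abs(shuffled, target))
        threshold = max(shadow_scores)
        for name, score in real_scores.items():
            if score > threshold:
                hits[name] += 1
    return hits

# Toy data (made up): "age" drives the label, "noise" does not.
rng = random.Random(1)
age = [rng.uniform(30, 70) for _ in range(200)]
noise = [rng.uniform(0, 1) for _ in range(200)]
label = [1 if a > 55 else 0 for a in age]

hits = shadow_feature_hits({"age": age, "noise": noise}, label)
print(hits)
```

A feature that beats the best shadow in nearly every round is a candidate to keep; one that rarely does is a candidate to drop.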

Data Balancing

The dataset was imbalanced, so SMOTE (Synthetic Minority Over-sampling Technique) was applied to balance it. SMOTE creates synthetic minority-class samples by interpolating between existing ones, which gives the model more balanced training data.
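
The interpolation step at the core of SMOTE can be sketched in a few lines. This is a simplified illustration, not the imbalanced-learn implementation the project depends on; the function name `smote_sketch`, its parameters, and the sample values are assumptions.

```python
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the chosen point (excluding itself).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # New point lies at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy minority class: (age, sysBP) pairs; values are made up for illustration.
minority = [[61, 150.0], [64, 162.0], [58, 171.0], [67, 180.0]]
new_points = smote_sketch(minority, n_new=6)
print(len(minority) + len(new_points))  # 10 minority samples after oversampling
```

Because every synthetic point lies on a segment between two real minority samples, the new data stays inside the region the minority class already occupies.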

Model Training

The k-nearest neighbors (KNN) algorithm was used for training the model. GridSearchCV was utilized to find the best hyperparameters for the KNN model.
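
A minimal sketch of a KNN hyperparameter search with GridSearchCV might look like the following. It runs on synthetic data rather than the project's dataset, and the parameter grid, the train/test split, and the use of a `Pipeline` to combine scaling with the classifier are assumptions; the notebook's actual search space is not stated here.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the seven selected features (not the project's data).
X, y = make_classification(n_samples=400, n_features=7, n_informative=4,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

# Scaling inside the pipeline means each CV fold fits its own scaler,
# so no information from the validation fold leaks into training.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Hypothetical grid, chosen only for illustration.
param_grid = {
    "knn__n_neighbors": [3, 5, 7, 9, 11],
    "knn__weights": ["uniform", "distance"],
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```

KNN is distance-based, so features on larger scales (such as cholesterol versus BMI) would dominate the distance without scaling, which is why the scaler is part of the searched pipeline in this sketch.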

Model Evaluation

The model was evaluated using accuracy and a confusion matrix to understand its performance.
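
As a small worked example of how accuracy relates to the confusion matrix, the sketch below counts the four cells by hand. The labels are made up for illustration and the helper name `confusion_matrix_2x2` is hypothetical; it is not taken from the notebook.

```python
def confusion_matrix_2x2(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels where 1 is the positive class."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp

# Made-up labels for illustration: 1 = at risk, 0 = not at risk.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

tn, fp, fn, tp = confusion_matrix_2x2(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn)

print(tn, fp, fn, tp)    # 5 1 2 2
print(accuracy, recall)  # 0.7 0.5
```

In this toy case the accuracy is 70%, yet half of the at-risk individuals are missed. The confusion matrix makes that kind of gap visible, which accuracy alone does not.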

Usage

To use this project, follow these steps:

Clone the repository:

```
git clone https://github.com/yourusername/heart-attack-risk-prediction.git
```

Install the required dependencies:
```
pip install -r requirements.txt
```
Run the Jupyter notebook (heart_attack_risk_prediction.ipynb) to train the model and make predictions.

Example Predictions

To predict the risk of a heart attack for a high-risk individual:

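```
# Values are assumed to follow the feature order listed under Dataset:
# age, totChol, sysBP, diaBP, BMI, heart rate, glucose.
# `scaler` and `knn_clf_best` are the fitted objects from the notebook.
h_risk = [[65, 150, 180, 70, 26.97, 80, 77]]
prediction_risk = knn_clf_best.predict(scaler.transform(h_risk))
print('You are safe. 😊' if prediction_risk[0] == 0 else 'Sorry, you are at risk. 👽')
```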

To predict the risk of a heart attack for a low-risk individual:

```
# Same feature order and fitted `scaler` / `knn_clf_best` as in the notebook.
h_safe = [[39, 195, 106, 70, 26.97, 80, 77]]
prediction_safe = knn_clf_best.predict(scaler.transform(h_safe))
print('You are safe. 😊' if prediction_safe[0] == 0 else 'Sorry, you are at risk. 👽')
```

Dependencies

Python 3.x
pandas
numpy
seaborn
matplotlib
scikit-learn
imbalanced-learn
statsmodels

Results

The KNN model achieved an accuracy of approximately 85.59%. The confusion matrix provides further insight into the model's performance.

Contributing

Contributions are welcome! Please fork the repository and create a pull request with your changes.

License

This project is licensed under the MIT License. See the LICENSE file for more details.
