Hybrid Explainable Framework for Stroke Prediction is an AI-powered system for early stroke prediction that combines machine learning, deep learning, and explainable AI techniques. The framework enhances predictive accuracy while ensuring clinical interpretability for healthcare applications.
⚠️ Notice: This project is currently under publication. Only the web-based interface and essential components are included for demonstration purposes. The full model architecture, dataset preprocessing scripts, and training configurations will be released post-publication.
The dataset exhibits significant class imbalance with only 4.9% stroke-positive cases, necessitating advanced resampling techniques for robust model training.
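The README does not specify which resampling technique is used, so the sketch below illustrates the idea with simple random oversampling in NumPy; the project may instead use a synthetic approach such as SMOTE from imbalanced-learn, which generates new minority samples rather than duplicating existing ones.

```python
import numpy as np

def random_oversample(X, y, target_label=1, seed=0):
    """Duplicate minority-class rows until both classes are equally sized.

    A minimal stand-in for more advanced resamplers such as SMOTE.
    """
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == target_label)
    majority = np.flatnonzero(y != target_label)
    # Draw extra minority indices with replacement up to the majority count.
    extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
    idx = np.concatenate([majority, minority, extra])
    rng.shuffle(idx)
    return X[idx], y[idx]

# Toy data mirroring the dataset's ~4.9% stroke-positive rate (5 of 100).
X = np.arange(200).reshape(100, 2).astype(float)
y = np.array([1] * 5 + [0] * 95)
X_bal, y_bal = random_oversample(X, y)
```

After resampling, both classes contribute equally to the training signal, which prevents a classifier from trivially predicting "no stroke" for every case.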
Age Analysis: Stroke-positive individuals show distinct age distribution patterns, with higher risk observed in older demographic groups.
BMI Analysis: Body Mass Index distributions reveal subtle differences between stroke and non-stroke cases, informing feature engineering strategies.
Glucose Level Analysis: Average glucose levels differ significantly between stroke-positive and stroke-negative cases, highlighting glucose as a key clinical predictor.
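The age, BMI, and glucose comparisons above can be reproduced with a simple pandas group-by. The column names (`age`, `avg_glucose_level`, `bmi`, `stroke`) and values here are illustrative stand-ins for the real dataset's fields:

```python
import pandas as pd

# Toy records with hypothetical clinical columns; the real dataset is larger.
df = pd.DataFrame({
    "age":               [67, 80, 49, 31, 22, 58, 74, 45],
    "avg_glucose_level": [228.7, 105.9, 171.2, 95.1, 89.0, 210.4, 190.3, 92.5],
    "bmi":               [36.6, 32.5, 34.4, 28.1, 24.0, 29.9, 27.3, 26.2],
    "stroke":            [1, 1, 1, 0, 0, 1, 1, 0],
})

# Summarize each feature's distribution per outcome group.
summary = df.groupby("stroke")[["age", "avg_glucose_level", "bmi"]].agg(["mean", "median"])
print(summary)
```

On real data, a table like this makes the age and glucose gaps between the two groups immediately visible, motivating their use as primary predictors.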
Multi-faceted evaluation of 12 machine learning algorithms, demonstrating performance trade-offs across accuracy, F1-score, precision, recall, ROC-AUC, and PR-AUC metrics.
Receiver Operating Characteristic (ROC) curves showing strong discriminative power for all evaluated models, with consistent performance across classification thresholds and high area-under-curve values.
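The exact evaluation harness is not included in this demo release, but scikit-learn's `cross_validate` with multiple scorers is one way to collect all six metrics per model; note that scikit-learn's `average_precision` scorer corresponds to PR-AUC. The two models and synthetic data below are placeholders for the project's 12 algorithms:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate

# Imbalanced synthetic data standing in for the stroke dataset.
X, y = make_classification(n_samples=400, weights=[0.9, 0.1], random_state=0)

scoring = ["accuracy", "f1", "precision", "recall", "roc_auc", "average_precision"]
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in {"logreg": LogisticRegression(max_iter=1000),
                    "random_forest": RandomForestClassifier(random_state=0)}.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    results[name] = {m: scores[f"test_{m}"].mean() for m in scoring}
```

Stratified folds keep the rare positive class represented in every split, which matters when positives are only ~5% of the data.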
Combined SHAP and LIME analysis revealing key clinical features contributing to stroke prediction, with age and glucose levels emerging as dominant risk factors in model decision-making.
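The project's interpretability analysis uses SHAP and LIME, whose full pipelines require their own libraries. As a dependency-free illustration of the same idea, ranking features by their contribution to predictions, the sketch below uses scikit-learn's permutation importance on synthetic data where, by construction, the first two columns carry the signal (playing the role of age and glucose):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# With shuffle=False the two informative features occupy columns 0 and 1.
X, y = make_classification(n_samples=500, n_features=5, n_informative=2,
                           n_redundant=0, shuffle=False, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

# Score drop when a feature is shuffled = that feature's importance.
imp = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
ranking = np.argsort(imp.importances_mean)[::-1]
```

SHAP and LIME go further by attributing each individual prediction to its features, which is what enables the per-patient explanations described above.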
Our framework implements a comprehensive workflow:
- Data Preprocessing: Handling class imbalance, missing values, and feature normalization
- Feature Engineering: Clinical feature transformation and selection
- Hybrid Modeling: Machine learning and deep learning ensemble approaches
- Model Interpretation: Explainable AI techniques for clinical transparency
- Performance Validation: Comprehensive evaluation across multiple metrics
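The preprocessing and modeling steps above compose naturally into a scikit-learn `Pipeline`. This is a minimal sketch, not the project's actual implementation: the column names are hypothetical, and class weighting stands in for whatever imbalance handling the full framework uses:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical feature groupings; the real pipeline's columns may differ.
numeric = ["age", "avg_glucose_level", "bmi"]
categorical = ["gender", "smoking_status"]

preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("onehot", OneHotEncoder(handle_unknown="ignore"))]), categorical),
])

# class_weight="balanced" offsets the ~4.9% positive rate during training.
clf = Pipeline([("preprocess", preprocess),
                ("model", LogisticRegression(max_iter=1000, class_weight="balanced"))])

# Tiny toy dataset with missing values to exercise the imputers.
df = pd.DataFrame({
    "age": [67, 80, 49, 31, 22, 58],
    "avg_glucose_level": [228.7, np.nan, 171.2, 95.1, 89.0, 210.4],
    "bmi": [36.6, 32.5, np.nan, 28.1, 24.0, 29.9],
    "gender": ["F", "M", "F", "M", "F", "M"],
    "smoking_status": ["smokes", "never", "former", "never", "smokes", "former"],
})
clf.fit(df, [1, 1, 1, 0, 0, 1])
```

Bundling imputation, scaling, and encoding into one estimator keeps preprocessing identical between training and inference, a practical requirement for the healthcare integration goal below.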
- 🤖 Hybrid AI Approach: Combines traditional machine learning and modern deep learning models
- 🔍 Explainable Predictions: Transparent feature importance for clinical trust
- 🔄 Robust Preprocessing: Advanced handling of class imbalance and data quality
- 📊 Comprehensive Evaluation: Multi-metric assessment across diverse algorithms
- 🏥 Clinical Relevance: Domain-informed feature selection and interpretation
- ⚡ Scalable Architecture: Modular design for healthcare integration
- Programming: Python 3.8+
- Machine Learning: Scikit-learn, XGBoost, LightGBM
- Deep Learning: PyTorch-based architectures
- Explainable AI: SHAP, LIME for model interpretability
- Visualization: Matplotlib, Seaborn, Plotly
- Data Processing: Pandas, NumPy for efficient data manipulation
- Model Optimization: Advanced hyperparameter tuning techniques
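The specific tuning technique is not stated, so as one plausible example of "advanced hyperparameter tuning", here is a randomized search over a random-forest model; the parameter ranges and F1 objective are illustrative choices, not the project's actual configuration:

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Imbalanced synthetic data standing in for the stroke dataset.
X, y = make_classification(n_samples=300, weights=[0.9, 0.1], random_state=0)

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": randint(50, 200),
                         "max_depth": randint(2, 10)},
    n_iter=5,            # number of sampled configurations
    scoring="f1",        # F1 is more informative than accuracy here
    cv=3,
    random_state=0,
)
search.fit(X, y)
```

Optimizing F1 (or PR-AUC) rather than accuracy is the standard choice for rare-event problems like stroke, where accuracy alone rewards predicting the majority class.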
```
Hybrid-Stroke-Prediction/
├── src/
│   ├── data_preprocessing/    # Data cleaning and preparation pipelines
│   ├── feature_engineering/   # Feature selection and transformation
│   └── models & evaluation/   # Model implementations & performance assessment
├── config/                    # Model and experiment configurations
├── tests/                     # Comprehensive test suites
├── requirements.txt           # Project dependencies
└── README.md                  # Project documentation
```
Bridges the gap between high-performance AI models and clinical practicality through interpretable and actionable predictions.
Implementation of both traditional machine learning algorithms and modern deep learning architectures for comprehensive predictive performance.
Transparent model reasoning enabling clinical validation and trust in AI-assisted decision making.
Multi-dimensional assessment across accuracy, sensitivity, specificity, and clinical relevance metrics.
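Sensitivity and specificity are not built-in scikit-learn scorer names; both fall out of the confusion matrix, as this small sketch with made-up labels shows:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy ground truth and predictions (1 = stroke, 0 = no stroke).
y_true = np.array([1, 0, 1, 1, 0, 0, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 1, 0, 0])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # recall on stroke-positive cases
specificity = tn / (tn + fp)   # recall on stroke-negative cases
```

Clinically, sensitivity (missed strokes are costly) is usually weighted more heavily than specificity, which is why it is reported separately rather than folded into accuracy.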
- Strong Predictive Performance: Comprehensive model evaluation demonstrating reliable stroke prediction capabilities across multiple algorithms
- Clinical Interpretability: Transparent feature importance analysis aligning with medical domain knowledge
- Robust Generalization: Consistent performance across different validation strategies and data splits
- Scalable Architecture: Modular design suitable for integration with healthcare systems
Our systematic approach encompasses:
- Comprehensive Data Analysis: In-depth exploratory data analysis to understand feature distributions and relationships
- Advanced Feature Engineering: Domain-informed transformations and selection techniques
- Diverse Model Development: Implementation of multiple machine learning and deep learning approaches
- Rigorous Evaluation: Multi-faceted assessment including performance metrics and model interpretability
- Clinical Validation: Framework designed for healthcare professional review and practical application
Β© 2025 Raihan Rashid. All rights reserved.