This project explores the prediction of sleep disorders using machine learning techniques. It now includes a multi-page Streamlit web application with dedicated sections for general users (Demo), healthcare professionals (Clinician Portal), and technical audiences (Model Evaluation). Additionally, it still supports command-line interaction via a CLI test script.
This project predicts the likelihood of a sleep disorder—Sleep Apnoea, Insomnia, or None—by processing user data (such as gender, age, BMI, sleep quality, stress level, etc.) through a trained machine learning model.
Core components include:
- Data Analysis: Exploratory data analysis (EDA) via Jupyter notebooks.
- Model Training: Training machine learning models with feature engineering and validation.
- Prediction Pipeline: Integrating custom transformers (e.g. BMI categoriser, BP classifier, encoders, scalers) with a trained model.
- CLI Interaction: A command-line test script (helper/test.py) for quick terminal-based predictions.
- Web App Interface: A multi-page Streamlit app comprising Home, Demo, Clinician Portal, and Model Evaluation pages.
- Custom Transformers:
  - BMICategoriser: categorises BMI into standard risk groups.
  - BPClassifier: classifies blood pressure into clinical categories.
- Robust Pipeline:
  - Encodes categorical variables, scales numerical features, and applies custom transformers.
  - Uses a Support Vector Classifier (SVC) with probability calibration to output prediction confidences.
- Streamlit Web Application:
  - Home Page: overview of the project and contact form.
  - Demo Page: public-friendly prediction interface.
  - Clinician Portal: doctors verify model predictions, provide corrected diagnoses, and submit feedback.
  - Model Evaluation Page: detailed performance metrics, including confusion matrix, classification report, ROC-AUC/PR-AUC plots, log loss, and SHAP summary.
- Data Collection Loop:
  - Clinicians can confirm or correct model predictions, and that feedback is appended to a Google Sheet for model improvement.
- Command-Line Test Script:
  - Quickly test predictions in the terminal using helper/test.py.
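To make the two custom transformers more concrete, here is a minimal, dependency-free sketch of what BMI and blood pressure categorisation could look like. It is not the code in transformers/: the cut-offs are assumptions (WHO adult BMI categories and the 2017 ACC/AHA blood pressure thresholds), the class names ending in `Sketch` and the row keys `bmi`, `systolic`, and `diastolic` are hypothetical, and the classes only imitate the scikit-learn fit/transform convention rather than inheriting from it.

```python
def bmi_category(bmi: float) -> str:
    """Map a BMI value to a WHO adult weight category (assumed cut-offs)."""
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25.0:
        return "Normal"
    if bmi < 30.0:
        return "Overweight"
    return "Obese"


def bp_category(systolic: int, diastolic: int) -> str:
    """Classify blood pressure with the 2017 ACC/AHA thresholds (assumed)."""
    if systolic >= 140 or diastolic >= 90:
        return "Stage 2 Hypertension"
    if systolic >= 130 or diastolic >= 80:
        return "Stage 1 Hypertension"
    if systolic >= 120:
        return "Elevated"
    return "Normal"


class BMICategoriserSketch:
    """Hypothetical stand-in that follows the fit/transform convention."""

    def fit(self, rows, y=None):
        return self  # stateless: nothing is learned from the data

    def transform(self, rows):
        # Return new dicts so the caller's rows are left untouched.
        return [{**row, "bmi_category": bmi_category(row["bmi"])} for row in rows]


class BPClassifierSketch:
    """Hypothetical stand-in that adds a blood pressure category."""

    def fit(self, rows, y=None):
        return self

    def transform(self, rows):
        return [
            {**row, "bp_category": bp_category(row["systolic"], row["diastolic"])}
            for row in rows
        ]


# Chain the two steps by hand, the way a pipeline would apply them in order.
rows = [{"bmi": 27.3, "systolic": 128, "diastolic": 78}]
for step in (BMICategoriserSketch(), BPClassifierSketch()):
    rows = step.fit(rows).transform(rows)
print(rows[0])
```

Keeping the threshold logic in plain functions makes it easy to unit-test separately from whatever pipeline wrapper the project actually uses.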
| Page | Description |
|---|---|
| Home | Introduction to the project, developer “About Me” section, and a contact form for general enquiries or feedback. |
| Demo | User-friendly interface allowing non-technical users to enter health metrics and receive a sleep disorder prediction with confidence score. |
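| Clinician Portal | Secure area for healthcare professionals only, where they can verify model predictions, provide corrected diagnoses, and submit feedback. |
| Model Evaluation | Technical dashboard showing the confusion matrix, classification report, ROC-AUC/PR-AUC plots, log loss, and SHAP summary. |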
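For readers unfamiliar with two of the metrics on the Model Evaluation page, the sketch below computes a confusion matrix and multiclass log loss by hand. It uses only the standard library and made-up toy probabilities; the project itself presumably relies on a library implementation, and the label order chosen here is an assumption.

```python
import math

# Class names come from this README; their order here is an assumption.
LABELS = ["Insomnia", "None", "Sleep Apnoea"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Count predictions: rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true, pred in zip(y_true, y_pred):
        matrix[index[true]][index[pred]] += 1
    return matrix


def log_loss(y_true, probabilities, labels=LABELS, eps=1e-15):
    """Mean negative log of the probability assigned to the true class."""
    index = {label: i for i, label in enumerate(labels)}
    total = 0.0
    for true, probs in zip(y_true, probabilities):
        # Clip to avoid log(0) when a model is confidently wrong.
        p = min(max(probs[index[true]], eps), 1 - eps)
        total -= math.log(p)
    return total / len(y_true)


# Toy data, invented purely for illustration.
y_true = ["None", "Insomnia", "Sleep Apnoea", "None"]
probs = [
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.2, 0.7, 0.1],
]
# The predicted label is the class with the highest probability.
y_pred = [LABELS[max(range(len(p)), key=p.__getitem__)] for p in probs]

matrix = confusion_matrix(y_true, y_pred)
loss = log_loss(y_true, probs)
print(matrix)
print(round(loss, 4))
```

In the toy data the one Sleep Apnoea case is predicted as None, which shows up as an off-diagonal count in the last row of the matrix.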
SleepDisorder/
├── .streamlit/
│ └── config.toml # Theme settings (light/dark)
├── helper/
│ ├── forms.py # Streamlit form for clinician feedback + Google Sheets integration
│ ├── test.py # CLI test script for terminal predictions
│ └── utils.py # UserDataCollector: input validation and conversion
├── pages/
│ ├── home.py # Home page: project overview, About Me, contact form
│ ├── demo.py # Demo page: public prediction interface
│ ├── doctors.py # Clinician Portal page: prediction confirmation and feedback
│ └── evaluation.py # Model Evaluation page: performance metrics and visualisations
├── data/
│ ├── banner_image.png # Banner image for Home page
│ ├── data.csv # Primary dataset (if needed for reference)
│ ├── notebook.ipynb # Jupyter notebook for EDA and model training
│ ├── logo.png # App logo
│ └── shap_value.png # SHAP summary plot for Model Evaluation
├── models/
│ ├── model.pkl # Trained prediction model (pickle file)
│ ├── feature_encoder.pkl # Pre-fitted feature encoder (pickle file)
│ ├── target_encoder.pkl # Pre-fitted target encoder (pickle file)
│ ├── scaler.pkl # Pre-fitted scaler (pickle file)
│ └── model_evaluation.pkl # Serialized performance metrics (pickle file)
├── transformers/
│ ├── bmi_categorizer.py # BMICategoriser transformer
│ ├── bp_classifier.py # BPClassifier transformer
│ ├── saved_encoder.py # SavedEncoderTransformer for categorical features
│ ├── saved_scaler.py # SavedScalerTransformer for numerical features
│ └── feature_correcter.py # FeatureCorrecter for cleaning raw inputs
├── app.py # Main Streamlit entry point (defines navigation and common UI elements)
├── config.py # Central configuration: file paths to images, models, etc.
├── LICENSE # MIT License
├── requirements.txt # Python package requirements
├── .gitignore # Files and folders to ignore in Git
└── README.md # This documentation file
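As an illustration of how a central configuration module can tie this layout together, the sketch below builds the asset paths from a base directory and reports any that are missing. The function names `build_paths` and `missing_files` and the dictionary keys are hypothetical; the actual contents of config.py may differ.

```python
from pathlib import Path


def build_paths(base_dir: Path) -> dict[str, Path]:
    """Build the asset paths shown in the tree above, relative to base_dir."""
    models = base_dir / "models"
    data = base_dir / "data"
    return {
        "model": models / "model.pkl",
        "feature_encoder": models / "feature_encoder.pkl",
        "target_encoder": models / "target_encoder.pkl",
        "scaler": models / "scaler.pkl",
        "model_evaluation": models / "model_evaluation.pkl",
        "logo": data / "logo.png",
        "banner_image": data / "banner_image.png",
        "shap_plot": data / "shap_value.png",
    }


def missing_files(paths: dict[str, Path]) -> list[str]:
    """Return the names of configured assets that do not exist on disk."""
    return [name for name, path in paths.items() if not path.is_file()]


# Assumes the script is run from the repository root.
paths = build_paths(Path.cwd())
for name in missing_files(paths):
    print(f"Missing {name}: {paths[name]}")
```

Running a check like this before launching the app gives a readable list of missing files instead of a stack trace from a failed load.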
- Clone the Repository
    git clone https://github.com/iamcbn/SleepDisorder.git
    cd SleepDisorder
- Create a Virtual Environment
  It is highly recommended to use a virtual environment to isolate dependencies.
  Windows
    python -m venv venv
    .\venv\Scripts\activate
  macOS/Linux
    python3 -m venv venv
    source venv/bin/activate
- Install Dependencies
    pip install -r requirements.txt
- Configure Streamlit Secrets (for Google Sheets Integration)
  If using clinician feedback storage via Google Sheets, create .streamlit/secrets.toml with your service account credentials in the following format:
    [connections.gsheets]
    type = "service_account"
    project_id = "your-project-id"
    private_key_id = "YOUR_PRIVATE_KEY_ID"
    private_key = """-----BEGIN PRIVATE KEY----- YOUR_PRIVATE_KEY_CONTENT -----END PRIVATE KEY-----"""
    client_email = "your-service-account@your-project-id.iam.gserviceaccount.com"
    client_id = "YOUR_CLIENT_ID"
    auth_uri = "https://accounts.google.com/o/oauth2/auth"
    token_uri = "https://oauth2.googleapis.com/token"
    auth_provider_x509_cert_url = "https://www.googleapis.com/oauth2/v1/certs"
    client_x509_cert_url = "https://www.googleapis.com/robot/v1/metadata/x509/your-service-account%40your-project-id.iam.gserviceaccount.com"
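A quick way to catch an incomplete secrets file is to check that every key from the format above is present. The sketch below does this on an already-parsed dictionary; the helper name `missing_gsheets_keys` is hypothetical and not part of the project. On Python 3.11 or later, the standard-library `tomllib.load` (given a file opened in binary mode) can produce such a dictionary from .streamlit/secrets.toml.

```python
# Keys taken from the [connections.gsheets] format shown above.
REQUIRED_KEYS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "client_id",
    "auth_uri",
    "token_uri",
    "auth_provider_x509_cert_url",
    "client_x509_cert_url",
)


def missing_gsheets_keys(secrets: dict) -> list[str]:
    """Return required keys that are absent or empty in [connections.gsheets]."""
    section = secrets.get("connections", {}).get("gsheets", {})
    return [key for key in REQUIRED_KEYS if not section.get(key)]


# Example with a deliberately incomplete, made-up configuration.
example = {
    "connections": {
        "gsheets": {"type": "service_account", "project_id": "your-project-id"}
    }
}
print(missing_gsheets_keys(example))
```

This only checks that the keys exist; it does not verify that the credentials are valid or that the service account can reach the sheet.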
To launch the multi-page web application, run:
    streamlit run app.py
Then open the URL printed in your terminal (e.g. http://localhost:8501) in your browser.
For quick terminal-based predictions without the web interface:
- Navigate to the helper folder:
    cd helper
- Run the CLI script:
    python test.py
This will prompt you for the required inputs, run the prediction pipeline, and display the predicted sleep disorder in the terminal.
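The prompt-and-validate loop behind a script like this can be sketched as follows. This is not the code in helper/test.py or helper/utils.py: the function names, the 1 to 120 age range, and the `systolic/diastolic` input format are all assumptions made for illustration.

```python
def parse_int_in_range(raw: str, low: int, high: int) -> int:
    """Convert text to an int and check it lies within [low, high]."""
    value = int(raw.strip())  # raises ValueError for non-numeric text
    if not low <= value <= high:
        raise ValueError(f"expected a value between {low} and {high}, got {value}")
    return value


def parse_blood_pressure(raw: str) -> tuple[int, int]:
    """Parse text such as '120/80' into (systolic, diastolic)."""
    parts = raw.strip().split("/")
    if len(parts) != 2:
        raise ValueError("expected the form systolic/diastolic, e.g. 120/80")
    systolic, diastolic = int(parts[0]), int(parts[1])
    if systolic <= diastolic:
        raise ValueError("systolic pressure should exceed diastolic pressure")
    return systolic, diastolic


def prompt_until_valid(question, parser, ask=input, max_attempts=3):
    """Ask a question repeatedly until the parser accepts the answer."""
    for _ in range(max_attempts):
        try:
            return parser(ask(question))
        except ValueError as error:
            print(f"Invalid input: {error}")
    raise SystemExit("Too many invalid attempts.")


# Simulated answers stand in for keyboard input so the example runs unattended;
# a real script would keep the default ask=input.
answers = iter(["abc", "34"])
age = prompt_until_valid(
    "Age: ",
    lambda raw: parse_int_in_range(raw, 1, 120),
    ask=lambda question: next(answers),
)
print(age)
```

Passing the `ask` callable in as a parameter keeps the parsing logic testable without a terminal.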
The Streamlit app is hosted on Streamlit Community Cloud. To deploy your own copy:
- Push your repository to GitHub.
- Sign in to Streamlit Cloud.
- Create a new app, selecting your GitHub repo and setting app.py as the entrypoint.
- Ensure the following files are present in the GitHub repo for successful deployment:
  - app.py
  - requirements.txt
🔗 Live App: sleepdisorder-iamcbn.streamlit.app
- App crashes on start
  - Confirm that the paths in config.py are correct and match your folder structure.
- Missing Packages
  - If you encounter ModuleNotFoundError, ensure you’ve run pip install -r requirements.txt in your virtual environment.
- Secret or Google Sheets Errors
  - Double-check your .streamlit/secrets.toml formatting (proper TOML syntax).
  - Make sure the service account email has edit access to the target Google Sheet.
Training a model is not enough: it must also be evaluated on real-world data to understand how it performs. If you are a clinician and would like to test the model, please contact me.
In the next phase, I plan to make this project accessible by deploying an API. The envisioned steps include:
- Backend with FastAPI: Develop a RESTful API that takes JSON input, validates and preprocesses data, runs predictions, and returns results.
- Deployment: Host the API.
Implementing this API will also provide hands-on experience with modern deployment practices.
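The request flow described above (take JSON, validate, predict, return a result) can be sketched without committing to a framework. The example below is not FastAPI code and not part of the project: `handle_predict` is a hypothetical plain function that a web route could wrap, the field names and types are guesses based on the inputs this README mentions, and `stub_predict` is a placeholder for the trained pipeline.

```python
import json

# Illustrative field names and types; the real API schema is not defined yet.
REQUIRED_FIELDS = {
    "gender": str,
    "age": int,
    "bmi": float,
    "sleep_quality": int,
    "stress_level": int,
}


def _has_type(value, expected) -> bool:
    """Check a JSON value's type, rejecting booleans where numbers are expected."""
    if isinstance(value, bool):
        return False
    if expected is float:
        return isinstance(value, (int, float))
    return isinstance(value, expected)


def stub_predict(features: dict) -> tuple[str, float]:
    """Placeholder for the trained pipeline; always returns the same answer."""
    return "None", 0.5


def handle_predict(body: str, predict=stub_predict) -> tuple[int, str]:
    """Return (HTTP status code, JSON response body) for a prediction request."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400, json.dumps({"error": "request body is not valid JSON"})
    if not isinstance(payload, dict):
        return 400, json.dumps({"error": "request body must be a JSON object"})

    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in payload:
            errors.append(f"missing field: {name}")
        elif not _has_type(payload[name], expected):
            errors.append(f"wrong type for field: {name}")
    if errors:
        return 422, json.dumps({"errors": errors})

    label, confidence = predict(payload)
    return 200, json.dumps({"prediction": label, "confidence": confidence})


status, response = handle_predict(
    '{"gender": "Male", "age": 34, "bmi": 27.3, "sleep_quality": 6, "stress_level": 7}'
)
print(status, response)
```

A framework such as FastAPI would normally take over the parsing and validation steps through a declared request model, leaving only the call to the prediction pipeline in the route body.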
Contributions, suggestions, and improvements are welcome! To contribute:
- Fork the repository.
- Create a new branch for your feature or bugfix.
- Commit your changes with clear messages.
- Submit a pull request and describe your modifications or enhancements.
Please adhere to the existing code style and add appropriate tests or documentation where necessary.
This project is licensed under the MIT License. See the LICENSE file for full details.
Made with 💚 by @iamcbn
