Depression Detection Using Behavioral Data and PHQ4 Score

This repository contains a machine learning project that predicts depression levels from a combination of activity data, sleep data, mobile data, and survey results (specifically the PHQ-4 test). The project covers data preprocessing, exploratory data analysis (EDA), feature engineering, dimensionality reduction with PCA, and model application.

The present repository contains the solutions to the FDS Final Project for the year 2024/2025.

Collaborators (Group 30):

Emre Yesil (1emreyesil)
Recep Yılmaz (Rezeb)
Nihal Yaman Yılmaz (Nihal-yaman)

Files Overview

main.ipynb: This is the main notebook containing the solutions to the project, along with a command to install the required packages.
dataset (Folder): This folder contains the dataset CSV files used in our project.
main.rar: A compressed copy of main.ipynb. The notebook is large, so use this archive if you have trouble downloading it.

Project Abstract: Depression Detection Using Behavioral Data and PHQ4 Score

The project is organized as follows:

Data Preprocessing
In this step, data cleaning and feature selection were performed. Features that showed the least correlation with depression scores were removed based on correlation matrix analysis and OLS (Ordinary Least Squares) reports. This helped improve the model's efficiency and focus on relevant features.
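The correlation-based filtering described above can be sketched as follows. This is a minimal, dependency-free illustration and not the notebook's actual code: the column names, the toy values, and the 0.5 threshold are all assumptions made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def select_features(columns, target, min_abs_corr=0.1):
    """Keep features whose |correlation| with the target reaches the threshold."""
    kept = {}
    for name, values in columns.items():
        r = pearson(values, target)
        if abs(r) >= min_abs_corr:
            kept[name] = r
    return kept


# Hypothetical toy data: names and values are illustrative only.
columns = {
    "sleep_hours": [8, 7, 6, 5, 4, 3],
    "step_count": [9000, 4000, 8000, 3000, 7000, 5000],
    "constant_flag": [1, 1, 1, 1, 1, 1],
}
depression_score = [0, 1, 2, 3, 4, 5]

selected = select_features(columns, depression_score, min_abs_corr=0.5)
print(selected)
```

In this toy data only sleep_hours survives the filter; the weakly correlated and constant columns are dropped. A suitable threshold depends on the dataset and would normally be chosen together with the OLS report rather than fixed in advance.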
Exploratory Data Analysis (EDA)
EDA was performed to understand the distribution of the features, identify any missing values, and understand the relationships between the features and the target variable (depression score). Various visualizations, including histograms, box plots, and correlation heatmaps, were used to explore the dataset.
Feature Engineering
New features were derived from the existing data to better capture the patterns related to depression. Additionally, feature scaling techniques were applied to normalize the data, ensuring that the features are comparable in terms of scale.
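As an illustration of feature scaling, the sketch below applies z-score standardization in plain Python. The README does not state which scaling technique the notebook uses, so z-scoring is an assumption here, and the feature names and values are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - mu) / sigma for v in values]


# Hypothetical features measured on very different scales.
sleep_hours = [4.0, 6.0, 8.0]
step_count = [2000.0, 6000.0, 10000.0]

scaled_sleep = standardize(sleep_hours)
scaled_steps = standardize(step_count)
print(scaled_sleep)
print(scaled_steps)
```

After scaling, both features take the same values even though the raw step counts are three orders of magnitude larger, which is what makes them comparable for distance-based or variance-based methods.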
Principal Component Analysis (PCA)
PCA was applied to reduce the dimensionality of the dataset, making it easier to visualize and interpret the data. This step helped to identify the most important components that contribute to the variance in the data.
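A minimal sketch of PCA, written with NumPy's SVD so the mechanics are visible. It is not the notebook's implementation; the synthetic data, the random seed, and the choice of two components are assumptions for the example.

```python
import numpy as np


def pca(X, n_components):
    """Project rows of X onto the top principal components.

    Returns (projected data, explained variance ratio of the kept components).
    """
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)  # PCA requires mean-centered data
    # SVD of the centered data: the rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained_variance = S ** 2 / (X.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()
    projected = X_centered @ Vt[:n_components].T
    return projected, ratio[:n_components]


# Hypothetical data: two strongly correlated features plus one noisy one.
rng = np.random.default_rng(0)
base = rng.normal(size=200)
X = np.column_stack([
    base,
    2.0 * base + rng.normal(scale=0.1, size=200),
    rng.normal(scale=0.1, size=200),
])

projected, ratio = pca(X, n_components=2)
print(projected.shape)
print(ratio)
```

Because the first two columns are nearly collinear, the first component captures almost all of the variance. The explained variance ratio is the usual guide for deciding how many components to keep.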
Model Application
Several machine learning models were applied to predict depression scores. To handle class imbalance, SMOTE (Synthetic Minority Over-sampling Technique) was used to oversample the minority class. Hyperparameter tuning with grid search was also conducted to optimize model performance, focusing on the learning rate and the number of leaves for tree-based models.
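The core idea behind SMOTE is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The sketch below is a simplified, plain-Python illustration of that idea and not the library implementation the notebook may rely on; the function name smote_like, the sample points, and the parameters are hypothetical.

```python
import random
from math import dist


def smote_like(minority, n_synthetic, k=3, seed=0):
    """Create synthetic minority samples by interpolating between neighbours.

    Simplified illustration of the SMOTE idea, not a library implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        point = rng.choice(minority)
        # k nearest minority neighbours of the chosen point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not point),
            key=lambda p: dist(p, point),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment point -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(point, neighbour))
        )
    return synthetic


# Hypothetical 2-D minority-class samples.
minority = [(1.0, 1.0), (1.5, 1.2), (2.0, 0.8), (1.2, 1.9)]
new_points = smote_like(minority, n_synthetic=6)
print(len(new_points))
```

Oversampling of this kind is normally applied to the training split only, so that the evaluation data keeps the original class distribution.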
Conclusion
The models were evaluated using different performance metrics, including accuracy, precision, recall, and F1-score. The results indicated that the feature engineering and model selection steps significantly impacted model performance. Further improvements could be made by enhancing the dataset with additional features and applying more advanced algorithms such as deep learning.
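For reference, the four metrics named above can be computed from the confusion counts as in this sketch. The labels and predictions are made up for the example and are not results from the project.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With imbalanced classes, accuracy alone can look good even when the minority class is poorly predicted, which is why precision, recall, and F1 are reported alongside it.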

PHQ-4 Test

The PHQ-4 (Patient Health Questionnaire-4) is a four-item screening tool designed to assess the severity of depression and anxiety. The test consists of the following questions:

Little interest or pleasure in doing things?
Feeling down, depressed, or hopeless?
Feeling nervous, anxious, or on edge?
Not being able to stop or control worrying?

These questions are used to calculate the PHQ-4 score. In this project, the first two questions were used to calculate the depression score, which is the key target variable for model training and evaluation.
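A small sketch of how the scores can be derived, assuming the usual PHQ-4 convention in which each answer is rated from 0 (not at all) to 3 (nearly every day) and subscores are simple sums. The function name phq4_scores and the example answers are hypothetical, and the notebook may compute the target differently.

```python
def phq4_scores(answers):
    """Compute PHQ-4 subscores.

    answers: four integers from 0 to 3, in the order the questions are
    listed above (two depression items first, then two anxiety items).
    """
    if len(answers) != 4 or any(a not in (0, 1, 2, 3) for a in answers):
        raise ValueError("expected four answers, each an integer from 0 to 3")
    depression = answers[0] + answers[1]  # first two questions
    anxiety = answers[2] + answers[3]     # last two questions
    return {
        "depression": depression,
        "anxiety": anxiety,
        "total": depression + anxiety,
    }


# Hypothetical set of answers for one survey response.
scores = phq4_scores([2, 1, 0, 3])
print(scores)
```

Under this convention the depression score ranges from 0 to 6 and the total PHQ-4 score from 0 to 12.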

The CSV files we used are available in the dataset folder; our main dataset was the College Experience Study Dataset. Thanks for the collaboration.

