This project analyzes the key factors influencing happiness across countries from 2015 to 2019 using data from the World Happiness Report. By examining economic, social, and health-related indicators, we aim to understand:
- The main determinants of happiness across nations.
- How these determinants change over time.
- The countries with the highest and lowest happiness scores, and what influences them.
- The correlation between economic factors (such as GDP per capita) and happiness compared to health, freedom, and social support.
This analysis provides insights that can guide policymakers in improving societal well-being.
- What are the main factors influencing happiness across countries?
- How have these factors changed over the years (2015-2019)?
- Which countries have the highest and lowest happiness scores, and why?
- How strong is the correlation between economic factors (GDP per capita) and happiness compared to other variables like health, freedom, and social support?
We use the World Happiness Report datasets from Kaggle:
World Happiness Report on Kaggle
- Country: Name of the country
- Region: Geographical region (for some years)
- Happiness Rank: The ranking of happiness by country
- Happiness Score: The overall happiness score
- Economy (GDP per Capita): Economic prosperity measure
- Family: Social support measure
- Health (Life Expectancy): Life expectancy metric
- Freedom: Freedom to make life choices
- Trust (Government Corruption): Perceived corruption in government
- Generosity: Donations and volunteering behavior
/happiness-analysis
├── /data/ # Raw and processed datasets
│ ├── 2015.csv
│ ├── 2016.csv
│ ├── 2017.csv
│ ├── 2018.csv
│ └── 2019.csv
├── /src/ # Python scripts for analysis
│ ├── data_cleaning.py # Handles missing values & fixes inconsistencies
│ ├── data_merging.py # Merges datasets into one DataFrame
│ ├── exploratory_analysis.py # Exploratory data analysis (EDA)
│ ├── correlation_analysis.py # Correlation analysis between factors
│ └── regression_model.py # Regression model predicting happiness scores
├── /notebooks/ # Jupyter Notebooks
│ └── happiness_analysis.ipynb # Full Jupyter notebook report
├── /docs/ # Visualizations
│ ├── correlation_matrix.png # Heatmap visualization
│ ├── actual_vs_predicted.png # Scatter plot of actual vs predicted scores
│ └── happiness_distribution.png # Boxplot of happiness scores
├── .gitignore # Ignore unnecessary files
├── README.md # Documentation
├── LICENSE # Open-source license (optional)
├── requirements.txt # Dependencies
└── Makefile # Automate script execution
This heatmap visualizes the relationships between key happiness indicators.
A scatter plot comparing the actual and predicted happiness scores from the regression model.
This boxplot shows how happiness scores have changed across the years.
git clone https://github.com/Melasery/happiness-analysis.git
cd happiness-analysis
pip install -r requirements.txt
Execute the scripts in order:
# 1. Data Cleaning
python src/data_cleaning.py
# 2. Merging Datasets
python src/data_merging.py
# 3. Exploratory Analysis
python src/exploratory_analysis.py
# 4. Correlation Analysis
python src/correlation_analysis.py
# 5. Regression Model
python src/regression_model.py
- Checked for missing values and handled them appropriately.
- Standardized column names across datasets.
- Saved a cleaned dataset for consistency.
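The cleaning steps above could be sketched roughly as follows. This is an illustration, not the contents of src/data_cleaning.py: the `COLUMN_MAP` mapping, the `clean` helper, and the median-fill strategy are all assumptions, and the sample values are made up.

```python
import pandas as pd

# Hypothetical mapping from year-specific headers to one shared schema.
COLUMN_MAP = {
    "Country or region": "Country",
    "Score": "Happiness Score",
    "GDP per capita": "Economy (GDP per Capita)",
    "Social support": "Family",
    "Healthy life expectancy": "Health (Life Expectancy)",
    "Freedom to make life choices": "Freedom",
    "Perceptions of corruption": "Trust (Government Corruption)",
}


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Rename columns to the shared schema and fill numeric gaps."""
    out = df.rename(columns=COLUMN_MAP)
    numeric_cols = out.select_dtypes(include="number").columns
    # Assumption: missing numeric values are filled with the column median.
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].median())
    return out


# Tiny in-memory stand-in for one raw yearly file (values are made up).
raw = pd.DataFrame({
    "Country or region": ["A", "B", "C"],
    "Score": [7.0, 5.0, 3.0],
    "Perceptions of corruption": [0.4, None, 0.2],
})
cleaned = clean(raw)

# A real script would then persist the result, for example:
# cleaned.to_csv("data/cleaned_2019.csv", index=False)  # hypothetical path
```

Median filling is only one option; dropping incomplete rows is equally defensible when few values are missing.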
- Combined data from 2015 to 2019 into a single DataFrame.
- Ensured consistency across different years.
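One way to combine the yearly files is sketched below. The `merge_years` helper and the added `Year` column are assumptions for illustration; the actual src/data_merging.py may differ. Small in-memory frames with made-up values stand in for the CSV files.

```python
import pandas as pd


def merge_years(frames: dict[int, pd.DataFrame]) -> pd.DataFrame:
    """Stack yearly frames into one DataFrame and record each row's year."""
    tagged = [df.assign(Year=year) for year, df in sorted(frames.items())]
    # join="inner" keeps only the columns present in every year,
    # which sidesteps columns such as Region that exist for some years only.
    return pd.concat(tagged, join="inner", ignore_index=True)


# Stand-ins for cleaned yearly data; in practice each frame might come from
# pd.read_csv("data/2015.csv"), pd.read_csv("data/2016.csv"), and so on.
frames = {
    2015: pd.DataFrame({
        "Country": ["A", "B"],
        "Happiness Score": [7.1, 4.2],
        "Region": ["X", "Y"],
    }),
    2016: pd.DataFrame({
        "Country": ["A", "B"],
        "Happiness Score": [7.0, 4.5],
    }),
}
merged = merge_years(frames)
```

Using an inner join trades completeness for consistency: year-specific columns are dropped instead of being padded with missing values.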
- Descriptive statistics (.describe())
- Happiness score distribution visualizations
- Identification of trends over time.
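As a rough sketch of the exploratory step, the snippet below computes summary statistics and a per-year mean on a small made-up frame. The `Year` column name is an assumption about the merged dataset, and the numbers are placeholders, not project results.

```python
import pandas as pd

# Made-up stand-in for the merged 2015-2019 dataset.
merged = pd.DataFrame({
    "Year": [2015, 2015, 2016, 2016],
    "Happiness Score": [7.0, 4.0, 7.2, 4.4],
})

# Summary statistics (count, mean, std, quartiles) for the whole sample.
summary = merged["Happiness Score"].describe()

# Mean score per year, a simple first look at trends over time.
yearly_mean = merged.groupby("Year")["Happiness Score"].mean()
```

The same `groupby` result can feed a per-year boxplot or line chart in whichever plotting library the project uses.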
- Calculated the correlation matrix to measure relationships.
- Generated a heatmap to visualize strong/weak correlations.
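The correlation step might look like the sketch below, which uses pandas' default Pearson correlation on a small made-up frame. The values are placeholders and the ranking it produces says nothing about the real data.

```python
import pandas as pd

# Made-up stand-in for the merged dataset (numeric columns only).
df = pd.DataFrame({
    "Happiness Score": [7.5, 6.0, 5.0, 3.5],
    "Economy (GDP per Capita)": [1.4, 1.1, 0.9, 0.3],
    "Generosity": [0.2, 0.3, 0.1, 0.25],
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()

# Rank the other indicators by their correlation with the happiness score.
ranked = (
    corr["Happiness Score"]
    .drop("Happiness Score")
    .sort_values(ascending=False)
)
```

A heatmap like docs/correlation_matrix.png could then be drawn from `corr`, for instance with seaborn's `heatmap` function, assuming seaborn is among the project's dependencies.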
- Built a linear regression model to predict happiness scores.
- Evaluated performance using:
- Mean Squared Error (MSE)
- R-squared (R²) Score
- Interpreted feature importance from regression coefficients.
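To make the two metrics concrete, here is a minimal sketch of ordinary least squares with MSE and R² computed by hand. It uses NumPy only as a stand-in: the project's regression_model.py may well use a different library, and the helper names and synthetic data are assumptions for illustration.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns (intercept, coefs)."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[0], beta[1:]


def mse(y_true, y_pred):
    """Mean of the squared prediction errors."""
    return float(np.mean((y_true - y_pred) ** 2))


def r2(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic data (not the happiness dataset):
# score = 2 + 3 * feature_0 + 1 * feature_1 + small noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))
y = 2.0 + 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0, 0.1, size=200)

intercept, coefs = fit_linear(X, y)
pred = intercept + X @ coefs

print(f"MSE={mse(y, pred):.4f}  R2={r2(y, pred):.3f}  coefs={coefs}")
```

Coefficients are only comparable as a measure of feature importance when the features are on similar scales, so standardizing the inputs first is worth considering. Evaluating on held-out data gives a more honest estimate than scoring on the training set as this sketch does.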
- GDP per capita, social support, and health were the top predictors of happiness.
- Trust in government had a weaker influence.
- Happiness scores showed regional stability but slight shifts in ranking.
- Some nations showed economic-driven increases/decreases in happiness.
- Our regression model achieved an R² score of 0.85, meaning it explains about 85% of the variance in happiness scores, which indicates strong predictive ability.
Marouan El-Asery


