This repository provides a comprehensive analysis of factors influencing life expectancy across 193 countries from 2000 to 2015. The project utilizes data from the World Health Organization (WHO) and the United Nations, focusing on critical health, economic, and social indicators to derive actionable insights for policy-making and improving global life expectancy.
This analysis aims to address gaps in previous research, particularly the lack of consideration for immunization coverage and the Human Development Index (HDI). By incorporating these factors, the project uses two regression approaches, mixed-effects models and multiple linear regression, to better understand the relationships between various determinants and life expectancy. The insights gained from this study can help countries prioritize key areas for improving the health and longevity of their populations.
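As a rough illustration of the multiple linear regression mentioned above, here is a minimal sketch using NumPy's least-squares solver. It is not the project's actual modeling code. The column names (`schooling`, `adult_mortality`, `life_expectancy`), the helper `fit_linear_regression`, and the tiny demo table are hypothetical, and the demo values are synthetic, built from known coefficients so the fit can be checked by eye.

```python
import numpy as np
import pandas as pd


def fit_linear_regression(df, target, predictors):
    """Fit ordinary least squares and return coefficients keyed by term name.

    Hypothetical helper for illustration; not part of the repository.
    """
    X = df[predictors].to_numpy(dtype=float)
    # Prepend a column of ones so the model includes an intercept.
    X = np.column_stack([np.ones(len(X)), X])
    y = df[target].to_numpy(dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return pd.Series(coef, index=["intercept", *predictors])


# Synthetic demo data (NOT from the WHO dataset): the target is generated
# from fixed coefficients without noise, so the fit should recover them.
demo = pd.DataFrame({
    "schooling": [8.0, 10.0, 12.0, 14.0, 9.0, 13.0],
    "adult_mortality": [300.0, 250.0, 150.0, 80.0, 200.0, 120.0],
})
demo["life_expectancy"] = (
    50.0 + 2.0 * demo["schooling"] - 0.03 * demo["adult_mortality"]
)

coefs = fit_linear_regression(
    demo, "life_expectancy", ["schooling", "adult_mortality"]
)
print(coefs.round(3))
```

A mixed-effects model would additionally let each country have its own baseline (for example, a random intercept per country). That needs a dedicated statistics library and is not shown here.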
The dataset contains 2,938 rows and 22 columns, covering 193 countries from 2000 to 2015. Predictor variables are categorized as follows:
- Immunization Factors: Coverage for Hepatitis B, Polio, and Diphtheria
- Mortality Factors: Adult Mortality, Infant Deaths, and Under-Five Deaths
- Economic Factors: GDP and Percentage Expenditure
- Social Factors: Years of Schooling, Population Size, and Prevalence of Thinness
https://drive.google.com/file/d/1s-EXNXC6rPiD5962TjSzmyaT3ADw_QJB/view?usp=sharing
- Life Expectancy: Average lifespan in years
- Adult Mortality: Probability of death between the ages of 15 and 60 (per 1,000 population)
- Alcohol Consumption: Annual per capita alcohol consumption (liters of pure alcohol)
- BMI: Average Body Mass Index
- Immunization Coverage: Percentage coverage for Hepatitis B, Polio, and Diphtheria
- Income Composition of Resources: Proxy measure for income-related factors
- Years of Schooling: Average education level
- Data Cleaning: Performed in Power Query Editor.
- Exploratory Data Analysis (EDA): Conducted in Python, with additional visualization and analysis in Power BI.
- Visualization: Developed an interactive Power BI dashboard to provide actionable insights and trends.
- Correlation Analysis: Generated a Python-based correlation matrix to evaluate relationships between key variables.
- Dynamic Filters: Users can filter data by year, region, and specific indicators.
- Custom Visuals: Advanced Power BI visualizations offer deep insights into life expectancy trends.
- Interactive Storytelling: Highlights relationships between health, education, and socio-economic indicators.
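The Python-based exploratory analysis mentioned in the workflow above might begin with checks like the following. This is a minimal sketch, not the project's actual script. The helpers `missing_share` and `yearly_mean`, the column names, and the four-row demo table are hypothetical and contain made-up values, not WHO data.

```python
import numpy as np
import pandas as pd


def missing_share(df):
    """Fraction of missing values per column, largest first."""
    return df.isna().mean().sort_values(ascending=False)


def yearly_mean(df, value_col, year_col="year"):
    """Mean of value_col per year; missing values are skipped."""
    return df.groupby(year_col)[value_col].mean()


# Made-up rows for illustration only (hypothetical column names).
demo = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "year": [2000, 2001, 2000, 2001],
    "life_expectancy": [60.0, 61.0, 70.0, np.nan],
    "hepatitis_b": [np.nan, np.nan, 90.0, 92.0],
})

missing = missing_share(demo)
trend = yearly_mean(demo, "life_expectancy")
print(missing)
print(trend)
```

Checking the share of missing values per column first helps decide which indicators need attention during cleaning before they are used in correlation or regression work.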
Below is a screenshot of the interactive Power BI dashboard developed for the project:
The correlation matrix evaluates relationships between critical variables such as:
- Alcohol consumption
- Immunization rates (Hepatitis B, Polio, Diphtheria)
- HIV/AIDS prevalence
- Measles cases
- Under-Five Deaths
- Years
- Schooling
- Life Expectancy
The heatmap below illustrates the relationships between key variables, such as immunization rates, life expectancy, and mortality factors:
- seaborn
- matplotlib
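A correlation heatmap of the kind described above can be sketched with seaborn and matplotlib roughly as follows. This is an illustrative example under stated assumptions, not the repository's script. The function `plot_correlation_heatmap`, the column names, and the demo values are hypothetical, and the data is synthetic.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_correlation_heatmap(df, columns, out_path=None):
    """Compute Pearson correlations for the given columns and draw a heatmap.

    Returns the correlation matrix; saves the figure only if out_path is set.
    """
    corr = df[columns].corr()  # pandas uses Pearson correlation by default
    fig, ax = plt.subplots(figsize=(6, 5))
    sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm",
                vmin=-1, vmax=1, ax=ax)
    ax.set_title("Correlation matrix")
    fig.tight_layout()
    if out_path is not None:
        fig.savefig(out_path)
    plt.close(fig)
    return corr


# Synthetic values for illustration only (hypothetical column names).
demo = pd.DataFrame({
    "life_expectancy": [55.0, 60.0, 65.0, 70.0, 75.0],
    "schooling": [6.0, 8.0, 10.0, 12.0, 14.0],
    "adult_mortality": [320.0, 260.0, 210.0, 150.0, 90.0],
})

corr = plot_correlation_heatmap(demo, list(demo.columns))
print(corr.round(2))
```

Fixing the color scale to the range -1 to 1 keeps heatmaps comparable across different subsets of variables.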
- Which factors significantly impact life expectancy?
- Should countries with lower life expectancy invest more in healthcare?
- How do infant and adult mortality rates influence life expectancy?
- What is the relationship between lifestyle habits (e.g., alcohol consumption) and life expectancy?
- What role does education play in determining life expectancy?
- Do densely populated countries have lower life expectancy?
- How does immunization coverage influence life expectancy?
Data used in this analysis was collected from:
- World Health Organization (WHO)
- United Nations Data Repository
Special thanks to Deeksha Russell and Duan Wang for assistance with data collection and preparation.
- Clone the repository: `git clone https://github.com/username/life-expectancy-analysis.git`
- Run the Python script to generate a correlation matrix.
- Access the Power BI dashboard for an interactive exploration of the dataset.
- Explore the README for detailed project documentation.
- Expand the analysis to include recent data for further insights.
- Explore machine learning techniques for advanced predictive modeling.
- Develop additional visualizations for enhanced storytelling.
This project is licensed under the MIT License. Refer to the LICENSE file for more details.

