Skip to content

UriCVV/Smoking-and-Tobacco-Data-Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Project: Smoking and Tobacco Data Analysis

This project aims to analyze smoking and tobacco-related data from two different years. The analysis includes examining the prevalence of cigarette smoking, its correlation with various factors, and comparing the impact of health warnings and treatment in health facilities on deaths caused by smoking.

Dataset / Data Source

The datasets used in this project was obtained from kaggle:

These datasets provide information on various aspects of smoking, including cigarette smoking prevalence, tobacco use, GDP, population, and more for different countries.

Currency Api

Some other data was extracted from web scraping in:

Workflow / Methodology

The project workflow and methodology include several key steps:

  1. Data Extraction and Preprocessing: Data is loaded from CSV files and preprocessed to handle missing values and format columns properly. Some columns are dropped for simplicity.

  2. Currency Conversion: Currency conversion is performed to calculate cigarette prices in USD using an API (CurrencyAPI). The API key is entered by the user.

  3. Web Scraping with Selenium: Data on deaths caused by smoking, GDP, and population is obtained by scraping web pages. Selenium is used to navigate and extract data.

  4. Merging and Data Enhancement: The scraped data is merged with the original dataset to enrich it. The dataset is further cleaned, and the order of columns is adjusted.

  5. Export Enhanced Data: The enhanced dataset is exported to a new CSV file for analysis.

Goals / Hypothesis

The main goals of this project are to:

  • Compare cigarette smoking prevalence in 2008 and 2018.
  • Analyze the correlation between smoking prevalence and various factors.
  • Investigate the impact of health warnings and treatment in health facilities on deaths caused by smoking.
  • Examine the relationship between cigarette prices and smoking prevalence.

Analysis

The analysis reveals insights into the changes in cigarette smoking prevalence over the years, correlations between smoking prevalence and GDP, as well as the impact of health warnings and treatment in health facilities on deaths by smoking. We also explore the relationship between cigarette prices and smoking prevalence.

Results

The results of the analysis are presented in several visualizations, including bar charts, scatter plots, a correlation heatmap, and a geospatial map, to provide a comprehensive view of the smoking and tobacco-related data.

Conclusion

In conclusion, this project provides valuable insights into smoking and tobacco-related data, shedding light on trends, correlations, and the impact of various factors. The analysis can be useful for policymakers, health organizations, and researchers interested in understanding and addressing smoking-related issues.

Links

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages