Web scraper implemented in Python for collecting data on coffee.

- coffeescraper.py and coffeescraper2.py - files with Python code to scrape data from the chosen retailers;
- data_cleaning.py - file with Python code to clean the scraped data, including handling of missing values, outliers, data formatting, etc.;
- data - folder with datasets in JSON format;
- data/c1 - folder with pictures scraped in the process but not used in further processing.
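
As an illustration of the kind of cleaning attributed to data_cleaning.py above (missing values, outliers, data formatting), here is a minimal sketch. It is not the project's actual code: the record layout, the `£` price format, the helper names, and the "10 times the median" outlier rule are all assumptions made for the example.

```python
import re
import statistics

# Hypothetical raw records, shaped the way scraped output might look.
# These are made-up values, not data from the project.
RAW = [
    {"name": "House Blend", "price": "£7.50", "weight": "250g", "roast": "Medium "},
    {"name": "Espresso", "price": "£24.00", "weight": "1kg", "roast": "dark"},
    {"name": "Decaf", "price": None, "weight": "250 g", "roast": "LIGHT"},
    {"name": "Mistake", "price": "£750.00", "weight": "250g", "roast": "medium"},
]


def parse_price(text):
    """Return the price as a float, or None if it is missing or unparseable."""
    if not text:
        return None
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else None


def parse_weight_grams(text):
    """Return the weight in grams, accepting 'g' and 'kg' units."""
    if not text:
        return None
    match = re.search(r"(\d+(?:\.\d+)?)\s*(kg|g)\b", text.lower())
    if not match:
        return None
    value, unit = float(match.group(1)), match.group(2)
    return value * 1000 if unit == "kg" else value


def clean_records(records, outlier_factor=10):
    """Format fields, drop incomplete rows, then drop price outliers."""
    parsed = []
    for rec in records:
        price = parse_price(rec.get("price"))
        weight = parse_weight_grams(rec.get("weight"))
        if price is None or weight is None:
            continue  # missing values: skip the record (assumed policy)
        parsed.append({
            "name": rec["name"],
            "price": price,
            "weight_g": weight,
            "roast": (rec.get("roast") or "").strip().lower(),
            "price_per_100g": price * 100 / weight,
        })
    if not parsed:
        return []
    # Assumed outlier rule: drop rows priced above outlier_factor x the median.
    median = statistics.median(r["price_per_100g"] for r in parsed)
    return [r for r in parsed if r["price_per_100g"] <= outlier_factor * median]


if __name__ == "__main__":
    for row in clean_records(RAW):
        print(row)
```

Normalising everything to a price per 100 g makes products of different pack sizes comparable before any outlier check is applied; the threshold itself would need tuning against real data.
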

The scraper collects product data from online coffee retailers, such as price, weight, grind, and roast, in order to build a data set for statistical modelling of coffee brewing methods and pricing strategy. The project was developed as part of the AiCore fellowship.
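
To make the collected fields concrete, the following sketch pulls price, weight, grind, and roast out of a product snippet and serialises the result as JSON. It is only an illustration using the Python standard library: the HTML markup, the class names, and the `ProductParser` and `extract_product` names are hypothetical, and the real scrapers may use different tools and selectors.

```python
import json
from html.parser import HTMLParser

# Hypothetical product markup; real retailer pages will differ.
SAMPLE_HTML = """
<div class="product">
  <h2 class="name">House Blend</h2>
  <span class="price">£7.50</span>
  <span class="weight">250g</span>
  <span class="grind">Whole bean</span>
  <span class="roast">Medium</span>
</div>
"""

# Assumed CSS class names, one per field of interest.
FIELDS = {"name", "price", "weight", "grind", "roast"}


class ProductParser(HTMLParser):
    """Collect the text of elements whose class matches a known field."""

    def __init__(self):
        super().__init__()
        self.record = {}
        self._current = None  # field whose text we are currently inside

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls in FIELDS:
                self._current = cls

    def handle_data(self, data):
        if self._current and data.strip():
            self.record[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


def extract_product(html):
    """Return a dict of the fields found in one product snippet."""
    parser = ProductParser()
    parser.feed(html)
    return parser.record


if __name__ == "__main__":
    record = extract_product(SAMPLE_HTML)
    # ensure_ascii=False keeps the currency symbol readable in the JSON output.
    print(json.dumps(record, ensure_ascii=False, indent=2))
```

Keeping the raw strings at this stage and deferring type conversion to a separate cleaning step means a change in a retailer's formatting only affects the cleaning code.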