[Paula Iglesias, Pau Navarro & Marta Palleiro]
[Data Analytics, Barcelona, June 2020]
We analyze the available air pollution data for the first half of 2018 together with public transport data for Barcelona, and try to understand the relationship between the two.
Our main hypothesis is that the availability of public transport options has a positive influence on air pollution in a district. We also look at which districts have the most pollution, what the air quality is in each district, how dense the public transport network is, and which types of transport are available in each district.
We used the Transportation datasets "public-transport.csv" and "bus-stops.csv". We also used the Urban Environment dataset "air_quality_2018_project2.csv". All of these datasets can be found in this repository.
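As a minimal sketch of how these files might be loaded with pandas: the `data` folder name and the `load_datasets` helper are assumptions made for illustration, not something defined in this repository. Only the three file names come from the list above.

```python
from pathlib import Path

import pandas as pd

# File names as listed above; the short keys are our own labels.
DATASET_FILES = {
    "transport": "public-transport.csv",
    "bus_stops": "bus-stops.csv",
    "air_quality": "air_quality_2018_project2.csv",
}


def load_datasets(data_dir="data"):
    """Read every listed dataset found in data_dir into a DataFrame.

    data_dir is a hypothetical location; point it at wherever the CSV
    files actually live. Missing files are skipped, not treated as errors.
    """
    frames = {}
    for name, filename in DATASET_FILES.items():
        path = Path(data_dir) / filename
        if path.exists():
            frames[name] = pd.read_csv(path)
    return frames
```

Calling `load_datasets()` returns a dictionary keyed by the short labels, so a missing file shows up as a missing key instead of stopping the whole load.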
- Fork repository
- Clone the repository from GitHub
- Create a .gitignore file
- Select a topic
- Research the topic
- Brainstorm questions
- Formulate a hypothesis
- Define the structure to follow for the questions
- Import datasets
- Clean datasets
- Perform analysis
- Reach conclusions
- Prepare complementary documents
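The clean and analyze steps above could look roughly like the following sketch. The column names `district` and `pollutant_value` and the three helper functions are hypothetical, chosen only for illustration; the real CSV files may use different headers. The sketch counts stops per district, averages the pollution measurements per district, and computes a Pearson correlation between the two.

```python
import pandas as pd


def stops_per_district(stops):
    """Count public transport stops in each district."""
    # Drop rows with no district so they are not counted anywhere.
    clean = stops.dropna(subset=["district"])
    return clean.groupby("district").size().rename("n_stops")


def mean_pollution_per_district(air):
    """Average the pollution measurements in each district."""
    # 'pollutant_value' is an assumed column name for the measurement.
    clean = air.dropna(subset=["district", "pollutant_value"])
    means = clean.groupby("district")["pollutant_value"].mean()
    return means.rename("mean_pollution")


def transport_vs_pollution(stops, air):
    """Join both per-district summaries and return (table, Pearson r)."""
    table = pd.concat(
        [stops_per_district(stops), mean_pollution_per_district(air)],
        axis=1,
        join="inner",  # keep only districts present in both datasets
    )
    r = table["n_stops"].corr(table["mean_pollution"])
    return table, r
```

With only a handful of districts, the correlation coefficient is descriptive at best: it can hint at a relationship between transport density and pollution, but it does not show that one causes the other.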
To organize ourselves, we kept a Trello board and created a Slack chat.
These are some interesting links: