This project develops machine learning models to predict CO2 emissions and to categorize boroughs as Low, Medium, or High Emission Areas, as well as to group them into emission intensity clusters, based on road characteristics and vehicle types.
The project uses a dataset describing road characteristics and the pollution caused by different types of vehicles. Its features include borough name, road length, type of pollutant emitted, and the amount of pollution caused by petrol, diesel, and electric vehicles.
Exploratory data analysis was conducted to understand the characteristics of the dataset.
A linear regression model was trained to predict CO2 emissions based on road length and pollution caused by different types of vehicles.
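As a rough illustration of this step, the sketch below fits a scikit-learn `LinearRegression` on synthetic stand-in data. The column names (`road_length`, `petrol`, `diesel`, `electric`, `co2`), the generated values, and the 80/20 split are assumptions made for the example and are not taken from the project's dataset or code.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: column names and coefficients are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "road_length": rng.uniform(0.1, 5.0, n),
    "petrol": rng.uniform(0.0, 100.0, n),
    "diesel": rng.uniform(0.0, 100.0, n),
    "electric": rng.uniform(0.0, 5.0, n),
})
df["co2"] = (
    2.0 * df["road_length"]
    + 1.0 * df["petrol"]
    + 1.2 * df["diesel"]
    + 0.1 * df["electric"]
    + rng.normal(0.0, 1.0, n)  # measurement noise
)

features = ["road_length", "petrol", "diesel", "electric"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["co2"], test_size=0.2, random_state=0
)

# Fit on the training rows and score on the held-out rows.
reg = LinearRegression().fit(X_train, y_train)
r2 = r2_score(y_test, reg.predict(X_test))
print(f"R^2 on held-out rows: {r2:.3f}")
```

With the real dataset, the `DataFrame` would presumably be loaded from file and the feature list adjusted to the actual column names.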
A random forest classifier was trained to categorize boroughs into Low, Medium, or High Emission Areas based on their emissions profile.
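A minimal sketch of such a classifier is shown below. It assumes the Low, Medium, and High labels are derived by splitting total emissions into three equal-sized bins with `pandas.qcut`; the README does not say how the labels were actually defined, so that rule, the column names, and the synthetic data are all illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic emissions profiles; column names are hypothetical.
rng = np.random.default_rng(1)
n = 300
profile = pd.DataFrame({
    "petrol": rng.uniform(0.0, 100.0, n),
    "diesel": rng.uniform(0.0, 100.0, n),
    "electric": rng.uniform(0.0, 5.0, n),
})

# Assumed labelling rule: tertiles of total emissions.
total = profile.sum(axis=1)
area_label = pd.qcut(total, q=3, labels=["Low", "Medium", "High"]).astype(str)

P_train, P_test, l_train, l_test = train_test_split(
    profile, area_label, test_size=0.25, random_state=0, stratify=area_label
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(P_train, l_train)
accuracy = accuracy_score(l_test, clf.predict(P_test))
print(f"Held-out accuracy: {accuracy:.2f}")
```

Stratifying the split keeps the three classes balanced in both the training and test sets, which matters when the number of boroughs is small.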
A K-Means clustering model was trained to group boroughs into clusters based on pollution caused by different types of vehicles.
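The clustering step might look roughly like the sketch below. The number of clusters (three), the feature scaling with `StandardScaler`, and the synthetic data are assumptions for illustration; the README does not state which settings the project used.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Three synthetic groups of boroughs; columns stand for petrol, diesel,
# and electric pollution (hypothetical values).
rng = np.random.default_rng(2)
centres = np.array([[20.0, 20.0, 1.0], [60.0, 50.0, 2.0], [90.0, 95.0, 4.0]])
pollution = np.vstack(
    [c + rng.normal(0.0, [3.0, 3.0, 0.3], size=(30, 3)) for c in centres]
)

# Scale features so the small electric values are not swamped by the
# much larger petrol and diesel values.
pollution_scaled = StandardScaler().fit_transform(pollution)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(pollution_scaled)
print(np.bincount(cluster_ids))
```

In practice the number of clusters would usually be chosen by comparing several values, for example with an elbow plot or silhouette scores.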
The performance of each model was assessed using appropriate evaluation metrics.
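The README does not name the metrics that were used. As a hedged example, common choices in scikit-learn are mean squared error and R² for regression, accuracy for classification, and the silhouette score for clustering. The toy values below are made up purely to show the calls.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    mean_squared_error,
    r2_score,
    silhouette_score,
)

# Regression: compare predicted and true CO2 values (toy numbers).
co2_true = np.array([3.0, 5.0, 7.0])
co2_pred = np.array([2.5, 5.0, 7.5])
mse = mean_squared_error(co2_true, co2_pred)
r2_toy = r2_score(co2_true, co2_pred)

# Classification: fraction of boroughs assigned the correct label.
label_true = ["Low", "High", "Medium", "Low"]
label_pred = ["Low", "High", "Low", "Low"]
acc = accuracy_score(label_true, label_pred)

# Clustering: silhouette score near 1 means tight, well-separated clusters.
points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
assignments = np.array([0, 0, 1, 1])
sil = silhouette_score(points, assignments)

print(mse, r2_toy, acc, sil)
```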
The machine learning models developed in this project offer insight into CO2 emission patterns and help categorize boroughs by emission intensity. Future work could explore further improvements and refinements to the models.
- Scikit-learn Documentation - Documentation for the scikit-learn library.
- Pandas Documentation - Documentation for the pandas library.
- [Ibrahim Amr]
- [Alexandros Arcudis]
- [Chantal Maskell]
- [Aleksander Palamarczuk]