This project uses Databricks on Azure to build a data processing system for analyzing electric car sales and road accidents in the UK. A medallion architecture pipeline processes the data in stages to reveal trends and patterns in electric vehicle (EV) adoption and safety.
Data Ingestion: Utilizes Databricks to extract data from various sources, including sales records, accident reports, and external datasets.
Data Transformation: Cleanses and transforms the raw data to ensure consistency and accuracy, using Azure Databricks' powerful processing capabilities.
Data Storage: Writes the transformed data as Delta Lake tables in a scalable and secure Azure data lake, and reads it back from there for efficient querying.
Data Analysis: Performs comprehensive analyses on the data to identify key trends and patterns in EV sales and accidents, using Databricks' advanced analytics tools.
Visualization: Presents the analysis results through intuitive and interactive visualizations, enabling stakeholders to make data-driven decisions.
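
The stages above can be sketched as a medallion flow, where raw (bronze) records are cleaned into a silver layer and then aggregated into an analysis-ready gold layer. This is a minimal sketch in plain Python so it runs anywhere; on Databricks the same steps would typically be PySpark DataFrame operations writing to Delta tables. The column names (`accident_id`, `year`, `fuel_type`), the `source` tag, and the sample rows are hypothetical placeholders, not the schema or contents of the real datasets.

```python
from collections import defaultdict

# Hypothetical raw rows, standing in for a CSV export.
# Column names and values are placeholders, not the real dataset schema.
RAW_ROWS = [
    {"accident_id": "A1", "year": "2021", "fuel_type": "Electric "},
    {"accident_id": "A1", "year": "2021", "fuel_type": "Electric "},  # duplicate
    {"accident_id": "A2", "year": "2021", "fuel_type": "petrol"},
    {"accident_id": "A3", "year": "2022", "fuel_type": "ELECTRIC"},
    {"accident_id": "A4", "year": "", "fuel_type": "electric"},  # missing year
]


def to_bronze(rows):
    """Bronze: keep the raw fields unchanged, only tagging where they came from."""
    return [dict(row, source="accidents_csv") for row in rows]


def to_silver(bronze):
    """Silver: drop duplicates and incomplete rows, normalise types and text."""
    seen = set()
    silver = []
    for row in bronze:
        if not row["year"] or row["accident_id"] in seen:
            continue
        seen.add(row["accident_id"])
        silver.append({
            "accident_id": row["accident_id"],
            "year": int(row["year"]),
            "fuel_type": row["fuel_type"].strip().lower(),
        })
    return silver


def to_gold(silver):
    """Gold: aggregate to a small table of electric-vehicle accidents per year."""
    counts = defaultdict(int)
    for row in silver:
        if row["fuel_type"] == "electric":
            counts[row["year"]] += 1
    return dict(counts)


gold = to_gold(to_silver(to_bronze(RAW_ROWS)))
print(gold)
```

Keeping each layer as its own function mirrors the idea that every stage can be stored and re-read independently, so a cleaning rule can change without re-ingesting the raw data.
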
- To understand the current landscape of electric vehicle sales in the UK.
- To analyze the correlation between the increase in electric vehicle adoption and the occurrence of accidents.
- To provide actionable insights and recommendations for improving road safety and promoting the adoption of electric vehicles.
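
One way to approach the correlation objective is a Pearson correlation coefficient between two yearly series. The sketch below is an illustration only: the function name `pearson` and both input lists are made-up toy values chosen to make the arithmetic easy to follow, not figures from the datasets, and the project itself may use a different measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Toy values for illustration only, not real statistics.
ev_registrations = [1, 2, 3, 4]
ev_accidents = [2, 4, 6, 8]

r = pearson(ev_registrations, ev_accidents)
print(round(r, 3))
```

A coefficient near 1 or -1 indicates a strong linear relationship but does not by itself show that one series causes the other.
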
- Databricks
- Azure
- Apache Spark
- PySpark
- Databricks SQL / Apache Spark SQL
https://www.data.gov.uk/dataset/cb7ae6f0-4be6-4935-9277-47e5ce24a11f/road-accidents-safety-data
https://www.gov.uk/government/statistical-data-sets/vehicle-licensing-statistics-data-files

