About 8.2 million people live in the five boroughs of New York City, and thousands of motor vehicle accidents occur there each year under a variety of circumstances. The NYPD gathers data on each of these accidents and makes it available to the public on nycopendata.socrata.com. We decided to dig deeper into the crash data to see whether any underlying patterns or relationships could explain the high frequency of collisions. The data covers July 2012 to March 2022 and includes almost 2,000,000 observations.
- We accessed the data from "nycopendata.com" using Open Data API (OData API) and performed Data Connection with Tableau.
- Then, we cleaned the data using Python and stored it in Google Cloud Storage as a Bucket to create a virtual instance.
- We performed our analysis using Google's BigQuery in Google Cloud Platform and stored the query results in CSV files.
- After our analysis, we generated a report using Google Sites to share our Insights and give Recommendations.
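The project connected to the data through OData and Tableau. As an alternative illustration of pulling the same dataset in pages, the sketch below builds request URLs for the Socrata JSON (SODA) endpoint. The endpoint path, the `collision_id` ordering field, and the helper names `page_url` and `page_urls` are assumptions made for this example, not part of the original pipeline; only the dataset id `h9gi-nx95` comes from the dataset link in this README.

```python
from urllib.parse import urlencode

# Assumption: SODA JSON endpoint for the dataset id used in this README.
BASE_URL = "https://data.cityofnewyork.us/resource/h9gi-nx95.json"


def page_url(offset, limit=50000, order="collision_id"):
    """Build the URL for one page of rows (hypothetical helper)."""
    # $limit/$offset page through the rows; $order keeps the paging stable.
    params = {"$limit": limit, "$offset": offset, "$order": order}
    # safe="$" keeps the parameter names readable instead of %24-encoded.
    return f"{BASE_URL}?{urlencode(params, safe='$')}"


def page_urls(total_rows, limit=50000):
    """Return one URL per page needed to cover total_rows rows."""
    return [page_url(offset, limit) for offset in range(0, total_rows, limit)]
```

Each URL can then be fetched with any HTTP client and the pages concatenated before cleaning.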
Tableau Story Link: https://public.tableau.com/app/profile/aditya.agarwal1269/viz/NYPDMotorCollisionProject/Story1
Report Link (Google Slides): https://drive.google.com/file/d/16r6KAuHcV5lPYZfqCMQMxOaRkbRwGH77/view?usp=sharing
The Motor Vehicle Collisions crash table contains details on the crash event. Each row represents a crash event. The Motor Vehicle Collisions data tables contain information from all police-reported motor vehicle collisions in NYC. The police report (MV104-AN) is required to be filled out for collisions where someone is injured or killed, or where there is at least $1,000 worth of damage.
Dataset link: https://data.cityofnewyork.us/Public-Safety/Motor-Vehicle-Collisions-Crashes/h9gi-nx95
- Understanding the Data - It is important to understand our data and our problem statement, i.e., how to decrease the number of injuries and deaths in New York City.
- Preparing the Data - After understanding our dataset, it is essential to prepare the data. We used GCP BigQuery to remove null values and duplicate entries.
- Perform Analysis - We carried out a time-series analysis and made dashboards to understand more about the factors and causes of motor collisions in New York City.
- Get Insights - We generated interactive Tableau dashboards to support our findings and get insights from the data.
- Give Recommendations - Based on our analysis, we provide recommendations to decrease the number of motor collisions.
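The data preparation step (removing null values and duplicate entries) was done in BigQuery. A rough plain-Python equivalent is sketched below to show the idea. The column names `collision_id`, `crash_date`, and `crash_time` and the sample rows are assumptions for illustration, not the project's actual query or data.

```python
def is_null(value):
    """Treat None and blank or whitespace-only strings as null."""
    return value is None or str(value).strip() == ""


def clean_rows(rows, required=("crash_date", "crash_time"), key="collision_id"):
    """Drop rows with null required fields, then drop repeated keys."""
    seen_keys = set()
    cleaned = []
    for row in rows:
        # Skip rows missing the key or any required column.
        if is_null(row.get(key)) or any(is_null(row.get(col)) for col in required):
            continue
        # Keep only the first row seen for each key.
        if row[key] in seen_keys:
            continue
        seen_keys.add(row[key])
        cleaned.append(row)
    return cleaned


# Hypothetical sample rows, not taken from the real dataset.
sample_rows = [
    {"collision_id": "1", "crash_date": "07/04/2018", "crash_time": "16:05"},
    {"collision_id": "1", "crash_date": "07/04/2018", "crash_time": "16:05"},
    {"collision_id": "2", "crash_date": "", "crash_time": "8:15"},
    {"collision_id": "3", "crash_date": "01/20/2022", "crash_time": None},
    {"collision_id": "4", "crash_date": "01/20/2022", "crash_time": "16:55"},
]
cleaned_rows = clean_rows(sample_rows)
```

Which columns count as required is a design choice. Here only the date and time are required, so rows with a missing ZIP code or borough are kept for analyses that do not need a location.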
Before exploring the data, we created a list of questions we wanted to address:
- Is there a trend in the number of accidents?
- Is there a relationship between the time of day and the contributing factors of the accident? (Time Series Analysis)
- Which areas are more "Collision-prone" areas? (Collision prone analysis)
- Analysis performed using Google Cloud Platform:
  A) Which vehicle type caused the most injuries and deaths?
  B) Which factor caused most of the collisions?
- Analysis performed using Tableau:
  A) Detecting Collision-Prone Areas
  B) Time Series Analysis
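The two Google Cloud Platform questions are both group-and-aggregate queries. The sketch below shows the same aggregation in plain Python rather than BigQuery SQL. The column names and the sample rows are assumptions for illustration and do not reproduce the project's results.

```python
from collections import Counter


def harm_by_vehicle_type(rows):
    """Sum injuries plus deaths per vehicle type."""
    totals = Counter()
    for row in rows:
        vehicle = row.get("vehicle_type_code1") or "Unknown"
        injured = int(row.get("number_of_persons_injured") or 0)
        killed = int(row.get("number_of_persons_killed") or 0)
        totals[vehicle] += injured + killed
    return totals


def collisions_by_factor(rows):
    """Count collisions per contributing factor."""
    return Counter(
        row.get("contributing_factor_vehicle_1") or "Unspecified" for row in rows
    )


# Hypothetical sample rows, not taken from the real dataset.
sample_rows = [
    {"vehicle_type_code1": "Sedan",
     "contributing_factor_vehicle_1": "Driver Inattention/Distraction",
     "number_of_persons_injured": "1", "number_of_persons_killed": "0"},
    {"vehicle_type_code1": "Station Wagon/Sport Utility Vehicle",
     "contributing_factor_vehicle_1": "Driver Inattention/Distraction",
     "number_of_persons_injured": "2", "number_of_persons_killed": "1"},
    {"vehicle_type_code1": "Station Wagon/Sport Utility Vehicle",
     "contributing_factor_vehicle_1": "Following Too Closely",
     "number_of_persons_injured": "1", "number_of_persons_killed": "0"},
    {"vehicle_type_code1": "Bike",
     "contributing_factor_vehicle_1": "Unspecified",
     "number_of_persons_injured": "", "number_of_persons_killed": "0"},
]
top_vehicle = harm_by_vehicle_type(sample_rows).most_common(1)
top_factor = collisions_by_factor(sample_rows).most_common(1)
```

Blank injury or death counts are treated as zero here. Dropping such rows instead would be an equally reasonable choice, depending on how the cleaning step handled them.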
- The peak time of day for injuries was between 4 pm and 5 pm, when the maximum number of people got injured.
- The number of people injured rose from 2012 and peaked in 2018 at 123,859 injuries.
- After the 2018 peak, the total number of injured people decreased to 29,604 in 2022 (data through March 2022).
- The highest number of deaths and injuries was caused mainly by a lack of driver attention. The other factors also point toward the driver's lack of driving skills.
- Most of the accidents involved Sports utility/Station wagon vehicles, followed by Sedan and Passenger vehicles.
- Also, 4-wheeled vehicles were more prone to accidents than 2-wheeled vehicles.
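The hourly and yearly findings above come from the Tableau time series analysis. A minimal sketch of the same bucketing in Python follows, assuming `crash_date` is formatted as MM/DD/YYYY and `crash_time` as H:MM as in a CSV export. The column names and sample rows are hypothetical and do not reproduce the reported figures.

```python
from collections import Counter
from datetime import datetime


def injuries_by_hour_and_year(rows):
    """Return (injuries per hour of day, injuries per year)."""
    by_hour, by_year = Counter(), Counter()
    for row in rows:
        # Assumed formats: date "MM/DD/YYYY", time "H:MM" (24-hour clock).
        stamp = datetime.strptime(
            f"{row['crash_date']} {row['crash_time']}", "%m/%d/%Y %H:%M"
        )
        injured = int(row.get("number_of_persons_injured") or 0)
        by_hour[stamp.hour] += injured
        by_year[stamp.year] += injured
    return by_hour, by_year


# Hypothetical sample rows, not taken from the real dataset.
sample_rows = [
    {"crash_date": "07/04/2018", "crash_time": "16:05", "number_of_persons_injured": "2"},
    {"crash_date": "07/04/2018", "crash_time": "16:40", "number_of_persons_injured": "1"},
    {"crash_date": "03/12/2018", "crash_time": "8:15", "number_of_persons_injured": "1"},
    {"crash_date": "01/20/2022", "crash_time": "16:55", "number_of_persons_injured": "1"},
    {"crash_date": "02/02/2022", "crash_time": "13:30", "number_of_persons_injured": "0"},
]
by_hour, by_year = injuries_by_hour_and_year(sample_rows)
peak_hour = max(by_hour, key=by_hour.get)  # hour of day with the most injuries
```

Because the data ends in March 2022, a yearly total for 2022 covers only part of the year and is not directly comparable with full-year totals.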
- Increase the number of Traffic Officers between 4 pm and 6 pm on days with the highest accident rates.
- Raise the availability of ambulances between 1 pm and 5 pm in collision-prone areas.
- Provide a more robust and efficient Public transit system to encourage usage by commuters.
- Focus on high collision-prone areas such as 11236, 11207, and 11234 when prioritizing new projects like traffic lights or street signs.
- Increase the frequency of driver re-training and impose stricter fines for repeat offenders.
- Increase awareness among commuters about using public transport instead of walking or using personal vehicles to reduce accidents.
- Among all the boroughs, BROOKLYN and QUEENS had the highest number of deaths in New York City.
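The ZIP code and borough rankings behind these recommendations were produced in Tableau. As a rough illustration of how such rankings can be computed, the sketch below counts collisions per ZIP code and deaths per borough. The column names and the sample rows are assumptions, and the counts are toy values, not the project's figures.

```python
from collections import Counter


def top_zip_codes(rows, n=3):
    """Return the n ZIP codes with the most collisions, skipping blanks."""
    counts = Counter(
        row["zip_code"].strip()
        for row in rows
        if (row.get("zip_code") or "").strip()
    )
    return counts.most_common(n)


def deaths_by_borough(rows):
    """Sum the number of persons killed per borough."""
    totals = Counter()
    for row in rows:
        borough = row.get("borough") or "Unknown"
        totals[borough] += int(row.get("number_of_persons_killed") or 0)
    return totals


# Hypothetical sample rows with toy counts, not taken from the real dataset.
sample_rows = [
    {"zip_code": "11236", "borough": "BROOKLYN", "number_of_persons_killed": "1"},
    {"zip_code": "11236", "borough": "BROOKLYN", "number_of_persons_killed": "0"},
    {"zip_code": "11236", "borough": "BROOKLYN", "number_of_persons_killed": "0"},
    {"zip_code": "11207", "borough": "BROOKLYN", "number_of_persons_killed": "0"},
    {"zip_code": "11207", "borough": "BROOKLYN", "number_of_persons_killed": "1"},
    {"zip_code": "11234", "borough": "BROOKLYN", "number_of_persons_killed": "0"},
    {"zip_code": "", "borough": "QUEENS", "number_of_persons_killed": "1"},
]
ranked_zips = top_zip_codes(sample_rows, n=2)
borough_deaths = deaths_by_borough(sample_rows)
```

Rows without a ZIP code are left out of the ZIP ranking but still count toward the borough totals, so the two rankings can be based on different numbers of rows.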



