Data-Science-Projects-Code/DenverDriving

DenverDriving

10+ years of Denver Traffic Accident data to show you're not that bad of a driver



Description

This is an end-to-end project. It gets data from the City of Denver Open Data Catalog, does some processing of the data, and uploads the result to Kaggle as the Denver Traffic Accidents dataset.

Note: The City of Denver now hosts its data on an Esri geodata server with an Ember.js front end. Unfortunately, this means I need to figure out Ember's dynamic web pages in order to get around Esri's default 2000-record limit. This doesn't affect the data currently on Kaggle, only the process of updating it.
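
One possible way around the record limit, sketched here rather than taken from this repo, is to skip the front end and page through the layer's REST query endpoint. This assumes the layer supports pagination through the `resultOffset` and `resultRecordCount` query parameters and reports remaining data with `exceededTransferLimit`. The URL below is a hypothetical placeholder, and `fetch_page` and `fetch_all` are illustrative names, not functions in `callreq.py`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical placeholder: the real layer URL has to be looked up on the city's server.
LAYER_QUERY_URL = "https://example.com/arcgis/rest/services/<service>/FeatureServer/0/query"

PAGE_SIZE = 2000  # the default per-request record limit mentioned above


def fetch_page(offset, page_size=PAGE_SIZE, url=LAYER_QUERY_URL):
    """Request one page of records and return the parsed JSON response."""
    params = urlencode({
        "where": "1=1",        # no filter: ask for every record
        "outFields": "*",      # all columns
        "f": "json",
        "resultOffset": offset,
        "resultRecordCount": page_size,
    })
    with urlopen(f"{url}?{params}") as resp:
        return json.load(resp)


def fetch_all(fetch=fetch_page, page_size=PAGE_SIZE):
    """Collect the attribute dict of every record, one page at a time."""
    rows = []
    offset = 0
    while True:
        page = fetch(offset, page_size)
        features = page.get("features", [])
        rows.extend(feature["attributes"] for feature in features)
        # Assumption: the server sets exceededTransferLimit while more records remain.
        if not features or not page.get("exceededTransferLimit", False):
            break
        offset += len(features)
    return rows
```

Passing the page fetcher in as an argument keeps the loop testable without network access. Stable paging may also need an explicit sort field, which is left out here because the layer's field names are unknown.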

Project layout:

  • src folder
    • callreq.py - downloads a CSV of all data with requests
    • pipeline.py - transforms data
  • notebooks folder with:
    • Denver_Traffic_Accidents_All_In_One
    • Denver_Traffic_Accidents_Data_Display notebooks
  • images folder with images for both Kaggle and GitHub
  • tests folder

Discussion

According to the CDC, traffic accidents cost Colorado $943M in 2018 alone, and that number doesn't seem to include property loss. In 2022, an article in Denverite noted there were just over 3,100 wrecks on streets with a speed limit of 25 mph. Of those 3,100+ accidents, 84 resulted in a fatality. As a result, the city approved the "20 is plenty" ordinance, cutting the speed limit to 20 mph. Both sources are good and might be a starting point for additional visualizations.

Takeaways from the Sample EDA Notebook

Kaggle wants an EDA notebook included with a dataset submission for it to get a perfect rating. I find this annoying, as what gets passed off as EDA is often just the output of a generic, minimal script (head(), shape, describe(), etc.). I try to put more thought into it, but I don't live in Denver and haven't been back since attending a data science program there. So I don't have much at stake, but I also don't have any local knowledge that could be the basis of deeper insights. I went with questions I wanted answered, more as a way of displaying the data than of telling a story with it.

Still, there are a few takeaways.

  1. While there are some common accident sites (shown here with a heatmap), the entire city is fair game.

all Denver accidents over the last 10 years


  2. Covid had a major positive impact on accidents (that is, there were fewer of them).

all Denver accidents over the last 10 years


  3. Ruling out non-causes such as "No Apparent", "Other" and "Pending Investigation and/or Court Hearing", we can see that aggressive and distracted driving are major issues. Notably, cell phones are a smaller factor than one might expect.

Accidents by Factor


  4. DUI/DWAI/DUID-related accidents match our intuition by happening more often in the evening hours.

all Denver accidents over the last 10 years


  5. There is a slight degree of seasonality. This is also seen in the monthly chart above, which aggregates the 10 years.

Slight seasonality present
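
The contributing-factor and time-of-day takeaways above come down to two small aggregations. The following is a minimal sketch of them, not the notebook's actual code. The field names `contributing_factor` and `occurred_at` are hypothetical, and the non-cause labels are copied from the text above, so their spelling and case may differ in the real dataset.

```python
from collections import Counter
from datetime import datetime

# Labels treated as non-causes, as listed in the takeaway above.
NON_CAUSES = {"No Apparent", "Other", "Pending Investigation and/or Court Hearing"}


def factor_counts(records, factor_key="contributing_factor"):
    """Count accidents per contributing factor, skipping non-causes and blanks."""
    return Counter(
        r[factor_key]
        for r in records
        if r.get(factor_key) and r[factor_key] not in NON_CAUSES
    )


def accidents_by_hour(records, time_key="occurred_at"):
    """Count accidents per hour of day (0-23) from ISO-formatted timestamps."""
    return Counter(datetime.fromisoformat(r[time_key]).hour for r in records)


# Made-up sample rows, only to show the shape of the input.
sample = [
    {"contributing_factor": "Aggressive Driving", "occurred_at": "2020-01-03 21:15:00"},
    {"contributing_factor": "Other", "occurred_at": "2020-01-04 08:05:00"},
    {"contributing_factor": "Distracted", "occurred_at": "2020-01-05 21:40:00"},
]
print(factor_counts(sample).most_common())
print(sorted(accidents_by_hour(sample).items()))
```

To look at one group only, such as DUI/DWAI/DUID-related accidents, filter the records before passing them to `accidents_by_hour`.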

Status

  1. Kaggle: Dataset and accompanying notebook uploaded with infrequent updates
  2. Acquiring update information: 😖
  3. GitHub: Expanding and refactoring
