Select Page

Welcome to yet another dashboard week blog, in this episode we look at some motor accident data collected from VicRoads (Victoria, Australia).

Basically, this dataset looks at all motor vehicle accidents reported to the police in Victoria. Its a very commonly used dataset in Tableau because its freely available to the public but even the visualisations on the government website could use some refreshing. So that’s what I aimed to do yesterday!

The Data

Firstly, its a relatively easy to use dataset. Everything is separated into different tables/sheets/files and they can all be connected by the Accident ID.I wanted to look at how these accidents are affected by time. Not necessarily how they track over the years, but how they track over the day. What times are do most of the accidents occur? And are they affected by the type of accident?

Now there’s two approaches to this, either you look at the plain ol’ time data of the accidents, or you take into account other factors. I decided to look at how the accidents are affected by peak hours, and by the time of sunrise and sunsets. So I collected some sunrise and sunset data for Melbourne from here Sunrise and sunset times Melbourne, 2016 (sunrise-and-sunset.com).

There is an issue with this however. All of the data ranges across all of Victoria and so the time of sunsets/sunrises does vary (due to the land distances). So taking only the Melbourne sunsets/sunrises that isn’t entirely accurate, but there’s only maybe a half hour difference between the extremes. Which I am willing to ignore for the particular exercise.

Time Range Reference Lines

These are the worst. I had no idea that working with time data was not as straight forward as integers. For example, Tableau doesn’t seem to have a good way to calculate averages of dates or times. And so if I have a range of sunrise/sunset times, how do I find out what the average time of a sunset is?

Making static reference lines is easy enough, but making them dynamic based on various inputs is really tricky. In the end, I decided to use a dual axis, with the secondary axis an integer (minutes). This allowed me to convert a time into total number of minutes, then use that number to average the sunset/sunrise times. Then use that as a dynamic reference line.

Overall, it was more difficult than expected. But it was fun to find a workaround.

Relative Ranges

A metric I thought would be more accurate was the difference in accident time to sunrise and sunset. Sunsets vary over time, so taking the absolute time of the day doesn’t reflect the influence of having the sun in your eyes as you drive.

The Goal

My overall goal was to produce the Viz that I WANTED to see when I first heard about the data. But also I wanted to challenge myself to working with dates in a way I haven’t before. Definitely a fun day.

Author: Kevin Prescilla

As a late-stage PhD candidate, Kevin’s appreciation for data analytics grew during his studies into poultry nutrition, or as he calls it, “chickens”. It was this appreciation which spurred his decision to change career paths and ultimately led him to apply to the Data School. In his spare time he enjoys powerlifting – ever challenging himself to beat his last max weight - as well as all kinds of gaming, from board to PC. If Kevin could go anywhere in the world, where would it be and why? Well, the answer is Antarctica, as he is fascinated with how people can live and survive down there (although some might argue because it’s the furthest place you can go on Earth from a chicken).