On Day 4 of the Dashboard Week, we have been given the task of getting spatial data from the City of Melbourne Open Data through API to answer a business question of our own interest. I decided to look at the nearby On street Parking bays and the parking restrictions of the places of interest in the City of Melbourne. Finally I have built a dashboard that allows users to choose a place of interest to start with, then identify the walking range of nearby car parks, and finally further filter out by the time restrictions and payment type with the data available. The end to end process of this project is broken down different stages, among which this blog will be focusing on the data cleaning process – Scaffolding the dataset to fill up the days and time period that have no restrictions.

The datasets of the parking restriction illustrates all the restrictions of the weekdays based on the bay id of the parking bay. This means that for those periods of the day, and the days of the week where is no restriction at all, it does not exist in the dataset. But in order to allow the users to get those ‘sweet period’ and ‘sweet days’ based on the filters they choose, it is important to fill those gaps for the users to choose. This blog will introduce this process step by step: 1. Pivoting the dataset 2. filling in the days of the week with no restrictions 3. filling in the time of the day with no restrictions

1. Pivoting the dataset

The original dataset is a very wide table, with each row of data containing represents one Parking Bay with all the information about the time restrictions in different columns. The first step I took is to pivot the table, putting each time period into different roles, with other correspondent information in different columns within the same row.

2. Filling in the days of the week with no restrictions
The next step is getting a dataset where each bay_id has 7 rows of data indicating each day of the week. This is achieved by a summarize tool to group by bay_id and a generate row tool to generate 7 rows of data which is then appended to each bay_id. As a result we get the dataset with all the possible combinations of bay_id and weekdays, which is then joined by the original dataset to see which days do not have restrictions at all (the rows come out from the right of the join tool). These days are marked with “Free” and “No Restriction” by a formula tool, which will be unioned to the original dataset.

3. Filling in the time of the day with no restrictions

The next step is to filling the gaps of the day where there is no restrictions and free. To achieve this, I use a summarize tool to group by bayid and day of the week, and then for each bay on each day of the week, get the minimum start time and the maximum end time, which indicates the time period that are restricted for parking. And then 2 gaps are created for each day each bay: 1. start of the day to start of the restriction 2. end of restriction to end of the day (here we assume that there are no gaps between all the restriction period, which covers most of the cases). And then we pivoted the gaps into rows and created the tag “Free” and “No Restrictions” for those time period of the day, which will be unioned with the previous two datasets.

After we union the three datasets, we now have the full datasets with a complete table with all possible time period and days of the week in it, ready for tableau to visualise.

If you are interested in the final viz, please follow the link below to the Dashboard:


The Data School
Author: The Data School