Day two of Dashboard Week fell on the day of the Melbourne Cup, a famous annual horse race here in Australia. So for Dashboard Week we were tasked with web scraping statistics on the horses from various betting websites.
Because of how the website was set up, the Download tool was not able to scrape the pages, so Alekh from DSAU2 provided us with a Python script that downloaded the data instead. There were three main tools that I used to parse the data quickly:
1. The Tokenize output method on the Regex Tool
Tokenize finds every match of the specified regular expression in the data. By splitting the matches to rows, it was quick to pull out all the statistics for all the horses at once.
2. The Tile Tool
I had not used the Tile tool before, but it was good to learn how easily it can create group numbers and sequence numbers. Where before I would have used a Multi-Row Formula tool, the Tile tool did all the work for me. By specifying 24 tiles, one for each horse in the data set, it gave each horse its own tile number (1 to 24) and numbered the 22 parts of that horse's data with sequence numbers (1 to 22). This made it quick to cross-tab the data for each horse.
3. Multi-Field Formula Tool with the Regex Replace Function
After parsing the data I found there were many fields in a similar format, each holding the number of races a horse had run and how many 1st, 2nd and 3rd places it took in those races. By using the Multi-Field Formula tool I could write just one expression to isolate the number I wanted for all the fields at once. The Regex Replace function was used to replace the parts of each field that weren’t the part I wanted with blanks.
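For anyone who wants to see the same logic outside Alteryx, the three steps above can be sketched roughly in Python. This is a minimal sketch and not the actual workflow: the sample text, the field names (Name, Career, Track) and the starts:wins-seconds-thirds format are all made up for illustration, and the real data had 24 horses with 22 parts each rather than 3 horses with 3 parts.

```python
import re

# Hypothetical scraped text (not the real data): every horse contributes
# the same fields in the same order.
raw = (
    "Name: Horse A | Career: 10:3-2-1 | Track: 4:1-0-1 | "
    "Name: Horse B | Career: 8:1-1-2 | Track: 2:0-1-0 | "
    "Name: Horse C | Career: 15:5-0-3 | Track: 6:2-1-1 | "
)

# 1. Tokenize, split to rows: one (label, value) row per regex match.
rows = re.findall(r"(\w+): ([^|]+?) \|", raw)

# 2. Tile with equal records: consecutive rows share a tile number (the horse)
#    and get a sequence number (the position of the field within that horse).
n_tiles = 3                        # number of horses in this toy example
tile_size = len(rows) // n_tiles   # fields per horse
tiled = [
    (i // tile_size + 1, i % tile_size + 1, label, value)
    for i, (label, value) in enumerate(rows)
]

# Cross-tab: one record per tile number, one column per field label.
horses = {}
for tile_num, seq_num, label, value in tiled:
    horses.setdefault(tile_num, {})[label] = value

# 3. Multi-field regex replace: a single expression applied to every stat
#    field. It blanks out the starts ("10:") and everything from the first
#    "-" onwards, leaving only the number of wins.
stat_fields = ["Career", "Track"]
wins_by_horse = {
    record["Name"]: {
        field: int(re.sub(r"^\d+:|-.*$", "", record[field]))
        for field in stat_fields
    }
    for record in horses.values()
}

print(wins_by_horse)
```

Changing the pattern in step 3 is all it takes to pull out a different number, which is the same convenience the Multi-Field Formula tool gives: one expression, many fields.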
The final Alteryx workflow can be seen below:
For the final dashboard I wanted to visualize how the bookies' odds compared with the final placing of the horses. The winner of the 2019 Melbourne Cup, Vow And Declare, was ranked 4th in terms of odds before the race. I found that the top 8 horses by final placing were all ranked quite highly by the odds given to them, which made the bookies' predictions quite accurate. The final dashboard can be seen below and viewed on Tableau Public here.