Have you ever bought a Powerball ticket? Some players analyse the trends in past draws in the hope of slightly improving their chances of winning.

Today, I will show you how to use Alteryx to collect the historical Powerball data into one table.

What do we need?

  1. URL: https://australia.national-lottery.com/powerball/results-archive-2023 (you can replace the last four digits with the year you want)
  2. Alteryx
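
The year substitution in the URL can be sketched in Python as follows. This is only an illustration: the helper name archive_url and the year range are assumptions, and it has not been checked which years the site actually publishes.

```python
# Base of the archive URL from the list above; the year is appended to it.
BASE_URL = "https://australia.national-lottery.com/powerball/results-archive-"


def archive_url(year: int) -> str:
    """Return the results-archive URL for one four-digit year (hypothetical helper)."""
    if not 1000 <= year <= 9999:
        raise ValueError("year must have four digits")
    return f"{BASE_URL}{year}"


# Illustrative range only; adjust to the years you want.
urls = [archive_url(year) for year in range(2021, 2024)]
for url in urls:
    print(url)
```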

How to do it?

Step 1: Understand the HTML structure

You don’t need to be an HTML expert to find the data you need. You only need to know where to look for it.

The Powerball data I need is all inside the <table> element, so I just need to figure out how to extract that part of the page.

Step 2: Figure Out the Regex Expression You Need

To capture the data we want, we can use regular expressions (Regex). For a brief overview of Regex, you can refer to this article.

The Regex expressions we need in this example are:

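.* : greedy; captures everything between an opening tag and the last matching closing tag.

.*? : lazy; captures as little as possible, so it returns every separate occurrence between an opening tag and the next closing tag.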
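
To see the difference between the two expressions, here is a minimal Python sketch. The HTML snippet is made up for illustration and is not taken from the lottery page.

```python
import re

# Made-up snippet with three list items.
html = "<li>4</li><li>17</li><li>23</li>"

# Greedy: .* runs to the LAST </li>, so everything comes back as one match.
greedy = re.findall(r"<li>(.*)</li>", html)

# Lazy: .*? stops at the NEXT </li>, so each item is its own match.
lazy = re.findall(r"<li>(.*?)</li>", html)

print(greedy)  # ['4</li><li>17</li><li>23']
print(lazy)    # ['4', '17', '23']
```

This is why the greedy form suits grabbing one big block, while the lazy form suits splitting repeated elements.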

Step 3: Build a Workflow

The workflow includes five Regex tools in total. Their functions are described below:

1st Regex Tool:

Function: Get the data inside the results table (between the opening <table> tag and </table>)

Regex Expression: <table class="table powerball mobFormat mobResult" style="width: 100%;">(.*)</table>

Output Method: Parse
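
Outside Alteryx, the same parse step could look roughly like this in Python. The page content here is a made-up stand-in, and re.DOTALL is added on the assumption that the real page contains line breaks inside the table.

```python
import re

# Made-up stand-in for the downloaded page; the real page is much larger.
page = '''<html><body>
<table class="table powerball mobFormat mobResult" style="width: 100%;">
<tbody><tr><td>row 1</td></tr></tbody>
</table>
</body></html>'''

# Same expression as the 1st Regex tool, written with straight quotes.
# re.DOTALL lets "." also match line breaks.
TABLE_RE = re.compile(
    r'<table class="table powerball mobFormat mobResult" style="width: 100%;">(.*)</table>',
    re.DOTALL,
)

match = TABLE_RE.search(page)
# Keep only the captured group, like the Parse output method does.
table_html = match.group(1) if match else ""
print(table_html.strip())
```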

2nd Regex Tool:

Function: Get data in each draw in separate rows

Regex Expression: <tr>(.*?<td class="noBefore colour">.*?</td>.*?</td>.*?</td>.*?)</tr>

Output Method: Tokenize
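
Tokenize returns every match instead of just one, which is close to what re.findall does in Python. Below is a small sketch with two made-up rows; the cell contents are placeholders, and only the class name "noBefore colour" comes from the expression above.

```python
import re

# Two made-up draw rows, each with three cells.
table_html = (
    '<tr><td class="noBefore colour">Draw A</td>'
    '<td class="noBefore">1 2 3</td><td class="noBefore">0 winners</td></tr>'
    '<tr><td class="noBefore colour">Draw B</td>'
    '<td class="noBefore">4 5 6</td><td class="noBefore">2 winners</td></tr>'
)

# Same expression as the 2nd Regex tool: one match per draw row.
ROW_RE = re.compile(
    r'<tr>(.*?<td class="noBefore colour">.*?</td>.*?</td>.*?</td>.*?)</tr>',
    re.DOTALL,
)

rows = ROW_RE.findall(table_html)
for row in rows:
    print(row)
```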

3rd Regex Tool:

Function: Separate draw info (date and number) data, ball numbers and winners’ data into different columns

Regex Expression: <td .*?>(.*?)</td>

Output Method: Tokenize
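
As a rough Python equivalent, the same expression splits one row into its cells. The row below is made up, and the three variable names are assumptions chosen to mirror the columns described above.

```python
import re

# One made-up draw row (the output of the previous step).
row = (
    '<td class="noBefore colour">Draw A</td>'
    '<td class="noBefore"><ul><li class="ball">1</li></ul></td>'
    '<td class="noBefore">0 winners</td>'
)

# Same expression as the 3rd Regex tool: the content of every <td ...> cell.
CELL_RE = re.compile(r"<td .*?>(.*?)</td>", re.DOTALL)

cells = CELL_RE.findall(row)
# Assumed column order: draw info, ball numbers (still HTML), winners.
draw_info, balls_html, winners = cells
print(draw_info)
print(balls_html)
print(winners)
```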

4th Regex Tool:

Function: Get data between <strong> and </a>

Regex Expression: <strong>(.*?)</a>

Output Method: Parse

5th Regex Tool:

Function: Separate ball numbers in different columns

Regex Expression: <li class=.*?>(.*?)</li>

Output Method: Tokenize
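
The last step can be sketched in Python like this. The class names "ball" and "powerball" are made up for the example; the expression itself accepts any class value.

```python
import re

# Made-up ball-number cell; class names are placeholders.
balls_html = (
    '<ul><li class="ball">4</li><li class="ball">17</li>'
    '<li class="powerball">9</li></ul>'
)

# Same expression as the 5th Regex tool: one match per <li> element.
BALL_RE = re.compile(r"<li class=.*?>(.*?)</li>")

# Convert each captured number to an integer so it can be analysed later.
balls = [int(number) for number in BALL_RE.findall(balls_html)]
print(balls)  # [4, 17, 9]
```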

All the workflows can be found at this link: https://drive.google.com/drive/folders/1ArJ7oP3jVHC6Qh0RvjGJ7y_tlc6a8Icg?usp=drive_link

Author: Yi Gao

Yi has a master’s degree in data science from The University of Sydney and a background in engineering and manufacturing. She is passionate about finding insights from large and diverse datasets and applying them to real-world scenarios. In her previous role at Daimler China, she analysed vehicle usage data, provided recommendations and created dashboards for internal customers. In her spare time, she enjoys photography, especially of animals and her two sons, and cooking traditional Chinese food.