During a recent client project involving data cleaning, I encountered a particularly intriguing problem. Recognizing its potential value for future endeavors, I made the decision to delve into the issue and document my findings in a blog post. This blog aims to shed light on the problem I faced, providing insights and solutions that can be invaluable to individuals engaged in similar data cleaning tasks. By sharing my experiences and strategies, I hope to empower others in their data cleaning endeavors and contribute to the broader data community.

Problem Statement:

Our primary objective was to transform the original table, which contained quarterly export values for Australia, into a format that represented the financial year values. To achieve this, we needed to aggregate the quarterly values from four separate columns into a single column representing the financial year. This conversion would enable us to analyze and compare export values on a yearly basis rather than a quarterly one. By consolidating the data in this manner, we aimed to gain a clearer understanding of Australia’s export performance over each financial year.

Step 1: Import the file to Alteryx 

When we exported the data into Alteryx, it appeared in a messy format that wasn’t what we wanted for the final result. The data was divided into two rows: one for the year and another for the quarter. This made it difficult to work with and analyze effectively. To fix this, I decided to convert the quarter numbers into their respective months and then combine them with the corresponding year. By doing so, we could obtain a format that better suited our needs and allowed us to make sense of the data more easily.

Step 2: Data Preparation

Since the table contains numerous null rows and columns, it necessitates some data preparation work. To address this issue, we can utilise Alteryx’s Data Cleansing, Select, and Select Records tools.

Step 3: Separating the Table into Header and Values and Concatenating the Header

We use the Select Record Tool to separate the table into header and values, it looks like this:

we can use a Formula Tool to assign value = 1 to F6

Then we can group row1 and row2 by F6 and Concatenating them by using the Summarise Tool

 

 

 

 

 

 

Output:

To make the date format more neat, we can use the Multi-field Tool to replace the comma with a hypen

Step 4: Dynamic Rename the title and Union the “Header Rows” with the “Value Rows”

After combining the year and month rows in Alteryx, we noticed that the column names were automatically prefixed with “Concat_”. To remove this prefix and have the column names display only the concatenated year and month values, we can use the Dynamic Rename Tool in Alteryx. This tool allows us to modify column names dynamically, ensuring that the desired column names are achieved without the unwanted prefix.

 

After that we can use the Union Tool to Stack the “Header Row” With the “Value Rows” by choosing” Auto Config by Name” , then use the Dynamic Rename Tool again to promote the first row to Header.

Step5: Transpose the data and group every four records as one Financial Year

Now we got a table like this, we need to Transpose rows with header of Y-M to columns. To achieve this, we can use a Transpose Tool.

Keep the Column “Type” as the Key Column

Output:

Step6: Generate RecordID and Group by Financial Year

in order to sum the value for each financial year, we can firstly use the Record ID Tool to generate record ID for each rows, then we can use the Formula Tool to group every four records as a Financial Year

Output:

Step 7: Change Data Type, Summarize, and Add Financial Year Column

Since we already have the records for each Type for each Financial Year, we can use the Select Tool to change the Value to Double, then use the Summarize Tool to group them by RecordID and Type. Finally use the Formula Tool to generate a Field for the name of the Financial Year.

Output:

Then we complete the task!

Hope you like this blog.

 

 

 

The Data School
Author: The Data School