The Aim of this blog:
The aim of this blog is to not only demonstrate the power of Alteryx in transforming Raw data. But also show the importance of data cleaning and transformation.
What better ways to show case this than with an example.
Above is a screen shot of our example data set, it is the ABS Data Pack for NSW S02 Areas.
Firstly we can see that there are a lot of 0s in this table.
Secondly the column names are rather difficult to make sense of.
Only upon further research, do we notice that there are critical information hidden in the column names of this data set.
Data Hidden in the Column Name
Not only it is often frustrating, when a data source that have column names, that try too convey too much information that they become meaning less.
It is also often quite hard, to redo the column name and transform the data set at the same time.
Alteryx Transpose Tool, basic configurations
Key Columns: Select the columns (fields) to pivot the table around.
Data Columns: Select all the columns to carry through the analysis.
Missing Columns: Select the action that you want to take if columns are missing.
What does it mean for us ?
By using the Alteryx Transpose Tool, we can easily take the data from the column name use it to generate one or more new columns, with the data from the column name.
This is the Alteryx Transpose Tool configuration for us, we can see that the existing column names displayed in the windows for Key Columns and Data Columns configuration.
For our purpose, we need the SA2_MAINCODE_2016 as our Key Column and the rest of the column as our Data Columns.
This means that the result would have three columns, one column is the SA2_MAINCODE_2016.
The second column would be filled with the column name such as F_Vietnam_0_4, F_Vietnam_5_14, F_Vietnam_15_24 and so on.
Last column would be values from the original F_Vietnam_0_4, F_Vietnam_5_14, F_Vietnam_15_24 column.
Above is the partial screen shot of our transpose output. We can see it immediately make much more sense.
From the above image, we can see that, for each ID of SA2_MAINCODE_2016, that is a country name, and each country name has a range start from 0~4, 5~14, 15~24 and so on.
There is also a letter in front of the country name. This is the gender of the person.
This enables us to do things such as split the gender, country and age group from the Name column.
Therefore make this data set much more meaningful and useable, than in it’s original state.
In this blog we explored the use of Alteryx Transpose tool to transform data.
The Transpose tool is most effective in situations, where you need to get information from the column name, or change the perspective of the data set.