To try to help my understanding of the difference between the “Cross tab” and “Transpose” tool in Alteryx I wanted to write about it. I believe trying to teach or help others understand what was learnt is the best way to learn yourself.
In university, I worked a lot with statistics and with data using a programme called Stata. In Stata I had options to reorganise data which the likes of Alteryx can do, but with less ease and customisability. The Tools were called a bit different, but it helped me understand what I had to do, for example, there was a transpose option: wide-to-long, and a transpose option: long-to-wide. Alteryx does not call them by these expressions, but it does show it through the tool’s symbol.
Cross Tab (Long-to-Wide)
This tool has 5 parts to it:
- Grouping data by value(s)
- Changing the column headers
- Value for the new columns
- Method for aggregation
- The separator used (If string data)
So what does it all mean? If we have a few columns, like this numeric data example, and we wanted to group ages together with how much the different categories made over the whole year, the Cross Tab tool can do this. First Select the “Age Range” as the data to group by, then select the column headers to “Category”, then select the values for the new column as “Average Monthly Sales”, lastly aggregate (or group) the values over the 12 months by sum to get the single-year value. The end result should look like this. What I love about this tool is that there is no “hard coding” so to say, and relies on the user selecting from a dropdown what they would like to do. We could have for example selected an average too.
If we were to have a string data example like here and wanted to see what the name of each president and what office they held for how long, the cross tab tool could also do this easily. First group by president number and office together, so that these key columns stay present later. Then change column headers to “Field type” with “Value” being the value for the new columns. Aggregate data by “First” using the comma separator: “,”. Name is automatically created here, so don’t be worried about where that came from. Also in this example, it did matter how we aggregated data, we could not have done it by concatenation (Joining it together), then we would get 2 names in one cell separated by a comma, if say a different vice-president served under the same president (See Aaron Burr, and George Clinton under Thomas Jefferson). The end result should look like this.
Of course, this data is pre-made with name, dates and values already there for us to transpose, in a later blog post I may show how to combine data sets so to show the true potential of the cross tab tool. In essence, this tool groups a data column and the values of another column by some sort of an aggregation.
This tool has 3 parts to it:
- Key Columns
- Data Columns
- What to do with missing fields
The Transpose tool is also very useful as Tableau, a data visualisation software used at the Data School, likes vertical data better, in the sense that it can be easier to set up and visualise. Using the same numeric data as before to keep things simple. If we select the “Product”, “Category”, “Age Range”, to keep as key columns, they will not be eliminated and used as values. Under the data columns section select the months. When the workflow is run we see “Product”, “Category”, “Age Range”, “Name”, and “Value” columns. We should see this. One could use another tool to sort this data, like sort or filter if you wanted sales above a certain threshold. With missing fields, I have selected the “Ignore option’ as missing fields could indicate a value of 0.
In essence, this tool groups column headers into one row and matches them to specified key columns you have chosen to keep.
I hope the examples helped to make the tools a bit easier to understand. With some datasets, years or time may be individual across multiple columns, whereas to analyse the data we may want it across rows.