Overview on Alteryx

As you can tell from the title already, we started training in Alteryx. I was a little worried about my training especially when it came to the certification, and weather or not I could actually understand using the program and the tools as required. But by the end of the week, I like Alteryx…a lot actually. There is something about the puzzle-box nature of it that is somewhat addictive, where you need the data to look a particular way, and then you go and use the multitude of tools to get it to look in that particular way.

Alteryx Challenges

Quite possibly the highlight of the week were the Alteryx challenges throughout the week, and the Friday which was out presentation day (which we will get to). These challenges can be quite frustrating due to the nature of questions. And at the same time, also rewarding when you get the solution that matches the target output. The benefit of doing these challenges is that it forces you to use the tools in very different ways. Furthermore, it reinforces what each tool does and how they can be chained together. All of this culminated for our challenge on Friday.

Alteryx Friday

The day finally arrived where we had to present, something. We were unsure as to what we needed to do, and was pleasantly surprised to find out that it was to do Alteryx challenges. The task was to pick an easy and intermediate Alteryx challenge, solve to get the target output, and present how the solution was reached. I picked challenge #21, which was just a simple date-formatting challenge. The difficult challenge for me was challenge 33. This challenge was essentially reshaping Nielsen reporting data from a 733-rows to 30-rows. Let’s have an overview as to how they were done.

Challenge #21

This challenge was a simple date formatting challenge. So, this is the start data (see Figure 1):

Figure 1: Starting data.

And the end goal is this (see Figure 2):

Figure 2: Expected output.

So, let’s have a look at how it was solved. I came up with 2-solutions. The first solution was the one I presented to my cohort and the second solution was me trying to figure out if there was an easier way of doing it. Let’s start with the first solution:

The first step after inputting the data was to parse the ‘Date’ column using the ‘Text To Column’ tool to separate month and the year, and using a formula tool to fill in the null values in the year column, as seen in Figure 3:

Figure 3: Parsing the data, and using a formula tool to fill null values.

Then the needs to be split in two. One half will be for the year ’07’, and the other will be for the year ’08’, as seen in Figure 4:

Figure 4: Splitting the data into two.

Furthermore, this was done to create a column with the name of the month. It is not possible to assign a month with only the first letter alone, therefore, a record ID was assigned to each record then the ‘Formula’ tool was used to convert the number into a month. Finally, the ‘Union’ tool was used to connect the two streams of data as seen in Figure 5:

Figure 5: Union tool.

The end result is the target output outlined in Figure 2. There is an alternate solution which is much more simplified. The process is the same up to using the multi-row formula tool. It differs in the sense that the data does not have to be split into two. As seen in Figure 6, the tile tool allows the data to be evenly split, and as a result a number is attached to each year, which can then be used to assign a month.

Figure 6: Tile tool.

Challenge #33

The intermediate challenge was significantly more challenging, it consisted of reshaping Nielsen reporting data, into something more readable. Figure 7, is an illustration of what the input data looked like:

Figure 7: Input data.

The data had over 750 rows of data however, the vast majority of it was useless. So, the data needs to be in a readable state before any kind of analysis can be done. The first step was to start removing rows/columns that were null, had no useful information, and general clean up. This can be seen in Figure 8.

Figure 8: Cleaning up data

The next step was to filter the data into two streams as was done in challenge 21. So, one stream would be date, and schedule time while the other would be viewership information based on different demographics, as illustrated in Figure 9:

Figure 9: Filtering data into two streams.

The final step was to join the two halves and then do a final clean-up. However, before a join can be done a common filed was necessary, and neither half of the data had a common field. As a result, the ‘Record ID’ tool was used to give a unique record to each and then joined on that record which can be seen in Figure 10:

Figure 10: Joining the two halves of the data.

Finally, the data was narrowed done to just 30-rows, as shown in Figure 11:

Figure 11: The final data output.

Conclusion

Our first foray into Alteryx was fun and challenging. Will I miss it for the coming week? Yes. And I am looking forward to more Alteryx in the coming week. The one upside is that we can keep doing Alteryx challenges to keep our skills sharpened and learn new skills as well.

I will see you next week, as we head into more advanced Tableau.

 

The Data School
Author: The Data School