What do data and coffee have in common? Quite a lot when you think about it! For example, they both:

  • Are not consumable in their raw state
  • Will require some altering 
  • Can be highly addictive

In our first week at The Data School Down Under we began learning and using Alteryx, a platform we can use to clean, prepare and alter data so it is ready to be consumed. Although it cannot yet do that with coffee, from what I have learnt of Alteryx so far, its applications can be related to the process of making coffee.

What is Alteryx?

It is difficult to describe exactly what Alteryx is. You can refer to The Data School’s David Ruhnau’s post for a more in-depth look at what Alteryx is and what it can do.

The (most basic) way I like to see Alteryx and its workflows is as a factory with assembly lines. At each tool, the data is processed in a way that makes it slightly more usable than at the last, eventually leaving us with cleaner data to work with.

How I relate data to coffee

As a former barista, I am using coffee as a rudimentary example to relate to Alteryx. Coffee begins its life as the coffee fruit but the fruit is not what we consume in our coffee beverages. Data is similar in that the data we obtain isn’t always quite ready to be used for analysis.

Let’s pretend this…

…is equal to this

In our coffee scenario, what we want from the coffee fruit is the coffee seed (aka the coffee bean). We get it by cleaning, pitting and drying the seeds. We can then roast the seeds/beans with a roaster, grind them with a grinder and extract the espresso shot with a coffee machine for consumption. Note that we require different machinery for each step of coffee extraction.

We grind this…   ⟶

…filter the coffee through this… ⟶

…to get this beautiful coffee

Similarly, in our data example we may only want specific data from a field. Here we want the words within the “genres” categories, such as ‘Animation’, ‘Comedy’ and ‘Drama’. This is where Alteryx comes in with its tools. At each step of data processing I used a different tool to isolate the words. The tools I used were select, parse, cleanse and multi-field formula, which together separate and obtain the desired strings (aka the words). The end result was much cleaner categories to work with.
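For readers who think in code, here is a rough Python sketch of the same idea: one small function per step, with each step handing slightly cleaner data to the next. This is not how Alteryx works internally and it is not the actual workflow from this post. The sample rows, the raw format of the “genres” field and every function name are assumptions made up for illustration, and the multi-field formula step is left out for brevity.

```python
import re

# Hypothetical raw rows. The real dataset is not reproduced in this post,
# so this messy "genres" format is an assumption for illustration only.
raw_rows = [
    {"title": "Film A", "budget": "100",
     "genres": "[{'id': 16, 'name': 'Animation'}, {'id': 35, 'name': ' Comedy '}]"},
    {"title": "Film B", "budget": "250",
     "genres": "[{'id': 18, 'name': 'Drama'}]"},
    {"title": "Film C", "budget": "75", "genres": "[]"},
]


def select_fields(row, keep):
    """Keep only the fields we need (loosely like a select step)."""
    return {name: row[name] for name in keep}


def parse_genres(row):
    """Pull out the text that follows each 'name' key (loosely like a parse step)."""
    names = re.findall(r"'name':\s*'([^']*)'", row["genres"])
    return {**row, "genres": names}


def cleanse(row):
    """Trim stray whitespace and drop empty values (loosely like a cleanse step)."""
    cleaned = [genre.strip() for genre in row["genres"] if genre.strip()]
    return {**row, "genres": cleaned}


def run_workflow(rows):
    """Send every row down the 'assembly line', one step after another."""
    results = []
    for row in rows:
        row = select_fields(row, ["title", "genres"])
        row = parse_genres(row)
        row = cleanse(row)
        results.append(row)
    return results


clean_rows = run_workflow(raw_rows)
for row in clean_rows:
    print(row["title"], row["genres"])
```

Just like the roaster, grinder and coffee machine, each function does one job, and the output of one step is the input of the next.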

From this data…  ⟶

… and using these Alteryx tools…  ⟶

… we can retrieve clearer data.

My thoughts on Alteryx

I thoroughly enjoy using Alteryx (admittedly I really enjoy documenting and organising the workflow) and each new workflow I create gives me a sense of accomplishment. I am finding that by completing the weekly challenges, or looking up the solutions to the challenges and retrying the challenges, I gain a clearer understanding of how the tools work.

This post might have taken a rudimentary view of both coffee and data, but I hope the simplification helps readers without a data background relate to data a bit more.

If you would like to contact me, please feel free to connect with and message me over at LinkedIn.

Thank you for reading and have a great day!

Andrew Ho