At first glance, Alteryx can seem extremely intimidating but I can tell you that it only takes a short amount of time to understand the basics and the rest will come with time.  While there are numerous tools that allow you to do almost anything you need to the data, this blog will go through 5 tools that you can use to format data so it is suitable for whatever you need.

First, let’s start with the page you will see when you open Alteryx:

The focus will be on the canvas, the big empty box where it says ‘Drop tools here’.  This is where you will drop the tools you want to use for your workflow and Alteryx will fill the rest of the page with relevant information.

 

Input Data

The first tool you will need is the input data tool which can be seen in the favorites bar.   By dragging this onto the canvas you will be able to input data using the ‘Set Up a Connection’ button that now appears on the left.

Alteryx allows you to connect to a data source by 3 different methods:

  • Using a file on a local device.
  • Connecting to an external data source (e.g. Microsoft SQL Server, Oracle, Snowflake, etc.)
  • Connecting to a server.

You can also drag a file directly onto the Alteryx canvas and it will create an Input Data tool with the selected data file connected.

 

Join

One of the most useful and easiest tools to use is the Join tool, it saves a lot of time and hassle and has many different options for joining two or more data sets together.  In the example seen throughout this blog I am using the famous ‘Superstore’ dataset and the screenshot below shows the method of dragging two datasets onto the join tool tabs.

There are 3 outputs:

  • The Left Output: All rows from the left table that did not get included in the join.
  • The Join Output: All rows that were able to be joined together.
  • The Right Output: All rows from the right table that did not get included in the join.

You can drag a tool to connect to any of those outputs to further format or join the outputs of the join tool.  This will be seen below when we attach our next tool to the join output.  Within the join tool on the left window you will be able to choose what column to join the tables on and below that you can select which rows will come through in the join.

 

Filter

Another tool with two outputs is the Filter tool which allows you to filter the entire dataset using a formula which can be as specific as you like.  When you drag the output of the join tool onto the Filter tool a new box will appear in the left side of the screen where you can choose your filter.  In this case we have chosen to only see records for Orders in the United States.

To see only fields that pass the filter you will need to drag your next tool onto the T (True) tab on the right side of the Filter tool.  To see records that did not pass the filter you will need to drag your tool onto the F (False) tab.

 

Browse

If you just wanted to see the outputs for each tab of the Filter tool you can drag a browse tool to each and this will show you both a summary and some insights into the data, along with the data itself in the Results window.  The screenshot below shows what will appear in the left hand side of the screen, some insights and analytics into the data, all performed automatically by the Browse tool.  It is extremely useful for a quick look into the data and to make sure there are no errors.

In the Results window at the bottom of the screen, you will also see the data in it’s raw form, it will allow you to investigate yourself and ensure the data has not changed in anyways you did not want it to.

 

Output Data

Finally, now that we have formatted and checked our data we will want to export it or send it back to update the original data source.  To do this we will need to use the Output Date tool.  Drag this tool onto the output of the Filter tool or the last tool in your workflow and it will allow you to export the data.  Once connected the left side of the screen will change to a similar screen as the Input tool.  Here you can set again ‘Set Up a Connection’ to choose where you wish to send the data.  The same three options are available as the input tool, if it is connected to a server or a database it will send the update to the source.  Depending on the file type you will see different options but when choosing an excel file you will see this:

It is important to ensure that anytime you select overwrite instead of ‘Create a New Sheet’ or the similar option depending on your data type that you are sure you want to do so.

 

Conclusion

Your final workflow should look something like this.  If you want to add any other tools into the workflow just drag and drop them between the other tools so that they connect to the data source.  To make sure you’re able to keep track of everything annotating each tool or using comments and containers can be useful to maintain a tidy workflow.

With the five tools used throughout this blog you will be equipped to perform basic data analysis and formatting and create an environment to improve your Alteryx skills.

The data used in this example can be found at: https://community.tableau.com/s/question/0D54T00000CWeX8SAL/sample-superstore-sales-excelxls

Good luck in your future Alteryx endeavors, the possibilities are endless!

Mikael Nuutinen
Author: Mikael Nuutinen