Looking for an Alteryx Beginners Guide? Have a read below for some tips and tricks on Alteryx beginner use!
I have data to explore and fix!
Palette: Aqua Tool: Input data
Scenario: You have a dataset and need to load it onto Alteryx.
General: Data sources include files (eg an excel spreadsheet), websites (eg a table within a website) and databases (eg the data school server).
Loading a file: simply drag and drop the file or find it’s directory under the data connections tab
Loading a database: go to saved sources under the data connections tab, manage connections and select “add connection” (eg Microsoft SQL). From there fill in the blanks, using your provided login and host information, but make sure to “test” the connection before selecting the database. Select your new database from your saved list to connect to it.
Tip: You can select multiple files at once, or an entire folder/files with similar names by using the wildcard (*) in the directory.
SOOoo.. did it work??
Palette: Aqua Tool: Browse
Scenario: The data has been inputted and you need to explore it for issues, OR a workflow has been run, and you need to see how it turned out.
General: This tool is great for eyeballing your data at any point in the workflow, by selecting a column header you can see what it’s field contain, and if there are issues (eg some, but not all, fields have a capital or punctuation marks). You can see the quality of the data (ie how much is missing) and can help direct you to your next tool to be used in the workflow.
Tip: Ctrl Shft B is a shortcut to add in a browse tool. This is more useful than the “output” tab located in each tool, as it shows ALL the fields.
All the columns, all the weird names and that’s not actually a string.
Palette: Blue Tool: Select (or Auto Field)
Scenario: You have unnecessary columns, need to rename headers or change a few field types (eg strings to integers).
Tip: If you have numerous columns that require a change of data type, use Auto Field. But if you have only a few and still need to perform the other tasks (ie removing columns and changing header names) then use the Select tool.
I don’t really care about this part of the data..
Palette: Blue Tool: Filter
Scenario: You only need to focus on a particular section of the data, based on a particular field category.
General: This tool has many options for filtering data. You can fill in the basic filter options, which are based on IF THEN functions. Or you can formulate your own filter(s). The resulting data is based on TRUE or FALSE outputs, meaning you can work on both outputs at once or only focus on the relevant dataset as needed.
Who’s the fairest of them all?
Palette: Blue Tool: Sort
Scenario: Your data needs to be sorted by particular fields.
Tip: for those like me who get these mixed up… Ascending is going UP (or small to large) Descending is going DOWN (or large to small).
I need to do this thing, but it’s not a standard tool.
Palette: Blue Tool: Formula/ Multi-field Formula
Scenario: You may have a particular action you need to perform on a field or add in a new column based on an interaction among other fields.
General: this tool can achieve so many things, it’s somewhat impossible to summarise here!
Examples: Joining two columns together to form a single field: “[Field1] + [Field2] outputs [Field1Field2] OR Categorising columns as “less” or “more” than another column: “IF [Field1] < [Field2] THEN “Less” ELSE “More” ENDIF”.
Ahh yes, multiple files from the same data set because it was too large? Wonderful..
Palette: Purple Tool: Union
Scenario: When data was too large for a single file and has been split into separate files OR you have manipulated the data into different workflows and they need to be put back together.
General: this tool joins ROWS together, resulting in a LONGER dataset. Headers can be arranged manually to match between datasets if they are not aligned, showing those that do not line up in the process.
Tip: Work from left to right when manually moving column headers, as they push columns to the right disrupting already corrected columns..
Get it together, Carol.
Palette: Purple Tool: Join/Join Multiple
Scenario: You have multiple datasets or files that need to be joined together, matched by at least one column.
General: This tool joins COLUMNS together, resulting in a WIDER dataset. There must be a common column identifier between the datasets for them to match up (unless you select join based on record position). The output consists of those columns that JOINED (J, middle) and the left over right (R, top) and left (L, bottom) columns that did not join.
Copy and Paste THIS to all the rows.
Palette: Purple Tool: Append Fields
Scenario: When you need to provide a column with more information, but for every row.
General: You may have split data at some point or are joining new data and it needs to be added across multiple rows.
Tip: We are kind of joining AND unioning at the same time, so we are WIDENING and LENGTHENING the data. Every current row will have the new column(s) added, so no matching ID/column header is required like joins/unions.
Do I really need that as a column??
Palette: Tangerine Tool: Transpose
Scenario: You have columns that could be better analysed as rows (particularly for Tableau, I’m looking at you survey data).
General: This tool is useful for moving columns into rows. If you want to organise your data more logically, arrange the potential axis better for further analyses (such as graphing) or arrange a matrix as a table, this is the tool to use.
Tip: We are changing the data from WIDE to LONG.
These rows don’t make any sense!
Palette: Tangerine Tool: Cross-Tab
Scenario: I want data to be viewed in a single row for particular columns (particular for API workflows, I’m looking at you JSON parsing).
General: This tool allows us to move data into columns that are currently in rows. It gives us a single row view of selected information (“values for new columns” option) according to particular columns of our choosing (the “group data by these values” box).
Tip: the columns you DON’T want to change (ie the current headers) are the ones selected in the “group by these values” box. Everything else will be calculated as per your selected parameters.
TOOO many rows!!
Palette: Tangerine Tool: Summarize
Scenario: You have a very high level of detail of your data and too many rows as a result.
General: This tool helps us perform basic calculations on columns, reduce the number of rows we have by changing the data level of detail (ie grouping data based on certain qualities, called aggregating) and many more functions.
Tip: If you want to aggregate data, but still include all its other columns, “group by” the highest level of detail and all others until its new lowest level of detail you have chosen.
That’s not a date, THAT’S a date.
Palette: Light Green Tool: DateTime
Scenario: Converting strings of data to date types.
General: Data is not usually in the correct format for identification as a date/time in Alteryx. So we need to correctly format and change it using this tool. Using this parsing tool is just one way of achieving this.
Palette: Light Green Tool: Text to Columns
Scenario: Separating data within the same field into multiple columns/rows.
General: You need to choose a single delimiter to separate your data and the number of columns/rows you want as an output. You can also select to delete any extra data or keep it concatenated. Unlikely as the name suggests, you can split into rows OR columns.
I hope this guide has provided some insight into beginner use of Alteryx! If you found my guide helpful, you might like my other blogs, check them out HERE.