My troubled relationship with Regex (Regular Expressions) turned a corner this week after Peter Goldsworthy introduced me to the combination of Regex Pal and the Alteryx Regex tool.

Learning Regex is a bit like learning a second language, except you don’t speak it, you just think it. It comprises lots of .’s and +’s and /’s and (’s that, when put together, describe a pattern in a string of text. I’ve seen it used to great effect in various scenarios and tried it out a bunch of times, but I have never really grasped it. This week I saw how truly powerful it can be when combined with the Alteryx Regex tool in Parse mode.

Peter did some Alteryx teaching demos during The Data School Australia Week 1. He showed how we can use Regex Pal to test and refine a Regular Expression before pasting it into the Alteryx Regex tool. He also showed how putting brackets (parentheses) around part of a pattern creates what are called capture groups.

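Each capture group pulls out just the piece of text matched by the pattern inside its brackets. The rest of the expression still has to match, but it is not kept, so a single expression yields a series of discrete captured strings.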
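
To make the idea concrete, here is a minimal sketch in Python rather than Alteryx. The sample record and the pattern are made up for illustration (the demo data is not shown in this post), and Python's `re` module stands in for Regex Pal.

```python
import re

# Hypothetical sample record, invented for this sketch.
record = "Order 1042: Sydney, 2019-03-15"

# Three pairs of brackets, so three capture groups.
# The literal text ("Order ", ": " and ", ") must match but is not captured.
pattern = r"Order (\d+): ([A-Za-z ]+), (\d{4}-\d{2}-\d{2})"

match = re.search(pattern, record)
captures = match.groups()  # one string per capture group, in order
print(captures)  # ('1042', 'Sydney', '2019-03-15')
```

The surrounding literal text acts as the glue that positions each group, which is why only the three bracketed pieces come back.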

It is then quite easy to paste the Regex into the Alteryx Regex tool, set the tool’s output method to Parse, and assign the output field names.

Run the workflow and, ‘Hey Presto’, a cleansed set of data!
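
For anyone without Alteryx to hand, the paste, parse and name steps can be roughly imitated in Python. This is only an analogy under stated assumptions: the rows, the field names and the `parse_row` helper are all hypothetical, and giving non-matching rows empty values is a choice made for this sketch, not a description of how the Alteryx tool behaves internally.

```python
import re

# Hypothetical messy input rows, invented for this sketch.
rows = [
    "Order 1042: Sydney, 2019-03-15",
    "Order 1043: Melbourne, 2019-03-16",
    "not an order",
]

# Hypothetical output field names, one per capture group.
field_names = ["order_id", "city", "order_date"]

pattern = re.compile(r"Order (\d+): ([A-Za-z ]+), (\d{4}-\d{2}-\d{2})")

def parse_row(text):
    """Return a dict mapping each field name to its captured string."""
    match = pattern.search(text)
    if match is None:
        # Sketch-only choice: rows that do not match get None in every field.
        return dict.fromkeys(field_names)
    return dict(zip(field_names, match.groups()))

parsed = [parse_row(row) for row in rows]
for row in parsed:
    print(row)
```

Each capture group becomes one named column, in the order the brackets appear in the expression.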

 

Author: Craig Dewar