The first time I saw RegEx it looked like a bunch of jargon and nonsense. Even after the process was explained to me, I couldn’t understand why anyone would bother to learn such a complicated computer language. But after a couple of weeks of learning the tools and tricks of Alteryx, I understand how useful RegEx is for parsing data, and it isn’t as complicated as I first thought.

Determined to master the language, I began by trying to understand the basics.

. – any character except a new line.

Here you can see that any character has been selected
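To try the dot outside Alteryx, here is a minimal sketch using Python's built-in re module (my choice for illustration; Alteryx's own RegEx engine may differ in small details). The sample string is made up.

```python
import re

# Made-up sample text containing a newline in the middle
text = "a1 _\n?"

# '.' matches every single character except the newline
matched = re.findall(r".", text)
print(matched)
```

The newline is the only character missing from the result.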

\w \d \s \n – word character, digit, whitespace and new line. Adding a + after one of them collects one or more of those characters in a row. For example, below are \w+, \d+ and \s.
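The same three patterns can be tried in a short sketch, again assuming Python's re module rather than Alteryx itself. The sample string is invented for illustration.

```python
import re

# Invented sample text: two words, a number, and a newline
sample = "Order 66 shipped\nok"

words = re.findall(r"\w+", sample)   # runs of letters, digits or underscores
digits = re.findall(r"\d+", sample)  # runs of digits only
spaces = re.findall(r"\s", sample)   # single whitespace characters, newline included

print(words)
print(digits)
print(spaces)
```

Note that \w+ also picks up "66", because digits count as word characters, and that \s catches the newline as well as the ordinary spaces.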

(abc) – captures the characters you want as a group.

\. – matches the full stop character itself, rather than any character.
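A small sketch of brackets and the escaped full stop working together, assuming Python's re module. The strings are made up for the example.

```python
import re

# Made-up text with a decimal number in it
price = "Total: 12.50"

# Each pair of brackets captures one part; \. matches the literal full stop
match = re.search(r"(\d+)\.(\d+)", price)
whole, decimals = match.groups()
print(whole, decimals)

# Without the backslash, '.' accepts any character, so "12x50" also matches
loose = re.search(r"\d+.\d+", "12x50")
strict = re.search(r"\d+\.\d+", "12x50")
print(loose is not None, strict is None)
```

Escaping the full stop is what stops the pattern from accepting something like "12x50" as a number.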

Just from having a better understanding of these few notations, I was able to read and collect information from the txt file.

The RegEx below reads as follows:

Group one or more digits, up to a space. Then group a word, a space and another word, up to the new line. Then group a single digit, which is followed by multiple spaces. Then group one character (here the negative sign), one or more digits, a full stop and one or more digits, again followed by multiple spaces. Finally, group one or more digits, a full stop and one or more digits.
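As a rough sketch, that description could be written out like this in Python's re module. Both the pattern and the sample record are my own reconstruction from the wording above, not the exact expression or data used in the Alteryx workflow.

```python
import re

# Hypothetical pattern rebuilt from the description, one group per step
pattern = (
    r"(\d+)\s"          # one or more digits, up to a space
    r"(\w+\s\w+)\n"     # word + space + word, up to the new line
    r"(\d)\s+"          # a single digit, then multiple spaces
    r"(.\d+\.\d+)\s+"   # any character (the minus sign), digits, full stop, digits
    r"(\d+\.\d+)"       # digits, full stop, digits
)

# Invented two-line record shaped like the description
record = "1024 Jane Doe\n3   -12.75   48.20"

match = re.search(pattern, record)
fields = match.groups()
print(fields)
```

Each pair of brackets becomes one output field, which is the same idea as parsing into separate columns.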

 

Alice Rooney
Author: Alice Rooney