The first time I saw RegEx, it looked like a bunch of jargon and nonsense. Even after the process was explained to me, I couldn’t understand why anyone would bother to learn such a complicated computer language. But after a couple of weeks of learning the tools and tricks of Alteryx, I understand how RegEx is a useful tool for parsing data, and that it isn’t as complicated as I first thought.
Determined to master the language, I began by trying to understand the basics.
. – any character except a new line.
Here you can see that any character has been selected
\w \d \s \n – word character, digit, whitespace, and new line. Adding a + after any of these collects one or more of those characters in a row. For example, below are \w+, \d+ and \s
(abc) – captures the characters you want as a group.
\. – gives the literal full stop character
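To try these notations outside Alteryx, here is a minimal sketch using Python's built-in re module as a stand-in. The sample text is made up for illustration, and Alteryx's RegEx tool may differ from Python in some details, so treat this as an approximation rather than a description of how Alteryx behaves.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "Order 42 shipped.\nTotal 3.50"

# .  matches any character except a new line
any_chars = re.findall(r".", "a1 \n")

# \w+ and \d+ collect runs of word characters and digits
words = re.findall(r"\w+", text)
digits = re.findall(r"\d+", text)

# \s matches a single whitespace character (spaces and new lines included)
spaces = re.findall(r"\s", text)

# ( ) captures the part you want; \. matches a literal full stop
total = re.search(r"Total (\d+\.\d+)", text).group(1)
full_stops = re.findall(r"\.", text)

print(any_chars, words, digits, spaces, total, full_stops)
```

Note that \w+ splits "3.50" into "3" and "50", because the full stop is not a word character. That is why the escaped \. is needed when a decimal number should be captured as one piece.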
Just from having a better understanding of these few notations, I was able to read and collect information from the txt file.
The Regex below reads as follows:
Group one or more digits, up until a space. Then group a word, a space, and another word, up until a new line. Then group a single digit, which is followed by multiple spaces. Then group any character (here, the negative sign), one or more digits, a full stop, and one or more digits, again followed by multiple spaces. Finally, group one or more digits, a full stop, and one or more digits.
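As a rough sketch, that description could be written out as the pattern below. This is my reconstruction from the wording above, not necessarily the exact expression used in the workflow, and the sample record is hypothetical. Python's re module is again used as a stand-in for the Alteryx RegEx tool.

```python
import re

# Pattern reconstructed from the description (an assumption, not the original).
pattern = re.compile(
    r"(\d+)\s"          # group 1: one or more digits, then a space
    r"(\w+\s\w+)\n"     # group 2: word + space + word, then a new line
    r"(\d)\s+"          # group 3: a single digit, then multiple spaces
    r"(.\d+\.\d+)\s+"   # group 4: any character (the minus sign) + digits + full stop + digits
    r"(\d+\.\d+)"       # group 5: digits + full stop + digits
)

# Hypothetical two-line record, invented to fit the description.
sample = "1001 John Smith\n5   -12.50   3.75"

match = pattern.search(sample)
groups = match.groups() if match else ()
print(groups)
```

Each pair of brackets becomes one output field, so this sketch yields five values per record. Because the leading . in the fourth group accepts any character, a stricter pattern could use -? there instead if only an optional minus sign should be allowed.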