Dealing with time series is a huge part of data analytics. A date field that’s well formatted and consistent will typically transition seamlessly into whatever reporting tool you’re using. A date field that is consistent, but incorrectly formatted, can normally be fixed up with a little bit of extra time, annoyance, and regex. It’s likely, however, that you’ll come across a date field that is both inconsistent AND formatted incorrectly.
If you’re not a coder or regexpert, this is where Alteryx once again saves the day.
The DateTime tool
Alteryx’s DateTime tool is nothing short of supernatural. This is a tool that will parse your awful, ugly, date fields and return something truly beautiful: yyyy-MM-dd hh:mm:ss. That is, it will return ‘2019-05-17 23:39:00’ for 11:39 pm on Friday, the 17th of May, 2019 (which is the date and time that I’m writing this sentence).
In fact, a date field could contain the string “Friday, the 17th of May, 2019 @ 11:39 PM” and the DateTime tool would still parse it. It’s that good.
Configuring the DateTime tool to parse strings
Because it’s particularly awful, I’ll use the string mentioned above in my worked example, but any format can be parsed.
Let’s start with an ugly text input containing a variety of dates and times:
For the DateTime tool to work it needs to be configured for the exact format that you’ll be passing it. After connecting the DateTime tool to your input, the configuration window pops up like so:
The labelled option/info boxes (a-f) are as follows:
a) Select whether the tool is converting from a string to a date or vice versa.
b) Select the string (or date) field to be converted and enter the name of the output field.
c) Select the format that exactly matches the field that will be converted. (In our case we need to enter a custom date so ‘custom’ is selected).
d) Specify the custom date format of the input field.
e) An example of what your input field should look like will appear once you’ve indicated a format. This is really handy for getting it right before running the workflow.
f) An example of the output is shown. It will either be yyyy-MM-dd or yyyy-MM-dd hh:mm:ss depending on whether your input is a date field or date-time field.
When specifying a custom format for the input (d), you can normally choose one from the list and build on it. In our case, we have:
day, the dd** of Month, yyyy @ %I:mm %p
The two *’s act as a wildcard stipulating that two characters that vary (th, st, nd, or rd) will follow the date. The %I is used instead of hh in our custom date because we have 12-hour time, and the %p stipulates that what follows our time is an uppercase ‘AM’ or ‘PM’. Lowercase ‘am’ or ‘pm’ would require an uppercase %P (not the most intuitive syntax, but it works well).
After entering the input format and hitting Run, we get a squeaky clean date-time field:
Notice that even though our day of week abbreviations weren’t consistent (Thur/Thurs, Fri/Friday), the DateTime tool still parsed them correctly. This will also be the case if you have inconsistency between 3-letter month abbreviations and the full month name (Sep/September). Furthermore, you can see above that our string contained multiple time formats because a zero preceded a single digit in records 1, 2 and 6 while it’s absent in records 2-5.
This flexibility is an added benefit to an already outstanding Alteryx tool. Using a single DateTime tool does, however, have its limits – what if our date field contained formats that were wildly different? Read on for a worked example.
Parsing multiple date formats in one field
Building on the previous exercise, let’s add a few more date formats into the mix:
Now we have our previous 6 entries in one format, plus 4 additional formats to parse. This can be done in a fairly easy, albeit unsophisticated, way by dragging 4 extra DateTime tools to the canvas and configuring a different format for each one:
Unioning these and then filtering out the nulls will give the desired result. The caveat to this approach is that you must be aware of all the formats in your date field. Furthermore, any kind of repetitive tool dragging in Alteryx is generally not advised as there’s probably an easier way.
Hopefully though, you’ve gathered an idea about how flexible and convenient the DateTime tool is for parsing messy strings and outputting usable date fields!