What’s the difference between 23 and 1? If we’re talking about time, it’s 2.
A cyclical variable is a fancy name for a feature that repeats cyclically. I’ll write a future blog post on the importance of feature engineering, but in short, it aims to improve the accuracy of a predictive model. It’s vital to ensure a model interprets cyclical features correctly; otherwise, the two-hour difference between 23 and 1 would be interpreted as -22.
Types of Cyclical Variables
Wind direction, seasons, time and days (of a month, year, etc.) are all cyclical variables. A more general rule of thumb is that a variable is cyclical if it is cyclical in real life (wind direction), repeats (seasons) or has an important denominator (days of a month or year). Categorical and continuous cyclical features can be treated similarly.
The Cyclical Formula
Here’s the general formula to convert a variable into a set of cyclical features:
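A common way to write this encoding, assuming a is the raw value of the feature and max(a) is the length of its cycle as used in the examples in this post, is:

```latex
x_{\sin} = \sin\left(\frac{2\pi a}{\max(a)}\right), \qquad
x_{\cos} = \cos\left(\frac{2\pi a}{\max(a)}\right)
```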
Note that this will mean creating two features.
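As a minimal sketch of the idea, assuming the usual sine and cosine encoding, the snippet below maps hours on a 24-hour clock to two features. The function name `encode_cyclical` and its parameters are made up for illustration and don't come from any library.

```python
import math


def encode_cyclical(a, max_a):
    """Return the (sin, cos) pair for value `a` on a cycle of length `max_a`.

    Hypothetical helper for illustration only.
    """
    angle = 2 * math.pi * a / max_a
    return math.sin(angle), math.cos(angle)


# Hours on a 24-hour clock: 23, 1 and 3 are each two hours apart.
h23 = encode_cyclical(23, 24)
h1 = encode_cyclical(1, 24)
h3 = encode_cyclical(3, 24)

# In the encoded space, 23 -> 1 is exactly as far as 1 -> 3.
same_gap = math.isclose(math.dist(h23, h1), math.dist(h1, h3))
print(same_gap)  # True
```

Both features are needed: the sine alone gives the same value for two different points on the cycle (hours 1 and 11 on a 24-hour clock, for example), and the cosine tells them apart.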
Examples
We’ll look at two examples: how this formula works with analogue clocks, and then the more practical application of wind direction.
Let’s look at a 12-hour clock at precisely 6 o’clock (unlike the clock above).
Here max(a) is 12, as you can’t have a number higher than 12 on a typical clock. It’s important to note that 12 and 0 mark the same position (12:01 am is the same as 00:01), so 12 is also the length of the full cycle. This needs to be checked for other cyclical features: if the largest value isn’t equivalent to 0, add 1 to the max (see the wind example below).
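To make the clock example concrete, here is a small sketch that plugs 6 o’clock into the sine and cosine encoding with max(a) = 12. The variable names are only illustrative, and the rounding is there to hide floating-point noise.

```python
import math

hour, max_a = 6, 12
angle = 2 * math.pi * hour / max_a  # half a turn around the clock face

six_sin = round(math.sin(angle), 6)
six_cos = round(math.cos(angle), 6)
print(six_sin, six_cos)  # 0.0 -1.0

# 12 o'clock and 0 o'clock land on the same point, which is why max(a) is 12 here.
twelve = (math.sin(2 * math.pi * 12 / max_a), math.cos(2 * math.pi * 12 / max_a))
zero = (math.sin(0.0), math.cos(0.0))
same_point = math.isclose(math.dist(twelve, zero), 0.0, abs_tol=1e-9)
print(same_point)  # True
```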
Now let’s say my apartment has a north-facing window which I keep open, and I want a predictive model for how cold my house will be when I get home. My data shows that a northerly wind will make it cold, and so will a northeasterly, whereas a southerly wind will have no effect. If we number the eight compass directions 0 to 7, starting at north and going clockwise, a model will treat higher numbers as having less of an impact based on this data.
But what happens when there’s a northwesterly? Intuitively, this would also make my house cold. That’s why we need to capture the cyclical nature of the feature!
Northwesterly Wind
Northerly Wind
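The two cases can be checked with a short sketch. It assumes the eight directions are numbered 0 to 7 clockwise from north, so the largest label (7) is not the same as 0 and the cycle length is 7 + 1 = 8. The names `DIRECTIONS`, `PERIOD` and `encode_direction` are hypothetical.

```python
import math

# Assumed labelling: N=0, NE=1, ..., NW=7 (clockwise from north).
DIRECTIONS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
PERIOD = len(DIRECTIONS)  # 8, i.e. the largest label plus 1


def encode_direction(name):
    """Return the (sin, cos) pair for a compass direction (illustrative helper)."""
    index = DIRECTIONS.index(name)
    angle = 2 * math.pi * index / PERIOD
    return math.sin(angle), math.cos(angle)


north = encode_direction("N")
distances = {
    name: round(math.dist(north, encode_direction(name)), 3)
    for name in ("NW", "NE", "S")
}
print(distances)  # {'NW': 0.765, 'NE': 0.765, 'S': 2.0}
```

With the raw labels, northwesterly (7) looks as far from northerly (0) as it can get. After encoding, it sits exactly as close to north as northeasterly does, while southerly stays the furthest away.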
As always, thanks for taking the time to read my blog! If you have any comments or suggestions, or want to chat, feel free to connect with me on LinkedIn!
~ Ryan Edwards