Select Page

As someone who loves precision and completeness, I was a little surprised recently when I came across scenarios where I wanted to introduce some randomness into my data world.

In my previous blog I discussed creating mock data, and in a recent project, I needed just that. I was working on a proof-of-concept piece. I had start and end values but needed to fill out the data in between in my dataset. It was a perfect place for some clever use of random numbers.

In Alteryx, the RAND() function returns a random number greater than or equal to 0 and less than 1. There is also a similar function, RandInt(n). This returns a random integer between 0 and the specified parameter. For example, RandInt(100) returns a random whole number between 0 and 100, e.g. 79.

## An example of RAND() in action

In my case, I wanted to create a relatively linear relationship between my two variables. Because I needed that general relationship, I couldn’t just fill everything out with random numbers! However, what I could do, is use RAND() to my advantage and add some variability to a linear relationship.

So, after making a mock data set with a linear trend, I dragged a formula tool onto my Alteryx workflow and got to work. I decided I wanted my data points to be within +/- 5% of their value in a perfectly linear trend. To achieve that, I first needed to create a random multiplier within that range.

My formula to achieve this was ((RAND() – 0.5)/10).

(RAND() – 0.5) this gave me a number from -0.5 – 0.49.

Dividing that by 10 gave me my target, with a result between -0.050 – 0.049 (technically -5% to +4.9%, but I could live with that!).

From there I simply used this to calculate my multiplier using:

Multiplier = (1 + ((RAND() – 0.5)/10))

New Value = Old Value * Multiplier

And that’s it. My perfectly linear trend now had some variability and was prefect for my proof-of-concept work!

Random functions can be added into complex formulas to create some really interesting relationships, depending on your use case. They’re also excellent if you’re working with statistics or concept scenarios with known outcome probabilities (coin flips, for example).