One of the great things about Alteryx is that, because of its drag-and-drop nature, you can quickly build workflows to test theories or prove a concept on the fly. Recently, I needed to come up with some fake data quickly to test a workflow, and I came across this method of creating data, which is quick and easy.

Why might you need to do this? I can think of a few reasons this might be useful:

- You may not have access to a final production data set and might just want to get a draft concept working before switching over to your master data.
- You may be training other people in Alteryx and not have quite the right sample data to hand to demonstrate something.
- You may need to create scaffold data within a workflow to supplement your actual data in order to solve a data modelling problem.
- You may want to share a piece of work but the real data is just too sensitive, so a fake placeholder dataset could help.

The concepts detailed below will be useful in many scenarios, not just generating fake data.

To demonstrate this, I’ll start by dragging two Text Input tools onto the canvas and filling them with some data, then I’ll connect them to an Append Fields tool, which pairs every row from one input with every row from the other.
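For readers who think in code, the append step can be sketched outside Alteryx as a Cartesian product. This is only an illustration in Python: the category and measure values, the field names, and the `append_fields` helper are all assumptions made up for the sketch, not the data or tooling from the original workflow.

```python
from itertools import product

# Hypothetical contents of the two Text Input tools (not from the original post).
categories = ["North", "South", "East"]
measures = ["Sales", "Profit"]


def append_fields(left, right):
    """Pair every value in `left` with every value in `right` (a Cartesian product)."""
    return [{"Category": c, "Measure": m} for c, m in product(left, right)]


rows = append_fields(categories, measures)
for row in rows:
    print(row)
```

Three categories and two measures give six rows, which is why this step is a handy way to build a scaffold from two short lists.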

Next, I need to generate the numbers. To do this, I simply use a Formula tool and the Random Integer function (RandInt). You may wonder why I used Random Integer rather than the random function (Rand). The random function gives you a random number between 0 and 1, so every value is a decimal. The Random Integer function returns whole numbers and lets you specify a maximum value, which gives you a little extra control over your values.
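The difference between the two functions can be illustrated with a rough Python analogue. The `rand` and `rand_int` helpers below are hypothetical stand-ins built on Python's `random` module, not Alteryx functions, and the maximum of 1000 is an arbitrary choice for the sketch.

```python
import random


def rand():
    """Stand-in for a 0-to-1 random function: a float from 0.0 up to (not including) 1.0."""
    return random.random()


def rand_int(maximum):
    """Stand-in for Random Integer: a whole number from 0 to `maximum` inclusive."""
    return random.randint(0, maximum)


random.seed(42)  # seeded only so the sketch is repeatable
decimals = [rand() for _ in range(5)]
integers = [rand_int(1000) for _ in range(5)]

print(decimals)  # five decimals between 0 and 1
print(integers)  # five whole numbers between 0 and 1000
```

With the integer version you choose the scale of the fake values directly, instead of multiplying and rounding a decimal yourself.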

A quick Cross Tab on the measure names then pivots each measure into its own column, creating a useful data table that I can work with.
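As a rough picture of what the cross tab does, here is a minimal Python sketch that pivots long rows into one row per group with a column per measure. The sample rows, the field names, and the `cross_tab` helper are assumptions for illustration. Unlike the real Cross Tab tool, this sketch does not aggregate: if a group and measure pair appears twice, the last value wins.

```python
from collections import defaultdict

# Hypothetical long-format rows: one row per category and measure.
long_rows = [
    {"Category": "North", "Measure": "Sales", "Value": 120},
    {"Category": "North", "Measure": "Profit", "Value": 30},
    {"Category": "South", "Measure": "Sales", "Value": 95},
    {"Category": "South", "Measure": "Profit", "Value": 12},
]


def cross_tab(rows, group_field, header_field, value_field):
    """Turn each distinct header value into its own column, one output row per group."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row[group_field]][row[header_field]] = row[value_field]
    return [{group_field: key, **columns} for key, columns in wide.items()]


table = cross_tab(long_rows, "Category", "Measure", "Value")
for row in table:
    print(row)
```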

While there are certainly ways to create much more advanced dummy data (see mockaroo.com), this is a great method for when you need something quick and uncomplicated.

 

 

Author: The Data School