As the title and the image above suggest, this entry is about using Tableau to visualise maths. And yes, I know that for most people maths is not the most exciting topic. But before you go running for the hills, consider sticking around – visualising mathematical patterns can be beautiful and, in Tableau, challenging. Before I delve into the visualising itself, I must first introduce the mathematical problem at hand.

The Collatz Conjecture

The topic which I recently visualised is the Collatz conjecture. It is, in essence, a simple mathematical problem. Choose any positive whole number. If it is even, divide it by two. If it is odd, multiply it by three and then add one. Then repeat. The conjecture is that, given enough repetitions, every starting number will eventually reach one, at which point it will be stuck in a loop. One is odd, so 1 x 3 + 1 = 4. Four is even, so 4 / 2 = 2. 2 / 2 = 1. And we are at 1 again. This process is what gives the problem another common name – simply 3n + 1 (hence the above image).
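For readers who like to see a rule as code, here is a minimal Python sketch of the procedure just described. The function names collatz_step and collatz_sequence are illustrative only and do not come from the viz.

```python
def collatz_step(n: int) -> int:
    """Apply one step of the rule: halve an even number, map an odd n to 3n + 1."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def collatz_sequence(start: int) -> list[int]:
    """Return every value visited from `start` until 1 is first reached."""
    if start < 1:
        raise ValueError("start must be a positive whole number")
    values = [start]
    while values[-1] != 1:
        values.append(collatz_step(values[-1]))
    return values


print(collatz_sequence(6))  # [6, 3, 10, 5, 16, 8, 4, 2, 1]
```

The loop stops the first time it sees 1. Left running, it would cycle through 4, 2, 1 forever, which is the loop described above.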

Sounds simple, right? In some ways it is, and in some ways it isn’t. This simple algorithm has produced one of mathematics’ most famous unsolved problems. I won’t go into any more detail on the problem here – check out the viz linked above for more information. (I am also vastly underqualified to explain what is perhaps one of the most difficult maths problems ever conceived.)

Using Tableau to Generate Data

This challenge was interesting, because at least for one chart on my viz, I used Tableau to not just visualise, but also entirely generate the data. Because the visualisation is just of an algorithm, I can use Tableau to perform the algorithm and create the data! Below is the chart that I will explain.

Now, when I say the data is completely generated in Tableau, there is one caveat. Tableau requires a data source to create a visualisation. I have used a dataset I named ‘100000rows.hyper’ – it is just what its name suggests. One hundred thousand rows, each with an index, created in the Alteryx Generate Rows tool. You can use almost any data source as long as it has unique rows. A dataset of only ones won’t work, as the table calculations we will use require a distinct dimension. However, apart from this requirement anything will do!
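If you do not have Alteryx to hand, a comparable scaffold file could be produced with a few lines of Python. This is only a sketch of an alternative, not the method used for the viz: the column name "index" and the helper write_index_rows are assumptions, and it writes CSV rather than a .hyper extract.

```python
import csv
import io


def write_index_rows(handle, n_rows: int) -> None:
    """Write a single 'index' column holding 1..n_rows, one unique value per row."""
    writer = csv.writer(handle)
    writer.writerow(["index"])  # assumed column name
    for i in range(1, n_rows + 1):
        writer.writerow([i])


# Demonstrate on an in-memory buffer with just five rows.
buffer = io.StringIO()
write_index_rows(buffer, 5)
print(buffer.getvalue())
```

To create a real file, pass a handle opened with open(path, "w", newline="") and ask for 100000 rows. Tableau can connect to the resulting CSV directly.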

One Big Formula

Now to show how to actually create the chart. Most of the work is done in one big formula (as the heading suggests). This may seem intimidating (or it may not) but the work that is actually being done is very simple. Without further ado, here is the formula:

if INDEX()=1 then [CI]
elseif PREVIOUS_VALUE([CI]) = 1 then null
elseif PREVIOUS_VALUE([CI])%2 = 1 then PREVIOUS_VALUE([CI]) * 3 + 1
else PREVIOUS_VALUE([CI]) / 2
end

(Pieced together from the fragments explained below; [CI] is shorthand for the Collatz Initializer parameter.)

To break this formula down, I will start from the middle. First, I created a parameter called Collatz Initializer – this is an integer starting value (its default can be anything you want). I will refer to this as [CI] for the rest of the blog when I type out code, just to save some space. Right in the middle of the formula is the section:

(else)if PREVIOUS_VALUE([CI])%2 = 1
then PREVIOUS_VALUE([CI]) * 3 + 1
else PREVIOUS_VALUE([CI]) / 2

This section is relatively simple. The PREVIOUS_VALUE() table calculation returns the value that this same calculation produced on the previous row. Its argument is the value to use when there is no previous row, i.e. when it is looking at the first row. In this case we use the Collatz Initializer. So, it looks at the previous row, unless there is no previous row, in which case it looks at the Collatz Initializer. % is the modulo operator. It divides by the second number and returns the remainder. So in other words, the first line of code says: IF the previous value divided by two has a remainder of one. Or, more simply, IF the previous value is odd. The remaining lines are simple. If it is odd, multiply by three and add one. If it is even, divide by two. It is these three lines of code that do the hard work, but there are a few technical details left to solve.
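To make the row-by-row behaviour concrete, here is a rough Python emulation of a calculation built only from these three lines. The helper previous_value_scan is hypothetical: it mimics PREVIOUS_VALUE() by carrying each row's result forward to the next row, and it uses whole-number division where Tableau's / would return a decimal.

```python
def previous_value_scan(seed, step, n_rows):
    """Emulate a calculation of the form step(PREVIOUS_VALUE(seed)) over n_rows rows."""
    results = []
    previous = seed  # on the first row there is no previous row, so the seed is used
    for _ in range(n_rows):
        current = step(previous)
        results.append(current)
        previous = current  # the next row sees this row's result
    return results


def collatz_step(n):
    """The three lines from the formula: odd -> 3n + 1, even -> n / 2."""
    return n * 3 + 1 if n % 2 == 1 else n // 2


print(previous_value_scan(6, collatz_step, 10))
# [3, 10, 5, 16, 8, 4, 2, 1, 4, 2]
```

With a starting value of 6, the output begins at 3 rather than 6, and it carries on into 4, 2 after reaching 1. Both quirks are the technical details dealt with next.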

Some Technical Details

Firstly, if you remember from earlier in the blog, when numbers from the Collatz conjecture reach 1, they get stuck in the 1-4-2-1 loop. This calculation also gets stuck in this loop, and performs it for the remaining rows in the data (in my case all one hundred thousand of them). For this reason we add the condition:

(else) if PREVIOUS_VALUE([CI]) = 1 then null

So, if the algorithm has reached 1, set to null. A null value here will terminate the algorithm and cause all further iterations to also return null. In the chart, these nulls will not be visualised.

The second problem is that this algorithm currently starts with the second number in the sequence, not the initializer itself. To remedy this, I use:

if INDEX()=1 then [CI]

So, if it is the first row of data, use the Collatz Initializer instead of performing the Collatz algorithm as described above.
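Putting the pieces together, the whole calculation can be imitated in Python as below. This is a sketch under assumptions rather than a translation Tableau would accept: collatz_column is a made-up name, None stands in for Tableau's null, and the explicit None check reproduces the way a null previous value keeps producing nulls.

```python
from typing import Optional


def collatz_column(initializer: int, n_rows: int) -> list[Optional[int]]:
    """Emulate the finished calculated field across n_rows rows of scaffold data."""
    column: list[Optional[int]] = []
    previous: Optional[int] = initializer
    for index in range(1, n_rows + 1):
        if index == 1:
            current = initializer       # first row shows the initializer itself
        elif previous is None or previous == 1:
            current = None              # reached 1 (or already null): stop here
        elif previous % 2 == 1:
            current = previous * 3 + 1  # odd: multiply by three and add one
        else:
            current = previous // 2     # even: divide by two
        column.append(current)
        previous = current
    return column


print(collatz_column(6, 12))
# [6, 3, 10, 5, 16, 8, 4, 2, 1, None, None, None]
```

The sequence now starts with the initializer, ends at 1, and every remaining row is empty, which is why a hundred thousand scaffold rows still give a short, clean line.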

Building the Chart

With these steps, building out the Collatz conjecture viz is very simple. Just drag the Collatz calculation to Rows. Make a new calculated field equal to INDEX() and drag it to Columns. Drag any dimension to Detail on your Marks card. Set both the Collatz calculation and your INDEX() calculation to compute using your chosen dimension. And that is it! If you show your parameter, you will be able to change its value and watch the line chart update.

A More Complex Visualisation

The above chart is a more complex application of the simple process used in the Collatz Initializer chart. Instead of one Collatz thread, this chart shows many. Each strand in the image is a branch of a tree, which starts at one and shows the branching paths that all numbers (so far) take to get there. The trick to the above chart is that whenever a number is even the thread rotates 8 degrees anticlockwise, and whenever it is odd the thread rotates 8 degrees clockwise. The result is a natural-looking structure that explores the complex paths that numbers take to finally reach 1.

Drawing this in Tableau requires some trigonometry and an algorithm that is significantly more complex than the simple application I explained earlier. While I initially tried to implement this purely in Tableau, the algorithm proved a little too tricky for me to build out – generating data from nothing isn’t really Tableau’s intended purpose. Instead, I generated most of the data in Python, and did a conversion from polar to Cartesian coordinates in Tableau – this kind of transformation is covered in more detail in my previous blog. If you are interested in how this data was generated, you can find the code and a results CSV here.
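As a rough illustration of the idea, and not the code behind the viz, the sketch below traces a single thread in Python. Several things are assumptions: the thread is drawn outward from 1, it starts at the origin pointing straight up, every segment has length 1, and the polar-to-Cartesian step (x = r cos θ, y = r sin θ) is done in Python here rather than in Tableau. Only the 8 degree turn comes from the description above.

```python
import math

TURN_DEGREES = 8.0  # turn size quoted in the post


def collatz_sequence(start):
    """Values visited from `start` until 1 is first reached."""
    values = [start]
    while values[-1] != 1:
        n = values[-1]
        values.append(n // 2 if n % 2 == 0 else 3 * n + 1)
    return values


def thread_points(start, step_length=1.0):
    """Return (x, y) points for one thread, drawn from 1 out to `start`."""
    x, y, heading = 0.0, 0.0, 90.0  # assumed: begin at the origin, pointing up
    points = [(x, y)]
    for value in reversed(collatz_sequence(start)):
        # Even values turn anticlockwise, odd values turn clockwise.
        heading += TURN_DEGREES if value % 2 == 0 else -TURN_DEGREES
        theta = math.radians(heading)
        # Polar (step_length, theta) converted to a Cartesian offset.
        x += step_length * math.cos(theta)
        y += step_length * math.sin(theta)
        points.append((x, y))
    return points


for px, py in thread_points(6):
    print(round(px, 3), round(py, 3))
```

Running thread_points over a range of starting numbers and plotting each result as its own line gives many overlapping threads, since sequences that share a tail near 1 retrace the same first segments.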

And that’s it for this blog! For everyone who stuck around despite my warning of the presence of maths, I hope it was worth it – for me, this is an obvious and beautiful pairing.

The Data School
Author: The Data School