It has now been four months since I first learned that there was a data visualisation tool called Tableau. Like a kid who has just been given the newest gadget on the market, I started exploring what I could do with it, and it immediately blew my mind.
‘Just drag the variables to shelves to quickly build any visualisation’, someone told me.
‘What?! Just like that? Who was the genius that thought of this?’, I asked.
Yes, Tableau is very impressive at first sight. But like any new thing, I thought it would quickly become just another dull tool. However, the exact opposite happened – here I am, four months on, and Tableau can still surprise me almost on a daily basis.
Some other more experienced users may be inclined to think ‘Four months? That’s nothing! I’ve been using it for years.’
Well, fair enough. But my only long-lasting relationship with any data tool up until now had been Excel. And we all know how that ends.
I’m now more experienced with Tableau, and there are lots of things I wish I had known when I first started to use it (or when I was applying for The Data School). There are heaps of blogs, books, and videos online about Tableau tips and tricks that have helped me along my learning path, so I’ll leave some references along the way.
Discrete vs. Continuous Fields
This was one of the things I struggled the most with – Tableau treats data differently according to whether it’s continuous or discrete. The blue/green colour coding was another thing that I didn’t grasp for a while. It turns out it doesn’t represent dimensions/measures but instead distinguishes variables that are discrete (blue) from variables that are continuous (green).
In most cases, Tableau will treat dimensions as discrete and measures as continuous, hence my confusion. In his book Practical Tableau (which I recommend if you’re starting out on your own), Ryan Sleeper gives a good way to know when you should use each type of variable: discrete fields draw headers, while continuous fields draw axes.
Don’t forget about dates
A very good way to visualise the difference is by using dates. They can either be discrete or continuous and there are cases where that makes all the difference.
Looking at the phone metadata of ABC journalist Will Ockenden, let’s say we want to do some exploratory analysis and look at his calls per day.
(If you’re not familiar with this dataset, Will is an ABC journalist who accessed his phone metadata and released it, asking people to give it a go and see what they could find out about his life. As it turns out, it’s a lot.)
When we use the field Date as discrete, we can see all calls made by Will between April 2014 and March 2015, with a very prominent peak around Christmas 2014. By selecting a discrete field, Tableau goes up to the highest level of granularity in the data, which in this case is the day.
An inexperienced user (like I was) could be led to think that the dates were an axis, but when we try to access the axis editing pane, there’s nothing there – that’s because it’s a header, not an axis. Another way to confirm that is to see that the label ‘Date’ appears above the chart, instead of below.
When, on the other hand, we use Date as a continuous field, this is what we get.
The granularity of the data is still day because we didn’t specify otherwise. However, the dates now appear as an actual axis. Notice also that the ‘Date’ label now appears below the chart instead of above it.
And the data itself looks different – we can now see some gaps in October and February, where there are no records of phone calls (which makes sense, as Will was out of the country). This is also a crucial thing to take into account when dealing with continuous or discrete data.
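To make the difference concrete outside Tableau, here is a minimal Python sketch of the same idea. It is only an analogy, not how Tableau works internally, and the call dates are made up for illustration (they are not Will's real records). The function names `discrete_view` and `continuous_view` are my own.

```python
from collections import Counter
from datetime import date, timedelta

# Made-up call dates for illustration only (not the real dataset).
calls = [
    date(2014, 10, 1), date(2014, 10, 1), date(2014, 10, 2),
    date(2014, 10, 6), date(2014, 10, 6), date(2014, 10, 6),
]

calls_per_day = Counter(calls)


def discrete_view(counts):
    """One header per day that has records; days without calls never appear."""
    return [(day.isoformat(), counts[day]) for day in sorted(counts)]


def continuous_view(counts):
    """Every day between the first and last record keeps its place on the axis.

    Days without calls get None, i.e. a visible gap rather than a mark.
    """
    first, last = min(counts), max(counts)
    days = [first + timedelta(days=i) for i in range((last - first).days + 1)]
    return [(day.isoformat(), counts.get(day)) for day in days]


print(discrete_view(calls_per_day))
print(continuous_view(calls_per_day))
```

The discrete version has three entries sitting side by side, so the quiet days between 2 and 6 October are invisible. The continuous version has six positions, three of them empty, which is the kind of gap that shows up on the continuous chart.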
There’s an easier way to do this
Like any inexperienced user, I was dragging the variables to rows and columns with a left click of the mouse. Well, if you don’t know it yet, this next piece is going to be a revelation. Try right-clicking and dragging the variables and this is what you’ll see.
Before dropping it on the shelf, Tableau allows you to choose how you want to plot the Date variable. In order, the options are: the date itself as either continuous or discrete (in this case, Tableau will default to the highest level of granularity); levels of granularity as a discrete variable; aggregations as a continuous variable; and levels of granularity as a continuous variable.
The same principle applies to all other variables, which will show you the different options available.
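As I understand it, the discrete levels of granularity behave like date parts (just the month number, say), while the continuous ones behave like truncated dates (the month together with its year). Here is a small Python sketch of that distinction. The helper names `month_part` and `month_trunc` are hypothetical, chosen for illustration, and the dates are made up.

```python
from datetime import date


def month_part(d):
    """Discrete-style date part: only the month number, whatever the year."""
    return d.month


def month_trunc(d):
    """Continuous-style date value: the date truncated to the first of its month."""
    return d.replace(day=1)


sample = [date(2014, 4, 15), date(2015, 4, 2), date(2014, 12, 25)]

# April 2014 and April 2015 collapse into the same header (4).
parts = sorted({month_part(d) for d in sample})

# April 2014 and April 2015 stay apart as two points on a timeline.
truncs = sorted({month_trunc(d) for d in sample})

print(parts)
print(truncs)
```

With the date part, both Aprils land under one header. With the truncated date, they remain two separate points in time, which is what you need for an axis.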
Control how your chart looks with hidden reference lines
When you’re dealing with continuous variables, it can be difficult to control precisely how your chart looks, as Tableau tends to decide for you. In this case, I’m building a ‘small multiples’ viz, plotting different measures of Alpacas data in the US over a span of 10 years (a subset of data from the Iron Viz challenge), and this is what I have.
Things are not looking that good even though I applied some nice formatting. There’s a lot of white space at the bottom and none at the top. I could just edit the axis and choose to remove the zeros, but that’s even worse.
Here’s where reference lines can help!
From the Analytics pane, just drag a ‘Distribution Band’ onto the chart area and drop it over ‘Pane’. Select Value and then Percentage, and set it to a range that works for your chart. Make sure Label, Line and Fill are all set to ‘None’.
And you now have a beautiful-looking small multiples viz.
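Why does an invisible band change the layout? My understanding is that an automatic axis stretches to include any reference lines or bands, even when they are formatted to show nothing. The Python sketch below illustrates that idea under a few assumptions: the band is computed as a percentage of each pane's maximum, the "include zero" setting is ignored, and the function names and numbers are made up for illustration.

```python
def band_values(values, low_pct, high_pct):
    """Band edges as percentages of the pane's maximum (an assumption)."""
    peak = max(values)
    return peak * low_pct / 100, peak * high_pct / 100


def axis_range(values, low_pct, high_pct):
    """Automatic-style range: wide enough for both the marks and the band."""
    band_low, band_high = band_values(values, low_pct, high_pct)
    return min(min(values), band_low), max(max(values), band_high)


# Made-up values for one pane of the small multiples.
pane = [200, 260, 310, 400]

# A hidden band from 40% to 120% of the maximum adds room below and above.
lo, hi = axis_range(pane, 40, 120)
print(lo, hi)
```

Because the band is computed per pane, each small multiple gets padding in proportion to its own data, rather than one fixed axis range for all of them.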