As you can see by the title this week was advanced Tableau, and the focus was on table calculations and level of detail calculations. We also focused on tableau prep (which is a product similar to Alteryx but with Tableau), as well as a general guide to dashboarding with emphasis on using tiles and containers. And of of course the Friday challenge. The challenge involved being in a team of 3, then each member doing a work-out-Wednesday (WOW) challenge, and then each member presenting their solutions. Furthermore, the only stipulation was that the WOW challenge involve a table calculation or an LOD calculation.

Level of Detail (LOD)

The level of detail or LOD for short, in the context of Tableau refers to the scope or the granularity of the data being analysed. In other words, it determines the level of aggregation or disaggregation of the data in the view. The LOD is important as it will determine the accuracy and precision of the analysis. An example of this would be is, if you are analysing sales data by region and year, the LOD will be at the regional and year level. If you need analysis on a more granular level, then more dimensions can be added like, month or product or quarter. LOD’s come in three varieties: fixed, include, and exclude. A fixed LOD specifies the a fixed level of detail for the calculation, the include LOD adds a specified level of detail to the current level of detail, and the exclude LOD removes a specified level of detail from the current level of detail.

Table Calculations

Table calculations are computations performed on the data displayed in the visualisation. Much like LOD’s they allow you compute complex values that are not present in the original data. Some examples of table calculation are running total, percent total, difference, first, last, and index to name a few.

It is important to note that table calculations and LOD’s are almost interchangeable as they provide similar functionalities. One such similarity is that, they allow you to compute values that are not in the original data set. Another such similarity is that both methods allow aggregation at different levels of granularity.

One key difference between the two is where they sit in the order of operation hierarchy (aka the query pipeline). In other words, there Tableau has a priority que when it comes to what is queried first. In the context of LOD, the fixed LOD is at a higher priority than the include and exclude LOD’s. Furthermore, table calculations are further below in the hierarchy (see Figure 1).

Figure 1: Order of operation or query pipeline for Tableau.

Butterflies

You might now be wondering, how do butterflies fit into this. Well, lets get into it then. Butterflies in this context is referring to butterfly charts. Figure 2, is an example of a butterfly chart:

Figure 2: Butterfly chart.

Butterfly charts are useful for comparing two different data sets, and is often used to show population pyramid, opinion polls and voting results. Figure 2, shows the population divide between males and females according to different age groups in in the UK for 2021. Let’s try and recreate this with the use of table calculations. The dataset being used is from workout Wednesday 2022, week 28. Just a quick note, some cleaning of the data will be required as the data is coming from a .csv which is not in the correct format.

The first step is to drag the ‘Age’ measure onto the rows, and the ‘Males, 2021’ and ‘Females 2021’ measure on to the columns as seen in Figure 3:

Figure 3: Setting the data ready to be transformed to be a butterfly chart.

Next, right-click on the axis for the ‘Males, 2021’ portion, and choose ‘Edit Axis’, as seen in Figure 4:

Figure 4: Edit axis option.

Then in the options menu, select the ‘Reversed’ option as seen in Figure 5:

Figure 5: Selecting reversed.

Now, the graph should look like this (see Figure 6):

Figure 6: Butterfly chart

This is the basics of creating a butterfly chart. To get the different colours like in Figure 2, simply drag the ‘Measure Names’ value on to the colour mark in the male and female section. The next part is very important and it is related to the axis. There will be situations were the axis on one side will be greater than the other, therefore, the results can be very misleading. In order to fix this, we will be using table calculations.

First create a table calculation, and type the following into the formula window, WINDOW_MAX(SUM([Females, 2021])). Then repeat the same for ‘Males, 2021 as well. So, you should have two calculated fields. Next, click on the ‘Analytics’ tab (which is on top of the left-hand pane) and drag the reference line onto the ‘Males,2021’ portion. Then in the reference line menu, as seen in Figure 7:

Figure 7: Reference line configuration.

In the ‘Value’ option choose the equation written for the ‘Females, 2021’ variable. And  for the female portion, the ‘Value’ option will be for the equation written for the ‘Males, 2021’ variable. This way the axis will be fixed, and it will be much more of an accurate representation of the data.

That’s it for this week. So, we have discussed LOD’s and table calculations, and how they can be used to create a butterfly chart. Next week will be a bit more interesting as we will be discussing API’s, macros, and regex. So, stay tuned for that.

The Data School
Author: The Data School