Introduction


If you are reading this you may have aggressively wanted to look for how to create a funnel plot in Tableau (like I did), or stumbled upon it by accident. Either way I will show in a few steps how to create a functioning funnel plot in Tableau with no blends, joins or trickery. In only 16 calculations and a few more steps, you can have your very own funnel plot.

The most likely place you will find a funnel plot is in a medical journal or anything to do with medicine and statistics. It doesn’t have to be about anything medical though. If you want to explore your data to look for outliers then the funnel plot may be a better alternative to the box plot.

 

 

 

 

 

 

 

 

 

First, you need a dataset. Perhaps a column of American states or something by which we can split the data. Also, you will need 2 columns of data that we can plot on the axis. For example, I have used infant mortality data for the USA (found here). This dataset includes the rate of deaths per 1000 live births and the number of deaths. I have manipulated the data slightly to add another column to actually calculate the number of live births. My dataset is pictured above. 

I disliked knowing that a state was doing badly in terms of mortality rate but I wanted to know if the results were significant or not. In statistics, we usually say that something is significant if the result we obtain is further than 2 standard deviations from the mean. That is if the data falls outside of 95% what we expect the value should be, then we draw attention to it. The funnel chart is an easy way to highlight the outliers and to compare the outliers against each other.

Calculations

The next step is to put the data into tableau and to create some table calculations
1. Denominator: MIN([live births])
This is going to be on the Y-axis for our chart
2. Max Denominator: PREVIOUS_VALUE(WINDOW_MAX([Denominator]))
This is used to create padding for the upper and lower limits
3. Min Denominator: PREVIOUS_VALUE(WINDOW_MIN([Denominator]))
This is used to create padding for the upper and lower limits
4. Numerator: MIN([DEATHS])
5. Max Numerator: PREVIOUS_VALUE(WINDOW_MAX([Numerator]))
6. Min Numerator: PREVIOUS_VALUE(WINDOW_MIN([Numerator]))
7. Bin Increment: PREVIOUS_VALUE(ROUND(([max denominator] [min denominator])/1000))
This increment is very important. I put the number “1000” to make the upper and lower limits smoothe. If the value is too low then the edges of the limits will look jagged and rough.
8. Calculated bin: (INT([denominator]/[Bin increment])*[Bin increment])+[min denominator]
9. Proportion: ([numerator]/[denominator])
10. Mean value: PREVIOUS_VALUE(WINDOW_AVG([Proportion]))
11. Standard Error
IF [Calculated bin] == LOOKUP([Calculated bin],-1) THEN
PREVIOUS_VALUE(0.0)
ELSE
SQRT([Mean Value]*(1-[Mean Value])/[Calculated bin])
END
12. Lower 95%
IF [Standard Error] == LOOKUP([Standard Error],-1) THEN PREVIOUS_VALUE(0.0)
ELSE [mean Value]-1.96*[Standard Error]
END
13. Upper 95%
IF [Standard Error] == LOOKUP([Standard Error],-1) THEN PREVIOUS_VALUE(0.0)
ELSE [mean Value]+1.96*[Standard Error]
END
14. Lower 99%

IF [Standard Error] == LOOKUP([Standard Error,-1) THEN PREVIOUS_VALUE(0.0)
ELSE
[mean Value]-3*[Standard Error]
END
15. Upper 99%
IF [Standard Error] == LOOKUP([Standard Error],-1) THEN PREVIOUS_VALUE(0.0)
ELSE [mean Value]+3*[Standard Error]
END
16. Significance
IF [Proportion] < [lower 99%] THEN
“Very Low”
ELSEIF [Proportion]< [lower 95%] THEN
“Low”
ELSEIF [Proportion]> [upper 99%] THEN
“Very High”
ELSEIF [Proportion]> [upper 95%] THEN
“High”
ELSE
“Normal”
END

Visualisation

Those are all the steps one needs for the funnel chart.
Next is where to put those calculations.
1. Add “Denominator” to the columns
2. add “Proportion” and “measure values” to the rows and make them a dual-axis, and synchronise axis.
3. For the measure values, you will want to keep only the lower 95% and 99%, the upper 95% and 99%, and the mean
4. Turn the measure values into a line, and the proportion into a circle mark.
5. Add significance onto the proportion marks’ colour
6. Add state or whatever you want to split by to the details for both, and calculate only the measure values and significance by state

You can download my workbook and follow along with these calculations, here.

Big thank you to Jonathan Drummey, as I borrowed a lot of the techniques from his visualisation.

This chart could have been made by pre-making the upper and lower limits in a data preparation programme like alteryx, but as not everybody has it I decided to try to make my own way of doing it.