Table calculation in Tableau has 5 rank types in its arsenal, and often the different types confuse users in their decision to choose the one they really need. This post is an attempt to clarify the differences between each rank. Each rank type will be described with examples here.
RANK()… yes, just RANK
RANK() is probably the first function which a Tableau user investigates when he needs to range their data. RANK can be defined as in this post:
RANK gives you the ranking within your ordered partition. Ties are assigned the same rank, with the next ranking(s) skipped.
It means having numbers sorted in order, the rank will be assigned in descending order by default for each number. Furthermore, if there are two or more same numbers, then the next rankings are skipped by the number of the repeated values. If the ranking needs to be in ascending order, the user has to specify it. The function interface is RANK(expression, [‘acs’ | ‘desc’]), where expression is aggregate over a field and instead of [‘asc’ | ‘desc’] write either ‘asc’ or ‘desc’. Note that ‘desc’ can be skipped because it is a default value. Below is an example. The data set used for the examples is publicly available on the Internet and called Sample – EU Superstore. The RANK() function is applied to the unique number of products the customer purchased during the considered period.
We can see that the lowest rank (1) is assigned to the highest number of products (35) in the rank descending column. One can also note that if there are repeated numbers, then they have the same rank. However, the following rank continues with the rank equal to the rank number of the repeated values plus the number of times the repeated values occurred. Basically, the two repeated values 32 have the same rank of 3, but the next value 31 is ranked as 3 + 2 = 5. For the ascending order, the rank starts with the total number of rows and then decreases. In addition, it is interesting to note that the missing gap is before the repeated values in the ascending case.
Note: sorting in order was performed here and below only for simplicity. In real calculations, sorting is not needed.
RANK_MODIFIED() is very similar to RANK(). However, the gap is not after the repeated number but before. For the number of the repeated products 32, we can observe the gap in rank before them. So, we can see that the missing rank is 3.
Taking a close look, we can see that in opposition to the RANK() function, the gap occurs in the rank_modified column after the repeated values.
RANK_DENSE() function is similar to the two previous functions with one exception. RANK_DENSE does not have the gaps as it presented for the ascending and descending orders below:
RANK_UNIQUE() assign unique rank values despite repeated values in the # of unique products columns. Furthermore, this rank type does not have any gaps.
RANK_PERCENTILE() is the last and probably the most confusing type of rank function. Let’s first recall what percentile is. As it stated in mathsisfun :
Percentile: the value below which a percentage of data falls.
On PALLIPEDIA, the percentile is defined as:
A percentile (or a centile) is a measure used in statistics indicating the value below which a given percentage of observations in a group of observations fall. For example, the 20th percentile is the value (or score) below which 20% of the observations may be found.
For simplicity, consider numbers from 1 to 100 in columns row index on the picture below, and they are in order. We want to apply percentile to this column. If we pick the row index equal to 20 then 20% of all numbers are below or equal to 20. This is valid for every percentile in this table. If we split out percentiles by 4, we obtain so-called quartiles (25%, 50%, 75%, and 100%). Similarly, deciles split the data into 10% groups, but this is not the scope of this work.
The same concept can be applied to any column with different values. For instance, it can be # of unique products in our original table. So, applying the rank_percentile function, we obtain the following table for ascending and descending order:
Note that for RANK_PERCENTILE(), ‘asc’ is a default value, and you can see that the rank can be repeated for this function.
In this work, we learned about 5 different rank types in Tableau. We have seen that the variety of ranks gives you flexibility for solving different problems. RANK(), RANK_MODIFIED() or RAND_DENSE() can be used to ignore the repeated values assigning the same rank to them. RANK_DENSE is used when you need to take into account the repeated values by given to each value its own rank even if the numbers are repeated. Finally, one can rank values by precentile using the RANK_PERCENTILE() function.