Intro to Sets
Most people will have had some exposure to set theory in high school. In mathematics, a set is a well-defined collection of distinct objects whose order or arrangement does not matter. Tableau applies this idea through its Create Set feature. In Tableau’s language, sets are custom fields that define a subset of data based on some condition, and they can be created manually or computed. Note that sets are indicated by a small Venn diagram icon.
Sets have numerous use cases. Because a set automatically separates members into two groups, IN and OUT, we can place it on the Filters card to look only at data relevant to the selected group. Alternatively, we can drag a set onto Color or Shape on the Marks card to assign different colors or shapes to members that are in or out of the set. For sets that are created from the same dimension, we can go one step further and create a combined set to find members that meet multiple criteria.
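To make the IN/OUT idea concrete, here is a minimal Python sketch of what a set does conceptually. It is not Tableau code: the rows, the field names, and the `label` helper are all hypothetical and exist only for illustration.

```python
# Hypothetical rows; the field names and values are illustrative only.
orders = [
    {"state": "California", "sales": 500.0},
    {"state": "Texas", "sales": 300.0},
    {"state": "Colorado", "sales": 120.0},
    {"state": "Ohio", "sales": 80.0},
]

# A set defined on the "state" dimension (hypothetical membership).
state_set = {"California", "Colorado"}


def label(row, members):
    """Return "IN" if the row's state belongs to the set, else "OUT"."""
    return "IN" if row["state"] in members else "OUT"


# Roughly what happens when the set is placed on Color or Shape:
# every row is tagged as IN or OUT.
labels = [label(row, state_set) for row in orders]

# Roughly what happens when the set is placed on Filters:
# only the IN rows are kept.
filtered = [row for row in orders if label(row, state_set) == "IN"]

print(labels)
print([row["state"] for row in filtered])
```

Every row always lands in exactly one of the two groups, which is why the same set can serve as both a filter and a color or shape encoding.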
As you can already imagine, sets are a powerful feature in Tableau that lets users slice and investigate their data with more flexibility and perform deeper analysis, which makes it easier to extract insights. It is therefore crucial to have a sound understanding of how they work and how they can be used. Next, I will use the Sample Superstore data set to demonstrate the use cases I just mentioned.
Creating a set in Tableau is straightforward, and two methods are commonly used. First, go to the Data pane, right-click a dimension, and select Create -> Set; a dialog box will pop up. Take State as an example. This is what the dialog box looks like.
The General tab allows you to manually select the items you would like to include in the set, but you can also use the Condition or Top tab to specify rules that determine which items to include. For example, the Top tab allows you to find the top 10 states by total sales. The second method is to build a chart first, manually select items in it, and create a set from that selection, like below:
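The difference is that the first method allows you to create a dynamic set, whose members update when the data changes (this applies when the set is defined by a rule on the Condition or Top tab), whereas the second method creates a static set whose members remain unchanged even when the underlying data changes. For that reason, I recommend the first method whenever the set should keep up with the data.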
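The difference between a rule that is re-evaluated and a fixed list of members can be sketched in Python. This is only a conceptual illustration, not how Tableau works internally; the sales figures and the `top_n` helper are made up for the example.

```python
def top_n(sales_by_state, n):
    """Return the n states with the highest sales as a set."""
    ranked = sorted(sales_by_state, key=sales_by_state.get, reverse=True)
    return set(ranked[:n])


# Hypothetical sales totals per state.
sales = {"California": 900, "New York": 700, "Texas": 400}

# Static set: the members are captured once and stored as a fixed list.
static_set = top_n(sales, 2)


# Dynamic set: the rule itself is stored and re-evaluated on demand.
def dynamic_set():
    return top_n(sales, 2)


# The underlying data changes.
sales["Texas"] = 1000

print(sorted(static_set))     # still reflects the old data
print(sorted(dynamic_set()))  # reflects the updated data
```

After the update, the static set still holds the members it captured originally, while the dynamic set picks up the new top two.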
For the state set, let’s just pick some states using the General tab, say those beginning with the letter C (i.e. California, Colorado, and Connecticut). Let’s then create two more sets, this time using Customer Name instead. One set will give us the top 10 customers by sales, and the other the top 10 by profit.
Using Sets in the View
Now that we have three sets, let’s practice the use cases we just mentioned. First, let’s create a US map showing all the states. Then drag the state set onto the Filters card. This will automatically filter the map down to the three states in our set, as follows:
Second, on a new worksheet, let’s create a scatter plot showing sales against profit for each customer. All we need to do is drag Sales onto the Columns shelf, Profit onto the Rows shelf, and Customer Name onto Detail. To use our two top 10 customer sets on the Marks card, drag the one for sales onto Color and the one for profit onto Shape. Note that if you don’t see Shape on the Marks card, click the drop-down button inside Marks and you will find it there. The result is two distinct colors for the sales set and two distinct shapes for the profit set (one for IN, one for OUT), all of which you can customize. My view looks like the following after a bit of customization:
Creating Combined Sets
As I mentioned earlier, we can create a combined set from two sets if they are based on the same dimension, which is the case for our two customer sets. If you right-click either of them, you should see the option Create Combined Set. Click it and this dialog box pops up.
Depending on your analysis needs, you can decide which members to include in your combined set. This works exactly like using a Venn diagram to include or exclude certain items. In my case, I selected option two, which gives me the customers that belong to both the top 10 sales set and the top 10 profit set. Afterwards, the combined set can be used just like any other set we’ve created. In other words, you can use it as a filter, put it on the Marks card, or place it directly on the Rows or Columns shelf.
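The Venn diagram choices correspond to ordinary set operations. The following Python sketch shows them with made-up customer names; the two sets are shortened, hypothetical stand-ins for the top 10 sales and top 10 profit sets, and the variable names are my own.

```python
# Hypothetical, shortened stand-ins for the two customer sets.
top_sales = {"Ann", "Bob", "Cara", "Dev"}
top_profit = {"Cara", "Dev", "Eli"}

# All members of both sets (union).
all_members = top_sales | top_profit

# Only the members shared by both sets (intersection).
# This is the choice described above: customers in both sets.
shared = top_sales & top_profit

# Members of one set that are not in the other (difference).
sales_only = top_sales - top_profit
profit_only = top_profit - top_sales

print(sorted(all_members))
print(sorted(shared))
print(sorted(sales_only))
print(sorted(profit_only))
```

With these sample names, the shared members are Cara and Dev, which is the kind of result the combined set in this walkthrough returns for the real customer data.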
I’m sure by now you are beginning to appreciate how useful sets are. Used correctly, they can be a great time-saver and facilitate more in-depth analysis. More importantly, a solid understanding of how sets work is fundamental for learning set actions, an immensely powerful Tableau feature whose mastery opens up a new world of possibilities for your data storytelling. So why not open up Tableau and practice with sets until you get the hang of them? And stay tuned for my next blog, a beginner’s guide to using set actions to significantly boost the flexibility and capability of your dashboard.