Data blending is a functionality in Tableau that combines two or more datasets. It is very important to understand key concepts around data blending to ensure that data is used correctly.
For this article, we will use two-sample excel datasets, namely, customer_details and profit.
Let’s connect to profit excel sheet to the Tableau workbook. After adding your first sheet, add your second sheet to Tableau workbook which can be done by either pressing Ctrl+D or by simply clicking on the new data source icon as shown below and adding the customer_id sheet to the Tableau.
Once both the sheets are connected, now is the time to introduce the first important concept.
1. Primary data source:
The data source of the first field that you add to the view becomes your primary data source. This adds a blue tick to the primary data source which distinguishes it from the secondary one.
2. Left-inner Join:
The importance of a primary data source is that based on a common field, the other sheets will be joined to the primary one using a left-inner join where the primary data source will be left the table.
3. Linking Field:
Data will blend on the common field (in our case customer_id). Locate the linking icon by selecting the secondary field. This is your common field on which the data is blended. Note the blue color next to the field which indicates that linking is on. Click on the linking icon to turn on/off the link. Now the color of the linking icon will change to grey. Also, an orange tick on a secondary data source will only appear when you bring any field into view.
Note: Your primary data source and the secondary data source are worksheet-specific. if you open a new worksheet on Tableau, you will have to again define your primary and secondary data source by following the steps mentioned above.
4. Linking data manually:
From the Toolbar, select Data dropdown and then select ‘Edit Blend Relationships’. Click on Custom bullet and then click ‘Add’. Here you can add the field manually.
5. Level of Aggregation:
Your primary data source will define the level of aggregation shown in the view. For example, let’s drag customer_Id from customer_details sheet to rows and profit field from profit sheet to text. Here the level of aggregation is up to the customer_details (primary data source) level. Since cust_04 did not order anything, there are no profit details and hence has no value displayed. But since it is in the primary data source sheet, it will still appear in the view after left-inner join. Also, for each customer whose profit data is available in the profit sheet, it will be aggregated to the customer_Id level.
I hope these few key concepts will help in understanding blending in Tabaleau better.