Tableau has a number of ways to support visualisations and data analysis using multiple data sources – this includes ‘joining’ data and filtering from one view to another – but todays subject is DATA BLENDING.
This post will explain the concept of data blending from the perspective of an Excel user, and is part of a series of posts to help Excel users get the best from Tableau.
If you are looking for earlier posts in this series:
If you’re looking for a step by step guide to implementing data blending in Tableau – start with this 12 minute video from Tableau
So what is Data Blending?
Imagine we have two data sets – one provides the transactional data for our business, the other provides the budget information. In this scenario, we would say that we wanted to ‘Blend our sales and budget data into a single visualisation’ – the intention of this example is to understand whether our sales figures are on budget or not.
When we do this using Tableau’s data blending – the steps that are happening are (see the Tableau video to see HOW to do this):
1. A connection is made to both data sources independently
2. The sales data is aggregated to create some view – sales amount by state for example
3. Tableau then reaches out to the secondary data source and finds budget fields for the categories that are used to draw the view.
This can be known as a POST AGGREGATE JOIN as the join to the second data source is not made until after the aggregation is complete.
What’s the equivalent of data blending in Excel?
When I teach Data Blending, I often use an Excel example to introduce the subject, the equivalent is easy to understand and uses familiar techniques.
To replicate the Tableau example above in Excel, follow these steps (or download this Excel file and take a look)
1. Create a pivot table which aggregates data by state – this is the equivalent of building a view in Tableau
2. Add some budget data to a second worksheet in Excel – this is equivalent to connecting to a second data source in Tableau
3. Create a VLOOKUP function from a new column adjacent to your pivot table, and lookup the budget value using the state name
The final step in this Excel process is the equivalent of the data blending step in Tableau.
Get the Excel file here.
What can we learn from this approach?
Thinking about data blending in this way has helped me understand a number of things about data blending:
1. When using data blending it is NECESSARY to have the field(s) that are common between the two data sources IN MY VIEW. It it not obvious why this is the case in Tableau, the Excel example makes this clear – how could the VLOOKUP possible work without the State field that is involved in the function?
2. It is clear that this approach is different from a database JOIN – Tableau users often mis understand the differences between joining and blending of data, in this example we can see that the sales data is aggregated BEFORE the secondary data set is used. This is different from a traditional database join in which the data is joined at the record level – before any aggregation.
3. It becomes apparent that blending can be a very efficient (from a performance perspective) method for bringing additional data sources. Since the data is joined after the aggregation, this could result in a very ‘lite’ operation for Tableau, when compared with joining data at the record level
Additional learning resources for Data Blending: