<< Back

Joining data tables in Tableau and Alteryx

How does a join work?

A join operation combines fields from two or more tables in a database and enables us to analyse the resulting data set. The result of a join is a horizontally ‘extended’ data table as the join adds fields (columns) compared to the starting stage. Understanding the different join types is essential for using Tableau. We at The Information Lab are also advocates of Alteryx, an analytics platform providing intuitive workflow for data blending and advanced analytics. As a powerful data personalization tool, Alteryx of course offers all the data joining possibilities.

Whenever I start to learn or explain a concept, I try to find a simple example or analogy and walk through that. Usually what works in small and simple also works on large and complex. Hence we will overview the join types on two small data tables. These are two sheets in an Excel file named PlayerGeography. The first table (Players) contains pool billiard players and their home countries while the other (Geography) lists countries and the respective regions and continents.

Players table                                       Geography table

Players_table                 Geography_table

 

We will join these tables by matching the Country field in the Players table with the Geography_Country field in the Geography table and arrive to a more informative dataset, not only showing player names and their countries but the region and continent the country belongs to. The join could be performed several ways, depending on the logic we want to follow. Let’s walk through these join types and see how they are performed in Tableau (8.2) and Alteryx.

A) Inner Join

It combines only the common records from the tables. The below image is a visual representation of an Inner Join on table 1 and table 2. This image may be familiar from set theory and Venn diagrams.

Inner join image

How can we do this in Tableau? The inner join operation is readily available in the data connection editor window after we dragged the two tables onto the canvas. As Tableau shows us the resulting dataset right away at the bottom of the window, we see that in fact we arrive to the intersection of those two tables.

Inner join

The Join tool in Alteryx has two input nodes, a left and a right part. Notice that it has three output nodes, the L / J / R ones, L standing for a Left Excluding Join, J = Inner Join and R = Right Excluding Join.

Alteryx join tool

This modular setup requires a little bit different thinking compared to Tableau but in case of an inner join the middle output node (J) is just what we need.

Alteryx inner join

B) Left Join

This will include all records from the first table (Players) and all matching records from the second table.

Left join image

This is straightforward to setup in Tableau. It seems that Alex Pagulayan’s country is not listed in the Geography table.

Left outer join

We arrive to the first point where we have to adjust to the slightly different logic of Alteryx. Actually, when working with Alteryx you realize that this is even more user friendly. The three output nodes of the Join tool (L/J/R) altogether add up to a Full Outer Join. So the L and J nodes have to be unioned for a Left Join.

Alteryx left join

 

C) Right Join

A ‘Right Join’ returns all records from the second data table (Geography) and the matching records from the first table.

Right join image

Our example is built on an Excel file, where a Right Join is only available upon using the Legacy Connection. The default Tableau connection does not rely on the Microsoft Jet data engine driver but in some cases (like a Right Join) require the capabilities of the JET driver. The legacy connection can be accessed when we open the Excel data source file in Tableau.

Legacy Connection

For more information on the Legacy and default Excel connections, please visit this link.

With a Legacy connection our task in Tableau becomes simple:

Right outer join

 

We do not see Alex Pagulayan in the resulting data set as Canada is not included in the Geography table. The Alteryx solution is also unambiguous. Building on the example of the Left Join, the Right Join is merely the union of the J and R output nodes from the Join tool.

Alteryx right join

 

D) Full Outer Join

This type of join returns all records from the Players table and all records from the Geography table.

Full outer join image

We hit the first instance in Tableau when performing a join is not merely a matter of dragging data tables to the canvas and selecting the required join type. For a Full Outer Join a custom SQL statement has to be written.

Full outer join not available

Full outer join

Later on for a Left Excluding Join and a Right Excluding Join we will also have to write custom SQL statements and a Full Outer Join syntax has three parts. It is built from the Left Excluding Join + the Right Excluding Join + the Inner Join. The ‘excluding’ joins are achieved by adding the ‘Where’ clause at the end of the Left or Right Join SQL blocks, thus excluding all those records that do not comply with the ‘Where’ clause.

Tableau’s profile is visual analytics, data visualisation, being able to ask questions from your data very quickly. It is a front-end tool. Alteryx is a back-end tool, enabling fast and flexible data personalization, data cleaning, etc. So we expect a solution of just a few clicks on a Full Outer Join from Alteryx.

Alteryx full outer join

 

E) Left Excluding Join

This operation returns only those records from the first table that do not have a match in the second table.

Left excluding join image

In our example this will leave only Alex Pagulayan’s record in the result set. Well, he met no matching performance in 2004 when he won the world pool championship.

Let’s see the custom SQL for Tableau’s data connection:

Left join minus intersection

 

In Alteryx the desired output is simply node ‘L’ of the Join tool.

Alteryx left join minus intersection

 

F) Right Excluding Join

This is nothing special after the getting familiar with the Left Excluding Join. A mirrored image, in every respect.

Right excluding join image

This still requires a custom SQL script in Tableau.

Right join minus intersection

 

The equivalent in Alteryx:

 

Alteryx right join minus intersection

 

G) Outer Excluding Join

Outer excluding join image

In Tableau we can produce this set with a custom SQL statement that unions the left and right parts of the desired set.

Outer join minus intersection

In Alteryx we have to union the outputs from the L and R nodes.

Alteryx full outer join minus intersection

We have discussed joining on only one key field but even if the join is executed on a combination of fields, the logic remains unaffected.

Finally, let’s review the various join type solutions in Tableau’s custom SQL and Alteryx:

Joins - Tableau custom SQL and Alteryx

The topic may have been a bit dry if you are not a data geek as me but I have included a name who is not a pool player. Finding that person in the list may be a playful challenge but to accomplish that you have to ‘Inner Join’ the ‘Players’ table with a list of professional pool players…. (OK, Google will just take you there, too.)

 

 

6 thoughts on “Joining data tables in Tableau and Alteryx

  1. I was lost on how to mimic a left outer join, your post was exactly what I needed to see, thank you for putting this together.

Leave a Reply

Your email address will not be published. Required fields are marked *