5 things I wish I knew about Tableau when I started
From day one of using Tableau, I found it to be a fantastic and easy way to visualise my data and produce great reports and analysis. 3 years on, I have come to learn and understand a few of the more advanced features and concepts about Tableau which have really enhanced the way I use it today. If I could visit myself 3 years ago and teach 5 Tableau concepts, this is what I’d cover.
Green vs Blue
Green data fields are continuous and blue data fields are discrete. Tableau behaves differently depending on which type of field is used in a view. I’m not going to go into too much detail because there is already a great article here that explains these differences really well, but it’s worth repeating that blues give headers, categorical colours and multi-select filters, while greens give axes, gradient colours and range filters. This distinction is fundamental to the way Tableau draws things on the screen, so if you don’t understand the differences I encourage you to read the article thrice before returning!
Partitioning and Addressing
Table calculations are one of the most powerful tools within Tableau, but they’re also one of the most complex. When I started making use of table calcs, I was mostly guessing what to do to get the right result. Learning some of the theory, particularly what is meant by ‘partitioning’ and ‘addressing’, has helped me understand the what, where, when, why and how of table calcs.
Tableau’s manual says:
“The addressing fields define what part of the table you are computing along. The partitioning fields define how to group the calculation”
I don’t find that statement all that helpful. Let’s try to reword it into plain English that we can apply to a table calculation we are working with in Tableau. If the partitioning fields “group the calculation”, we could start by saying ‘per customer’ or ‘per product’ or ‘per ship mode & container combination’ etc.
The addressing fields are those used within the calculation you are doing, so we could continue by saying ‘calculate % of total for each region’ or ‘calculate the difference for each category’. Put these two statements together and we get something that resembles a plain English sentence.
In the example above we get “For each Region, calculate the Percent of Total for every Category”. For me, reading aloud in my head like this helps me know how to set up my table calc.
Note: the ‘Compute using >’ menu shortcut puts whatever you select into the addressing (“for every”) box and every other dimension used in the view into the partitioning box.
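To make the plain English sentence concrete, here is a minimal Python sketch of the same idea outside Tableau. The rows and sales figures are made up, and `percent_of_total` is a hypothetical helper, not anything Tableau exposes. The partitioning field decides which rows share a total; the addressing field is what you step through inside each group.

```python
from collections import defaultdict

# Hypothetical rows: (region, category, sales). Values are invented for illustration.
rows = [
    ("East", "Furniture", 100.0),
    ("East", "Technology", 300.0),
    ("West", "Furniture", 50.0),
    ("West", "Technology", 150.0),
]


def percent_of_total(rows, partition_index, value_index):
    """For each partition, return every row's share of that partition's total."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[partition_index]] += row[value_index]
    return [row[value_index] / totals[row[partition_index]] for row in rows]


# "For each Region (partitioning), calculate the Percent of Total
# for every Category (addressing)."
by_region = percent_of_total(rows, partition_index=0, value_index=2)

# Swap the roles: "For each Category, calculate the Percent of Total
# for every Region."
by_category = percent_of_total(rows, partition_index=1, value_index=2)

for row, r_share, c_share in zip(rows, by_region, by_category):
    print(row[0], row[1], f"{r_share:.0%} of region", f"{c_share:.0%} of category")
```

The same four numbers give different percentages depending only on which field is treated as the partition, which is exactly the choice the table calc dialog is asking you to make.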
Get lots more info on how to make Table calculations work for you right here and here
Tableau writes a query language as you drag and drop
Tableau is pretty clever software and incorporates all kinds of breakthrough technologies that allow you to quickly create complex visuals from huge data sets using a simple drag and drop interface. But at its heart, Tableau is talking to your data using a form of SQL, and then shapes the results from your data source onto the screen through an ‘interpreter’. Having an appreciation of what’s happening under the hood helps you to drive it in the most efficient and elegant way. Put a dimension on rows and a measure on label and you’ve written a query along the lines of
SELECT Region, Sum(Sales) FROM Orders GROUP BY Region
Put another dimension on the filter shelf and it adds a WHERE clause. Do some sorting and you get an ORDER BY etc. Why is this important to me? Well, whenever I get stuck and don’t know what fields to use or how to configure them, I ask myself ‘how would I do this without Tableau?’ i.e. what steps would I follow to get the required result in a database or a spreadsheet? Working through such a solution often helps me discover the missing link I need in Tableau.
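If you want to try the ‘how would I do this without Tableau?’ exercise, here is a small sketch using Python’s built-in sqlite3 module. The Orders table, its Segment column and all the figures are invented for illustration, and the SQL is only "along the lines of" what Tableau might issue, not a capture of its real queries.

```python
import sqlite3

# In-memory database with a tiny, made-up Orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Region TEXT, Segment TEXT, Sales REAL)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        ("East", "Consumer", 100.0),
        ("East", "Corporate", 200.0),
        ("West", "Consumer", 50.0),
        ("West", "Corporate", 400.0),
    ],
)

# Dimension on rows, measure on label.
basic = conn.execute(
    "SELECT Region, SUM(Sales) FROM Orders GROUP BY Region"
).fetchall()

# Add a dimension to the filter shelf (WHERE) and sort the view (ORDER BY).
filtered_sorted = conn.execute(
    "SELECT Region, SUM(Sales) FROM Orders "
    "WHERE Segment = ? "
    "GROUP BY Region ORDER BY SUM(Sales) DESC",
    ("Corporate",),
).fetchall()

print(sorted(basic))
print(filtered_sorted)
```

Each thing you would drag onto a shelf maps to one clause, so working out the clause first often tells you which shelf you are missing.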
Of course, if you’re not familiar with databases and SQL you can still become an expert Tableau user without understanding any of this, but if you already have a bit of a data analysis background it can help accelerate your Tableau knowledge. Check out the log.txt file in the My Tableau Repository directory to see what’s going on in the background.
Order of operations
When you add fields to your view or to the filter shelf or perform a custom calculation, it appears that all these items are computed simultaneously. In fact, things are done in a certain order in Tableau and knowing this order can help you construct your view so that you get the results you need.
Things are processed in this order:
1. Context filters create a temp table in your source
2. Top N and/or conditional filters form part of your SELECT statement in the query
3. Standard filters are applied as a WHERE clause
4. Aggregations are computed
5. Table calculations are applied
6. Table layout and axes are drawn
7. Anything on the Pages shelf is taken into account
8. Marks are then drawn
Knowing that a standard filter comes after a top N filter but before a table calculation can help you out of situations such as not knowing why your % of total figure is not working.
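The % of total surprise can be reproduced in a few lines of Python. This is only a sketch of the ordering described in the list above, with made-up regional sales; `percent_of_total` is a hypothetical stand-in for the table calculation.

```python
# Made-up aggregated sales per region (total is 1000).
sales = {"East": 300.0, "West": 450.0, "South": 250.0}


def percent_of_total(values):
    """Stand-in for a Percent of Total table calculation."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


# The order described above: the standard filter runs first,
# so the table calculation never sees South.
filtered = {k: v for k, v in sales.items() if k != "South"}
filter_then_calc = percent_of_total(filtered)

# What you might have expected: percentages of the full total,
# with South merely hidden afterwards.
calc_then_filter = {
    k: v for k, v in percent_of_total(sales).items() if k != "South"
}

print(filter_then_calc)   # shares of the filtered total
print(calc_then_filter)   # shares of the full total
```

In the first case the remaining regions always add up to 100%, because the filtered rows were gone before the percentage was worked out. If you wanted shares of the full total, the filter has to act later than the calculation.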
Use of the INDEX function (and its close relatives)
Although I had looked through all the available functions early on in my quest to become an accomplished Tableau user, and was aware of the functions INDEX, FIRST and LAST, I hadn’t made proper use of them until relatively recently. Now I know what they do, I use them all the time to help accomplish my goals.
INDEX essentially creates a rank. Whether that rank follows the order in which your items are displayed on the screen or any other measure is entirely flexible, which allows you to sort, filter and display your data in ways that are otherwise not possible.
Here are a couple of introductory examples of what you can use these special functions for:
Create rankings within different categories
Prevent overlapping text when using a TOTAL or other table calculation
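As a rough illustration of the first example, here is a Python sketch that mimics INDEX, FIRST and LAST within each category. The product rows are invented, `index_first_last` is a hypothetical helper, and sorting by sales descending is my assumption about the order the marks appear in. The values follow Tableau’s conventions: INDEX counts from 1, FIRST is the offset back to the first row (0, -1, -2, ...) and LAST is the offset forward to the last row.

```python
from itertools import groupby

# Hypothetical rows: (category, product, sales).
rows = [
    ("Furniture", "Chairs", 500.0),
    ("Furniture", "Tables", 300.0),
    ("Furniture", "Bookcases", 200.0),
    ("Technology", "Phones", 900.0),
    ("Technology", "Copiers", 400.0),
]


def index_first_last(rows):
    """Mimic INDEX(), FIRST() and LAST() with category as the partition."""
    # Assumed display order: by category, then sales descending.
    ordered = sorted(rows, key=lambda r: (r[0], -r[2]))
    result = []
    for category, group in groupby(ordered, key=lambda r: r[0]):
        members = list(group)
        size = len(members)
        for position, (_, product, _sales) in enumerate(members, start=1):
            result.append(
                {
                    "category": category,
                    "product": product,
                    "index": position,        # 1 for the first row
                    "first": 1 - position,    # 0 for the first row, then -1, -2...
                    "last": size - position,  # 0 for the last row
                }
            )
    return result


ranked = index_first_last(rows)

# Ranking within categories: keep the top 2 products of each one.
top_2 = [r["product"] for r in ranked if 1 <= r["index"] <= 2]

# One row per category (the last one), e.g. to show a label only once.
label_rows = [r["product"] for r in ranked if r["last"] == 0]

print(top_2)
print(label_rows)
```

Because the index restarts in every partition, a single condition on it gives you a top N per category rather than a top N overall.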
There you have it. There’s a lot to take in there, but I think my self of 3 years ago would have appreciated the tips. How about you, reader? Was there a Tableau ‘Eureka’ moment for you? Please share in the comments below or get in touch at info@theinformationlab.co.uk
Since top N filters are 2nd in line, and the WHERE clause 3rd (no idea why they did it this way…), is there a way to use the index() function to keep only the top N products on a continuous line chart?
I’ve managed to keep the top N values using this formula for a calculated field as a filter:
1 <= index() AND index() <= [Number]
which will return True for the values whose index is between 1 and [Number] (inclusive)
They had it in that order because that is the order that it appears in the SQL statement.
Very good and helpful article. Who wrote it?
A simple way to summarize the difference between Blue versus Green is a Dimension versus a Measure.
I disagree–you can have a dimension that is continuous or one that is discrete… in other words, you could have blue dimension or a green dimension. Same with measures.
Exactly. So a simple way to summarize the difference between Blue versus Green is Discrete versus Continuous.
Very interesting things you put out here. Good explanation on green vs. blue.
Please reload the images on your page. I would like to view the content in its entirety. Thanks
Thank-you for sharing these ideas – particularly the addressing vs partitioning explanation.
I am in a position of “three years younger version of you”.
So, I will make the best of it.
I especially like your explanation of Partition/Addressing.
Thank you for sharing the ideas especially liked your Partition/Addressing point.
The partition/addressing explanation is fab!
and the overall idea of sharing your main learning points even better.
Thanks!
This link “Prevent overlapping text when using a TOTAL or other table calculation” no longer works. Can you please add it back? This article is great and I need to master these items. I am stuck on some weird behavior with index.