Why is my Tableau viz slow?

Posted by on Nov 19, 2012

Yawn!

We all know and love Tableau to be the best BI tool to enable you to quickly and easily see and understand your data. It enables fantastic ‘speed of thought’ analysis, super-fast data visualisation on giant data sets and is often heard in the same sentence with many other superlative prefixed buzzwords. But sometimes, just sometimes, you can get stuck in a rut with a slow workbook if you’ve been overzealous on the features or have configured things in the wrong way. This post is about what you can do to help get your viz back up to the ludicrous speed you expect.

There are several separate areas to focus on when looking to improve the performance of Tableau:

Is your data source up to it?

The bottom line is that a Tableau viz will only ever be as fast as the underlying datasource. It won’t magically turbo-boost an ageing relational database server or put a rocket up your warehouse derriere. Check your views and queries in the source first and if necessary, extract the data to a Tableau Data Extract (.tde). There’s much more info about why you should consider this option in this great article by Tom Brown.

Avoid using custom SQL

When you connect to multiple tables in your database, Tableau writes a query that has been optimised for that data source. Unless you are very experienced with join operation optimisation, writing custom SQL may cause Tableau’s connection to slow down.

Optimise those calculated fields

This is badHere are a few best practice suggestions from Tableau programming wizards. It’s unlikely that any one of these will suddenly solve all problems in a slow viz, but every little helps.

  • Avoid string calculations if you can – they take much more computing time than mathematical operations
  • Use IIF(…) rather than IF … THEN …  Correction: in Tableau 7, both ‘if’ variants evaluate both sides of the condition result – i.e. computes the ELSE side even when true and THEN side even when false. In Tableau 8 this is fixed so in fact IF should be faster than IIF. Apologies to anyone who changed anything based on this tip and thanks to JD and JM for pointing this out
  • Avoid lots of CASE statements
  • Use WINDOW_SUM rather than TOTAL
  • Use MIN(…) rather than ATTR(…) when aggregating a dimension
  • When nesting calculations, create calculated fields separately and then combine them in additional calculated fieldsThis is much better
  • If using an extract, ensure to send the row level calculations to the source by selecting ‘Optimize’

Take it easy on parameters and quick filters

Despite the name, too many quick filters will actually slow you down, particularly if you set them to use ‘Only Relevant Values’ and you’ve got lots of discrete lists. Try and more guided analytics approach and use action filters in within a dashboard instead. If you’re building a view with umpteen filters in it to make it super customizable, ask yourself whether multiple dashboards with different levels and themes would work better (hint: yes, it would)

Don’t over complicate views

Do you really need 50,000 marks on your scatterplot? Are you sure it’s readable? Roll it up and let the user drill down with actions. Don’t go for 10 charts on a dashboard at once, split them out and make use of storytelling across multiple views

Use the right granular level of data

If your database contains 10 years of data at a transactional level are you ever going to present all this fine detail in a viz all at once? Of course not. Pre-aggregate your data and roll it up to days or months.

Alternatively (or additionally) if you are interested in individual transactions then connect to your data with a context filter that only brings back relevant (e.g. the most recent 3 months) data to be used in your visualisation. Don’t forget that even a modest number of rows can be slow if you have column overkill, particularly columns with lots of unique items as this makes sorting slow

Use the latest version

Tableau spend oodles and oodles of cash on R&D and release updates all the time. These aren’t just for obscure bug fixes; they genuinely enhance functionality and speed. Take a look at the release notes to see what you’re missing. Version 8 will also bring plenty of performance enhancing goodies too so keep your eyes out for the release date (currently scheduled for some time in Q1 2013)

Check your server configuration

This is the default setup -- it will work well in many situations

If your workbook works well in Tableau Desktop but is slow when you view it in Server, try these suggestions

  • Make sure you’ve installed Server on a 64bit OS!
  • Have a server dedicated for Tableau and make sure it’s not competing for resources with anything else
  • Tableau Server can run on fairly modest hardware but of course, more is always better. Check the minimum recommended specifications on Tableau’s site here
  • Ensure that you’re caching your data if you don’t need ‘live’ data
  • Have your data source refreshes happen overnight or at times when users are not logged on
  • Drop the timeout for inactive sessions down from the default 30 minutes
  • Identify usage patterns and try some demand management if there are very high peak times
  • Tweak the number of concurrent processes depending on the usage of the server. For example if things slow down when lots of users are logged in, up the VizQL processes. There is a very good article on Tableau’s website that describes different configuration options here.
  • When publishing don’t use the ‘Show sheets as tabs’ option
  • You are using a 64bit OS, aren’t you?

 

Got any more tips to share? Go ahead and post in the comments! Want more info on this topic or anything Tableau related? Contact us at info@theinformationlab.co.uk

17 Comments

  1. A really great post. I hadn’t come across many of the calculation suggestions before, so it was nice to see all of them collected in one place. Is there a reference somewhere that explains why these things help performance? For example, why is a double equal sign better than a single equal sign? I’m also curious about the “show tabs” on server…seeing the tabs is a great advantage for the user, so I’d rather not hide them if possible. But if there is a performance impact, I’m curious to understand it.

    • Thanks for your comment. Most of those calculation tips are general programming principles rather than Tableau specific things. In the case of the double equals, == tests for identical items, whereas = computes whether they are equivalent. The difference is probably negligible unless you’re working with billions of rows.

      The unticking of show tabs means that Server won’t load the whole workbook on loadup. I thought you could get around that by putting in navigation buttons (using action filters) but it seems that doesn’t work. Damn. Only relevant when you’re showing 1 dashboard then.

  2. Hi Robin,

    Great post, very useful resource to have this all in one place.

    One thing we found when profiling our queries against SQL server was that if you have DateTime columns, do NOT adjust the Field Properties to change the type in Tableau from DateTime to Date.

    You might be tempted to do this because SQL server prior to 2008 R2 only has type DateTime and not Date. However, if you do, and then you try and filter on your date, you will see this in your SQL WHERE clause: CAST( FLOOR( CAST( CAST([MyDateTimeColumn] as datetime) as float) ) as datetime)

    This is bad because it means certain indexes can’t be used and your queries are slow. We saw a big speed boost by leaving the data type as DateTime in Tableau, and changing the default format instead to strip the time. That way you don’t get the CAST and FLOOR, and you can use your indexes.

    Thanks,
    Sheel

    • Ah very interesting. That’s a really good point in general there… looking at the logs to see what SQL Tableau is passing back to the data source and killing anything odd.

      Thanks for taking the time to post!

  3. Can you please walk me though me why “Use IIF(…) rather than IF … THEN …” makes Tableau faster?

    Here are my initial thoughts:

    Tableau formula with:
    IF [Profit]>100 THEN "100+" ELSE "Less" END

    Tableau produces a query like:
    SELECT IIF(([Orders$].[Profit] > 100),'100+','Less') AS [none:Calculation_7450117074311365:nk]
    FROM [Orders$]
    GROUP BY IIF(([Orders$].[Profit] > 100),'100+','Less')

    while

    Tableau formula with:
    IIF( [Profit]>100 , "100+" , "Less" )

    Tableau produces a query like:
    SELECT IIF(([Orders$].[Profit] > 100),'100+',IIF(NOT ([Orders$].[Profit] > 100),'Less',NULL)) AS [none:Calculation_6140117074356271:nk]
    FROM [Orders$]
    GROUP BY IIF(([Orders$].[Profit] > 100),'100+',IIF(NOT ([Orders$].[Profit] > 100),'Less',NULL))

    IIF has more logic built in, for the return argument when the expression is unknown, eg null or invalid.

    IF does not have this logic built in, and will produce a query to the underlying data source with less SQL.

    My first thought would be IF is faster to evaluate than IIF, but of course this would require testing to say for sure.

    Why do you think IIF is faster?

  4. Also, just saying “Use WINDOW_SUM rather than TOTAL” is not useful.

    In some cases it is true, and in others it is not. Why not explain why it is faster or slower?

    Also they are very different computations and purposes (agg of an agg or agg at a different level), outside the scope of this post, but I feel like your comparison here ignores that.

  5. And another that I wonder about:
    “Use == (double equals) rather than = (single equals)”

    Why do you think this is so?

    I also wonder about:
    “Avoid lots of CASE statements”

    I normally think of CASE statements as shorthand for IF/ELSEIF statements. If you need the logic of a CASE statement, what would be more efficient? (or course the ideal is to have it be a row level calc, and materialize with optimize, or compute prior to Tableau)

    There are more cases here where your suggestions are not always true, sometimes yes, but you do not mention the exceptions or explain why these suggestions work. I feel this sets someone up for failure when they blindly follow your suggestions and get a slower interaction.

    Most of what you have here is good suggestions, and I hope my comments here are seen in the light of http://mjtsai.com/blog/2012/12/27/the-hypercritical-way/ If I have phrased something poorly, or got something incorrect, please let me know.

  6. Hi Joe, always happy to receive your comments, please keep them coming!

    I made a mistake with the IF vs IIF — IIF will compute both sides of the output regardless of the result of the condition whereas IF…THEN usually doesn’t. However, in Tableau 7, IF…THEN does not have this advantage (according to Richard Wesley’s presentation on high-performance calculations in San Diego). I think this is what confused me into thinking IIF must be quicker. This is apparently fixed in V8.

    Have to give credit to Mr Drummey for also pointing this one out

  7. From my programming days I learnt that a double equals was only ever a test for equality i.e. 2==2 evaluates to TRUE.

    A single equals can also be used for assignment e.g. a=2 so has multiple functions.

    I admit I haven’t evaluated whether Tableau has any difference in the way it differentiates between these two, and if there is it’s likely to be infinitesimal anyway. Probably one more for helping readability instead.

    • What “=”, “==”, or “===” evaluate to depends on the language. In the Tableau calculated field formula editor, “=” and “==” are equivalent. There is no assigning values to variables in Tableau formulas.

      I believe Tableau made them the same so users from from different environments would be more comfortable, “=” for users from Excel like backgrounds, and “==” for those with a programming background. So users can keep their convention and use the style that makes the most sense to them.

      I think it is false to claim that one is faster to evaluate than the other.

  8. My mentions around multiple CASE statements was prompted by workbooks I have been handed that use 50+ conditions on multiple string fields, in multiple custom calculations. If there are that many re-definitions to, do then a joining table in the data source is likely to be a much better option.

    No doubt that CASE is a very useful function and I make use of it all the time when there are simpler groupings to be made.

    Very happy to be corrected, I have made an edit to the article on the IF statement bit. Apologies to anyone who used the tips but didn’t get a result.

  9. Why shouldn’t you use “Show sheets as tabs”?

    • Eric,

      When you use this option, Tableau has to retreive the entire workbook from the repository, rather than just the specific sheet, thus more compute power is required which may not be necessary if you don’t use the other tabs.

  10. I see from your article that you do not recommend using custom SQL.

    What would you suggest as a best practice for extracting data from a Microsoft SQL Server?

    Would you suggest using stored procedures?

    Thank you,

    Connor Singerline

    • Hi Connor,

      The best option would be to utilise the multiple tables option in the Tableau data connection dialogue. Tableau will only use the tables it needs to return the data you have asked for (thru a method called join culling)

      If you can’t use the multiple tables then custom SQL with an extract would usually work OK. In those cases I prefer to write a database view and connect to that thru as the view is easier to maintain and document.

      Robin.

  11. There are some good points here. Regarding the equality operators ‘=’ and ‘==’, I don’t understand why you suggest Use == (double equals) rather than = (single equals). As far as I’ve been able to determine in reading Tableau documentation and doing my own experiments, the two operands are equivalent.

    • Phillip,

      You’re right, there is no difference there, that was a mistake. The article needs to be updated, thanks for the reminder :)

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>