Random Adventures in Tableau
Before we get into the meat of the blog I wanted to give you a short test: see if you can guess where the data used in the visualisation below came from. I removed the axis labels to make it harder. I’ve also highlighted one series, but highlight another at random and see if you can work out the dataset…
[tableau server=”public.tableausoftware.com” workbook=”RandomAdventures-Part1″ view=”Dashboard” tabs=”no” toolbar=”yes” revert=”all” refresh=”no” linktarget=”” width=”600px” height=”620px”][/tableau]
Stocks and shares, right? Do you know which share that is, the one that’s dropping? Or that high flyer? Now hit F5 to refresh the page and watch the data…
I’m sorry to say that this was all randomly generated data, not stocks and shares; I made it all up. Each line started off at the same value, and I gave it 100 random movements (1 point up, 1 point down or no movement, all equally likely) before I showed it to you on the chart. Want to check? Here are the axes and the full chart (the chart above starts at x = 100).
[tableau server=”public.tableausoftware.com” workbook=”RandomAdventures-Part1b” view=”Dashboard” tabs=”no” toolbar=”yes” revert=”all” refresh=”no” linktarget=”” width=”600px” height=”620px”][/tableau]
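If you would rather check the claim outside Tableau, here is a minimal Python sketch of the same idea. It is not the workbook's method: it assumes Python's built-in `random` module as a stand-in for the Tableau calculation, and the names `random_series` and `simulate` are purely illustrative.

```python
import random


def random_series(steps, rng, start=0):
    """Build one line: each step moves +1, -1 or 0 with equal chance."""
    values = [start]
    for _ in range(steps):
        values.append(values[-1] + rng.choice((-1, 0, 1)))
    return values


def simulate(n_series, steps, seed=None):
    """Generate several lines that all start at the same value."""
    rng = random.Random(seed)  # a fixed seed makes the run repeatable
    return [random_series(steps, rng) for _ in range(n_series)]


lines = simulate(n_series=5, steps=200, seed=1)
for number, line in enumerate(lines):
    # Position 100 is where the first chart picks up; -1 is the final value.
    print(number, line[100], line[-1])
```

Run it a few times without a seed and the lines wander apart just as they do in the chart.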
So next time you’re telling yourself you’re onto a sure-fire stock market winner, or a “can’t lose” streak at the roulette / blackjack table (Vegas TCC 15 anyone?), just double check that you’re not looking at random movement. I find it amazing how different each of these lines of data is after just 100 generations of randomness.
It’s randomness in Tableau, and specifically generating random numbers, that I want to explore in this post.
Why Generate Random Numbers in Tableau?
Before we get into the HOW, let’s explore the WHY. The main reason for introducing randomness into the dataset might be to “jitter” data-points in the view. Steve Wexler of Data Revelations has already written on this subject, and I recommend his excellent article to see details of one approach. However, where using INDEX() isn’t appropriate, another approach might be to use a random number. We’ll visit one particular use case later in this article.
Secondly, you may wish to model processes that involve chance, in which case random numbers are the obvious approach.
Thirdly, you may just want to have fun and, say, build a blackjack game in Tableau.
How do you generate Random Numbers in Tableau?
Aside from using RAWSQL functions or SCRIPT functions to call out to SQL/JET and R respectively, there is no function that will bring a random number into your Tableau workbook. Instead you’re going to have to build a random number generator yourself, such as a linear congruential generator (LCG). These are pseudo-random number generators that are incredibly simple: each value is a linear function of the previous one, taken modulo a fixed number.
You can read about LCGs here, and I will also show you an implementation stolen from the Tableau genius that is Joshua Milligan (author of the Blackjack game referenced above; I advise you to check out his Tableau Public visualisations).
The actual random number calculation is recursive, so the calculation takes its own previous value as one of the inputs, using the table calculation PREVIOUS_VALUE:
Random Number (one method – many variants involving different values exist)
((PREVIOUS_VALUE(MIN([Seed])) * 1140671485 + 12820163) % (2^24))
To create a Random Integer
INT([Random Number] / (2^24) * [Random Upper Limit]) + 1
[Random Upper Limit] is a parametrized upper limit for the calculation.
To create a Seed
DATEPART('second', NOW()) * DATEPART('minute', NOW()) * DATEPART('hour', NOW()) * DATEPART('day', NOW())
In the calculations above, [Seed] could be anything to start off the series, but I have chosen a seed based on the current date and time, which gives a different random number series each time. I could instead have used a fixed number, or a parameter to give the user control of the series. If we follow this route the same [Seed] will generate the same series of random numbers, allowing repeatability.
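To see the three calculations working together outside Tableau, here is a minimal Python sketch. The function names (`time_seed`, `lcg_series`, `random_integer`) are illustrative labels, not names from the workbook, and the loop relies on PREVIOUS_VALUE returning its argument (the seed) on the first row of the partition.

```python
from datetime import datetime

MULTIPLIER = 1140671485
INCREMENT = 12820163
MODULUS = 2 ** 24


def time_seed(now):
    """Mirror the DATEPART product: second * minute * hour * day."""
    return now.second * now.minute * now.hour * now.day


def lcg_series(seed, count):
    """Return `count` values; each one feeds the next, like PREVIOUS_VALUE."""
    values = []
    previous = seed  # first row: PREVIOUS_VALUE falls back to the seed
    for _ in range(count):
        previous = (previous * MULTIPLIER + INCREMENT) % MODULUS
        values.append(previous)
    return values


def random_integer(random_number, upper_limit):
    """Scale a raw value in [0, 2^24) to an integer from 1 to upper_limit."""
    return int(random_number / MODULUS * upper_limit) + 1


seed = time_seed(datetime.now())
rolls = [random_integer(n, 6) for n in lcg_series(seed, 10)]
print(seed, rolls)

# A fixed seed reproduces the same series every time.
print(lcg_series(42, 5) == lcg_series(42, 5))
```

One thing the sketch makes easy to spot: the date-and-time product is 0 whenever the second, minute or hour is 0, so that particular seed turns up more often than any other.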
Implementing a Random Number for “Jittering”
To show you how to implement jittering I want to return to an old post of mine, Health Check your Data using Alteryx and Tableau. In that post I showed a “DNA” profile of data; I now want to show you an alternative method using jittering. In this case using INDEX() wasn’t appropriate, as the row number was quite possibly related to the data type, so I used a random number.
Though here I used another version of the LCG formula (just for fun):
I also created a seed and integer version as I detailed above, then I added the integer version as a continuous row after my existing column names. Hiding the axis of this “jitter” row then left me with what I needed (after some formatting) – click below to see the jittered result. Reverse engineer the viz to see the exact details (this is a highly recommended way of learning).
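The jitter idea itself is small enough to sketch in Python. This is only an illustration under stated assumptions: it uses Python's built-in `random` module in place of the LCG calculation, and the `marks` list is invented sample data, not the dataset from the Health Check post.

```python
import random


def jitter_positions(n_marks, upper_limit=100, seed=None):
    """Give each mark a random integer position from 1 to upper_limit."""
    rng = random.Random(seed)
    return [rng.randint(1, upper_limit) for _ in range(n_marks)]


# Hypothetical marks: (column name, data type) pairs made up for this example.
marks = [
    ("order_id", "number"),
    ("order_id", "text"),
    ("customer", "text"),
    ("customer", "text"),
]

positions = jitter_positions(len(marks), seed=2024)
for (column, data_type), y in zip(marks, positions):
    # Plotting y on a hidden axis spreads out marks that would otherwise overlap.
    print(column, data_type, y)
```

Because the position comes from a random draw rather than the row number, it carries no relationship to the order of the underlying rows, which is the reason for avoiding INDEX() here.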
More Random Adventures
Doing random stuff in Tableau is how I get my kicks, so to add a bit more randomness to this post here’s a video of a visualisation I built that I call “Tableau Life” – I’d like to think this is how new features get propagated in Tableau 🙂
As a bit of fun while you’re watching this, try to guess how many rows of data were used in making this visualisation – answer at the bottom of this blog.
Building this visualisation was a challenge and fun, but perhaps a little complicated to explain as part of this post. It’s probably enough to say I took inspiration from the inspirational Noah Salvaterra and his amazing fractal images in Tableau. Take a look at his blog post to explore his methods; mine are fairly similar (if less advanced).
My approach was to create two sets of random numbers to check for + and – movement (or none) on each axis (equal chance of each one), then to iterate across an X value – for the path – and a Y value – for each “node”. The result was a set of random walks, which you can play with and recreate here:
[tableau server=”public.tableausoftware.com” workbook=”RandomAdventures-Part2″ view=”Dashboard” tabs=”no” toolbar=”yes” revert=”all” refresh=”no” linktarget=”” width=”600px” height=”820px”][/tableau]
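The random walk logic can be sketched in a few lines of Python. This is a reading of the description above rather than a translation of the workbook: it assumes each path is a sequence of nodes, that every node moves -1, 0 or +1 on each axis with equal chance, and it uses Python's `random` module instead of the Tableau calculations. The names `random_walk_2d` and `build_walks` are illustrative.

```python
import random


def random_walk_2d(n_nodes, rng):
    """Return one path of n_nodes (x, y) points starting at the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for _ in range(n_nodes - 1):
        x += rng.choice((-1, 0, 1))  # first set of random numbers: x axis
        y += rng.choice((-1, 0, 1))  # second set of random numbers: y axis
        path.append((x, y))
    return path


def build_walks(n_paths, n_nodes, seed=None):
    """Generate several independent paths from one random source."""
    rng = random.Random(seed)
    return [random_walk_2d(n_nodes, rng) for _ in range(n_paths)]


walks = build_walks(n_paths=3, n_nodes=50, seed=7)
for number, path in enumerate(walks):
    print(number, path[-1])  # where each walk ended up
```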
How many rows of data? The answer is only 2! Here’s the dataset I used to create both random datasets in this post – the random “stocks and shares” and the “random walk”.
If I’ve just blown your mind then I suggest you read Noah’s post, download my workbooks and reverse engineer them. Welcome to the world of Tableau – the rabbit hole just got deeper 🙂 I have left this unexplained as it is a little off the beaten track for most Tableau users, so if you have any questions please tweet me @ChrisLuv or comment below.