Tableau for Excel Users – Part 1 – Recreating the pivot table
Many people start using Tableau with a great background in Excel. This is not surprising, surely Excel is the most commonly used data analysis tool in the business world today.
I intend this to be the first in a series of posts designed to help new Tableau users with a background in Excel.
Hopefully, these posts will help you take what you know about pivot tables, formula, conditional formatting and more and apply this knowledge to Tableau.
This first post will take us back to the very basics, both with pivot tables, and with Tableau. I will demonstrate how to connect to data, setup the structure of a pivot table, and then introduce data into that table – comparing the differences and similarities between Excel and Tableau as we go.
The examples files I use throughout the post are available for download here:
Step 1 – connecting to data
As always, the first step is to get some data, we’ll connect to the same data in Excel and then in Tableau.
Firstly, lets get started with a pivot table in Excel by choosing some data. I will use the superstore sales data that ships with Tableau – also in the Excel file above.
Once the file is open in Excel, I select the data in the ‘Orders’ tab – from the very bottom right cell of data to the very top left – and then simply choose ‘Insert | Pivot Table’.
After instructing Excel to create the pivot table on a new worksheet, my data connection is made and Excel shows me a blank sheet with areas to drop data fields to create columns, rows or data elements – much like the Tableau interface.
Tableau makes the data connection in a different way. Since the excel file and tableau files are separate, we need to make a connection between the two. Use the ‘Connect to Data’ button once you have opened Tableau.
Then choose ‘Excel’ as your file type, and then browse to the ‘Superstore’ data that shipped with Tableau.
This will be located in your ‘My Documents | My Tableau Repository | Datasources’ folder.
Choose ‘Orders’ to specify that we are interested in the data held on this sheet.
Choose ‘Connect Live’ to tell Tableau you would like updates to the spread sheet to be be reflected in your Tableau visualisations
Your data connection is made once you see the column heading appear in the data window – now you are ready to build your table of data and the screen looks remarkably similar to the familiar Excel interface.
Step 2 – Building the table structure & adding data
To ‘build the table structure’ means to define the rows and columns of the table – in both Excel and Tableau we do this by indicating that certain fields will be responsible for defining the rows, and others the columns. This typically means adding categorical data which ‘breaks down’ the numerical data which is aggregated.
To ‘add data’ means to add the numeric fields that we intend to aggregate. We have to define how to aggregate this data – should we SUM data, or AVERAGE it for example?
To define the rows, drag ‘State’ from the data window on the right into the drop zone on the left hand side of the pivot table – Excel responds by creating one row for each state which appears in the data.
Excel also creates a grand total row by default.
Now ‘Add some data’ by dragging the sales field into the centre of the pivot table, your pivot table should look like this.
Drag state to the rows shelf, or to the drop zone in the left hand side of the table, and drag product category to columns.
Drag ‘Sales’ to the centre of the table and you should see a view similar to this.
However, it’s worth discussing the differences here which make this process somewhat easier to handle in Tableau.
1. Tableau split your fields into Dimensions (categorical data) and Measures (numerical data that can be aggregated) when the data connection was made – making the process of finding fields easier.
2. Tableau’s has a number of default actions which can be triggered by double clicking – try double clicking ‘Sales’ once the categorical fields are in place – this brings sales into the table without the need to drag it.
3. Tableau provides nice formatting by default
4. Grand totals are not added by default in Tableau – to add these, go to ‘Analysis | Total – Show…’ from the menu bar.
Step 3 – Changing the way the data is aggregated
When using pivot tables or data visualisations in Tableau, we are usually AGGREGATING the data. Data may be aggregated by SUM’ing the data or AVERAGE’ing the data.
Both Excel and Tableau allow this to be changed easily.
The default aggregation type in Excel 2010 is to SUM the data. Thus our pivot table displays the sum of sales for the state and product category.
To display the AVERAGE value, simply access the menu as shown below:
To display the AVERAGE value, in Tableau, simply use the dropdown menu on the GREEN pill which is currently displayed on the TEXT shelf.
Step 4 – Introducing a date field
Now lets make a change to our tables. Substituting the ‘Product Category’ field for the order year field.
This seems difficult in Excel. The ‘Order Date’ field is available in the data window, but when this is added to the columns of the pivot table, Excel warns me that there are many different dates contained in this field and asks if I truly want to include them all.
The solution to this is as follows:
Go back to the data sheet (Orders) and add a new column containing a formula to determine the order year. In the spread sheet provided, I have added a field called ‘Order Year (the formula is ‘=YEAR(Order Date)’).
It is now necessary to instruct Excel to include this additional column in the data set being analysed – do this by changing the data source for the pivot table and then re-selecting the whole data set including the new column.
Once this field is available in the data set, the product category field can be removed from the ‘Columns’ area and replaced with the Order Year – creating the pivot table as laid out above.
Simply drag ‘Order Date’ from the data window to the columns shelf – Tableau responds by displaying the ‘Default date part’ – in this case the year.
The default date part is determined automatically by Tableau – if you data spans multiple years, the default part will be years, if it spans multiple months in a single year, Tableau will display months by default.
Step 5 – Customising your Tableau view
I have been demonstrating how to recreate the pivot table simply to make the transition from Excel to Tableau as simple as possible – but I’m guessing you’re not using Tableau just to create tables of data. Tableau can do WAY more that than.
So lets simply start the transition to charting data by making use of the SHOW me functionality.
Make sure show me is visible, then start selecting the various options to see Tableau display your data as a Map, bar, heat map, highlight table and many more.
Hopefully this article has helped take your knowledge of Excel pivot tables and convert it into Tableau-speak. This is the very start of your journey towards Tableau mastery, we will be explaining many other Tableau concepts in Excel terms in further posts in this series.