How to recover a broken Tableau Server with no backup (2018.2+ Edition)
This is an update to a previous post I wrote, which described how to recover your data from a broken Tableau Server if you don’t have a backup to hand. Please refer to my previous post if you are running Tableau Server version 2018.1 or earlier. If you are running Tableau Server version 2018.2 or later, then keep reading.
As discussed in my previous post, this is a last-resort option. Please exhaust all other methods of getting yourself a backup before trying this. When I say backup, I mean a proper Tableau Server backup file (.tsbak) generated using TSM at the command line interface or Web UI. If you don’t know what I’m talking about then read this.
In this scenario, we’re going to assume that your server is in an unstable state, and that for whatever reason you are unable to use TSM to create a backup file. Your actual Tableau Server could still be running fine, but since TSM is decoupled from the Tableau Server processes and runs independently, it can be in a bad way even if the actual application is not. How could this possibly happen, you say? It could occur after a botched install or upgrade. Permissions might not have been set up correctly (did you know that the Tableau Server Run As user now needs to be a local administrator on the server machine?). Or something could be preventing one or more of the six TSM services from starting correctly.
So let’s say that for whatever reason this happens for real, and after trying all possible troubleshooting steps you are left with no choice but to attempt a manual recovery. Here’s what you do. One last caveat: I offer this advice with no support. What you do to your server is on you. If you do find yourself in the unenviable position of having to undertake this recovery and you don’t feel comfortable following these steps, then contact Tableau Support, and they will help walk you through this process. Better still, set up a separate throwaway server and practice the steps until you feel comfortable with them, then apply them in your actual environment.
1. Find and run the stop-administrative-services script to kill all the server processes. Note that this script only seems to exist in the Windows release of 2018.3, but it looks like it will be available in the 2019.1 Linux release, as it’s already in the beta version. You can find this script in: <tableau server directory>/packages/scripts.<version number>/
2. Monitor your server processes using Task Manager (Windows) or top (Linux) and wait for all the Tableau-related services to die off. By the end, your Tableau Server application should be completely stopped, allowing us to safely copy files.
3. Rename the Tableau application/data folders. Specifically, you want to rename the “Tableau Server” folders in C:\Program Files\Tableau\ and C:\ProgramData\Tableau, or if you installed to a separate drive, rename the original install folder location, e.g. D:\Program Files\Tableau\. On Linux, the default locations are /opt/tableau/ (for application files) and /var/opt/tableau (for data). Renaming these folders will preserve them when we run the obliterate script in the next step.
4. Run the tableau-server-obliterate script. It lives in the same scripts folder as the script in Step 1, which now sits under the renamed parent Tableau Server folder; go ahead and run it from there anyway. Make sure you run this manually from the command line (double-clicking on it won’t do anything) and be sure to follow the instructions to get it to execute properly. This will completely uninstall Tableau Server and remove any environment variables, but since we renamed our folders, our data is preserved.
5. Uninstall Tableau Server from Add/Remove Programs. If you’re on Windows, you might need to take this extra step to deregister Tableau Server from Windows.
6. Run the Tableau Server installer. Make sure you install the exact same version that you were running previously. Attempting an upgrade at the same time will fail. Don’t do it!
7. You should be prompted to select an install location. If you are not, or the installer tells you there is already another installation present, then go back and run the obliterate script again, properly this time. Select the location where you had originally installed Tableau Server. Do not select the renamed folders.
8. Choose to create a new Tableau Server installation.
9. Install the server with the same authentication method and service account as used previously. Be sure to specify the fully qualified domain name for your Run As User if it is a domain user. Also make sure that user is added explicitly to the local administrators group on your server.
10. Wait for the install to complete. Coffee time.
11. Once finished, you will be prompted to create a new admin account in the Tableau Server interface and log in. Wait and make sure all samples have finished publishing before continuing to the next step.
12. At this point, you should be able to access the TSM Web UI on port 8850 and use TSM at the command prompt. Stop the server using either method.
13. Now here comes the recovery: we are going to replace some files and folders in the new installation with the files and folders from the old, renamed installation. Go ahead and rename the “pgsql” and “data engine” folders in the new installation directory. You should find these in <Tableau Server data directory>/data/tabsvc/
14. Find those same folders in your original, renamed installation, and copy them over to the new location. We have now copied all our data over from the old, broken install. From this point onwards, we will work in the new install location only.
15. Modify your pg_hba.conf file. You’ll find this file in the following place: <Tableau Server data directory>/data/tabsvc/config/pgsql_<version>/pg_hba.conf
Change “md5” to “trust” for the user “tblwgadmin”:
from: host all tblwgadmin <address>/32 md5
to:   host all tblwgadmin <address>/32 trust
16. Repeat this for all the lines that contain the “tblwgadmin” user
17. Regenerate the internal security tokens:
tsm security regenerate-internal-tokens
tsm pending-changes apply
18. At this point, your server should have started, and you should be able to log in to your Tableau Server after a few minutes. If you find that some of the content on the server is missing from the web interface, then you’ll want to reindex the search engine:
tsm maintenance reindex-search
19. Confirm that the tsm commands (i.e. stop/start/status) all function as expected.
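If you are on Linux and would rather not hand-edit pg_hba.conf, the md5-to-trust change described above could be scripted roughly like this. Treat it as a sketch under assumptions, not a tested recovery tool: the function name relax_tblwgadmin_auth is made up for this post, the addresses in the demo file are invented, and the demo runs against a throwaway temp file instead of a real server. It only rewrites lines that mention tblwgadmin and end in md5, and it keeps a .bak copy of the original next to the file.

```shell
#!/bin/sh
# Sketch only: flip md5 to trust on every tblwgadmin line of a pg_hba.conf.
# relax_tblwgadmin_auth is a hypothetical helper name, not a Tableau tool.

relax_tblwgadmin_auth() {
    hba_file="$1"
    # Keep a copy of the original so the change can be reverted by hand.
    cp "$hba_file" "$hba_file.bak"
    # Touch only lines that mention tblwgadmin and end in md5;
    # every other line is passed through unchanged.
    sed '/tblwgadmin/ s/md5[[:space:]]*$/trust/' "$hba_file.bak" > "$hba_file"
}

# Demonstration on a throwaway file. The addresses below are invented.
demo_file="$(mktemp)"
cat > "$demo_file" <<'EOF'
host all tblwgadmin 10.0.0.5/32 md5
host all tblwgadmin ::1/128 md5
host all all 10.0.0.5/32 md5
EOF

relax_tblwgadmin_auth "$demo_file"
cat "$demo_file"
```

On a real server you would pass the path to your own pg_hba.conf instead of the demo file, and check the result by eye before regenerating the internal tokens.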
All finished? Server back up and running? Great, you can wipe that sweat off your brow! Now is a good time to do a ‘tsm maintenance backup’ and store that backup somewhere safe!
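P.S. For anyone on Linux who wants to script the folder swap (setting aside the fresh pgsql and data engine folders, then copying the old ones in), here is a rough sketch. Everything in it is an assumption to adapt: swap_in_old_data is a made-up helper name, the on-disk folder names are assumed to be pgsql and dataengine (check what your tabsvc directory actually contains), and the demo works on throwaway temp directories, not a real install. It copies instead of moving, so the old, renamed install stays intact if something goes wrong.

```shell
#!/bin/sh
# Sketch only: swap the old pgsql and dataengine folders into a fresh install.
# swap_in_old_data is a hypothetical helper; all paths come from the caller.

swap_in_old_data() {
    old_tabsvc="$1"   # data/tabsvc inside the renamed, broken install
    new_tabsvc="$2"   # data/tabsvc inside the fresh install
    # Folder names are assumed; verify them against your own install.
    for name in pgsql dataengine; do
        # Set the fresh folder aside instead of deleting it.
        mv "$new_tabsvc/$name" "$new_tabsvc/$name.fresh"
        # Copy (not move) so the old install remains as a fallback.
        cp -Rp "$old_tabsvc/$name" "$new_tabsvc/$name"
    done
}

# Demonstration against throwaway directories with marker files.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/old/pgsql" "$demo_root/old/dataengine" \
         "$demo_root/new/pgsql" "$demo_root/new/dataengine"
echo "old workgroup data" > "$demo_root/old/pgsql/marker.txt"
echo "old extracts" > "$demo_root/old/dataengine/marker.txt"

swap_in_old_data "$demo_root/old" "$demo_root/new"
cat "$demo_root/new/pgsql/marker.txt"
```

Only run something like this with the server stopped, and make sure the copied folders end up owned by the same account that the fresh install uses.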