A quick guide to In-DB tools Part 1
In Alteryx we have In-Database tools. This allows users to connect directly to your database. The aim of this is that it lets you do ETL and calculations in the source system without moving the data out of it.
The In-DB tools let users visually build their SQL query, which is processed in the database. Therefore, when writing your formula or filter the data in the In-DB tools only the SQL syntax will work and not Alteryx-specific functions.
How do In-DB tools work?
The first thing a user does is to manage their connections to the database in question. Manage In-DB Connection lets the user edit, build and view existing / new connections:
Or you can do it this way:
Following this, you need to input the credentials of your connection and select the connection type if you are unsure about your credentials then talk to IT.
The Connect In-DB tool will always be the first tool you drag in your canvas. This is where you connect to the database and the table that you want to connect to and build your query. When you write out your SQL query it’s ideal to avoid doing:
The Browse Data In-DB tool: views the data and what’s going on in the workflow. This tool also caches the data. The biggest disadvantage of In-DB tool is that it doesn’t allow you to view what’s going on unless you have a browse tool. This is the only way to view the data in the workflow when utilizing In-DB tools.
The Write data In-DB tools: creates new tables, temporary tables and/or updates existing tables within the database, this is useful for when you Alteryx workflow goes on the scheduler.
Temporary Tables: what does this mean?
The Write data In-DB tools: you can create temporary tables to be written up in the database. Temporary tables are deleted at the end of the Alteryx run. A key thing to remember when creating temporary tables in a database is that these are not managed by Alteryx. This is set up by your database administrator / IT department. Alteryx creates temporary tables by leveraging what has already been set up in the database itself. For this reason, there isn’t a way to change where the temporary tables are being created from Alteryx.
In-DB vs Standard Tools
In-DB tools are faster than the standard Alteryx tools since In-DB tools can blend and analyse large datasets without the need for the data to move outside of the database, unlike the standard tools.
The standard Alteryx tools will bring the data over the network to your local computer. The In-DB Browse and Write Data In-DB will stream out your database and into your local machine, which is why the In-DB Browse tools can significantly slow down your workflow.
- Filter your data in the Connect In-DB tools or as early as possible
- Reduce the use of In-DB Browse tool this increases run time exponentially and decrease performance, ideally if you’re going to put your Alteryx workflow on scheduler I would recommend removing any unnecessary In-DB Browse tool (especially the In-DB browse tool).
- Avoid In-DB Stream out tool if possible
If you’re using IN-DB tools you’ll have moments when you’re wondering: “why is this so slow?” This is because IN-DB tools converts what is happening into SQL code the is being executed, you can view this SQL statement using the Dynamic Output In-DB tools and select the query option.
Benefits to using IN-DB tools:
- You can create hybrid workflow by streaming out this also gives you flexibility
Predictive In-DB tools
You can connect to predictive tools without streaming out however, this depends on the DB that you are connecting to:
Watch out for Part 2 where I’ll be covering how to split to rows using SQL and other In-DB quick tips.