Tips for checking your Alteryx workflow
Have you ever run your Alteryx workflow only to find that the data ouput does not make sense? Or worse still, an end user of the data has come back to point out that the numbers don’t add up? Alteryx will flag up where a tool fails (eg. if a referenced field does not exist), but there are many ways to make mistakes which still give an output without alteryx flagging an error.
Here are my tips to check your workflow before it goes into production:
Turn on connection progress
Click on the canvas, then change connection progress to Show (the default is Show only when Running). Run your workflow then zoom in until you can see the number of records passing between the tools. If there are any join tools which are not configured correctly (or the data was not as expected) then you may find that the number of records has increased where it should not be. If the number of records is suddenly reduced you will be able to pinpoint the filter or join tool where this is occurring.
Run and random sample of 10 records then check the previews
When working with large datasets it’s good practice to run a sample through your workflow to speed up testing. Clicking on the outputs of the tools will allow you to see the values for all fields. Never underestimate the value of using the human eye to check through the data, you may notice something that you wouldn’t even think of testing. For example I have spotted typos in a database using this method. Using a random sample rather than just sampling the first n means that you are less likely to get caught out with inconsistent data later in the dataset.
Check your joins
Click on a join tool to see how many records were output to left, join and right. Eg. Join (19): 12 records were joined with 0 un-joined left records and 0 un-joined right records. Click the R,J and L outputs to preview if you were not expecting data to be present in any of the nodes. If you have connection progress showing you can also see if the number of records has been multiplied.
Filter for null values
A formula tool may be outputting null values due to an incorrect calculation or unexpected data (eg. parsing a datestring in an inconsistent format). You could filter several fields at once using a custom filter isnull(Field1) or isnull(Field2) etc.
Check your totals and unique dimensions
Add a summarise tool to check unique string fields that are present in the data (eg. group by Country). Use the summarise tool to check totals at different points in your workflow eg. sum of sales.
Use the test tool
Your data preparation may have specific nuances that you can test. Eg. All values in a field should sit between x and y.