
Dipping my toes into the Alteryx Python SDK

What I thought it was going to be:

Write some code, connect to an input & output, maybe allow for some configuration and, voilà, tool ready. Something similar to the R tools that I’ve seen around.

What it actually was like:

I started writing some code to get relatively simple stats for an unrelated problem. I tested my Python code, and it worked as expected. Once I had the snippet, I went to read the documentation for the Python SDK and… is that it? Fortunately, there was something else, but it looked more like a data dictionary than actual instructions. I like Alteryx because, among other reasons, it makes it easy to use all the tools (well, most of them; optimisation, I’m looking at you), and what I found did not meet my expectations. The most helpful posts I found (here and here) recommend downloading the sample tools built by Alteryx and modifying them to one’s heart’s content. So that’s what I did.

After a few hours of trying to get my tool to work, I took a couple of steps back and decided to create a minimal tool to start understanding all the requirements and oddities of the tool configuration. Once I have that, I will slowly move forward.

A Minimal Python Tool:

For this test, I chose to develop one tool with no input and no configuration that will return a single column and a single row with some text. Could it be any simpler?

What I am after: a very, very simple tool.

Overview:

The tool will need:

  1. A configuration file that will include the names of the connections, the anchor abbreviations, etc.
  2. A GUI config file: this is the configuration pane that all Alteryx tools have. It can be modified to accept free text, numbers, radio buttons, field selectors, etc.
  3. The python module:
    1. The code itself.
    2. Initializing the input and output connections.
    3. Moving records from upstream, through the tool and downstream.
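
To keep track of these pieces on disk, here is a small sketch that lists the file names the recipe below ends up with for a given tool name, and reports which of them are missing from a folder. The helper names (`expected_tool_files`, `missing_tool_files`) are made up for illustration; the four suffixes are the ones used in the recipe, and the sketch assumes the folder is named after the tool.

```python
from pathlib import Path

# Suffixes of the four files used in this post; each is prefixed by the tool name.
TOOL_FILE_SUFFIXES = ("Config.xml", "Gui.html", "Engine.py", "Icon.png")


def expected_tool_files(tool_name):
    """Return the file names expected inside the tool's folder."""
    return [tool_name + suffix for suffix in TOOL_FILE_SUFFIXES]


def missing_tool_files(tool_folder):
    """Return the expected file names that are not present in tool_folder."""
    folder = Path(tool_folder)
    return [
        name
        for name in expected_tool_files(folder.name)
        if not (folder / name).is_file()
    ]
```

Calling `missing_tool_files("0_PythonExample")` before copying the folder into Alteryx is a quick way to catch a file that was not renamed.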

Recipe:

  1. Choose a name and create a folder with that name.
  2. Copy all the files from the “INPUT” sample, dropping any language-specific files, and rename them so that “Python – Input” is replaced by your folder’s name. I am using this sample as the template because I do not intend to use any upstream connection. For example, I called the folder 0_PythonExample, so I now have four files whose names begin with that string.
  3. Edit ...Config.xml:
    1. Engine: replace “Python – InputEngine.py” with your folder’s name followed by Engine.py (in my case, 0_PythonExampleEngine.py).
    2. Gui settings: point the file and help references to your own files.
    3. For now, input connections (None) and output connections (one named “Output”) are OK for my needs.
    4. Change the properties to display useful information.
  4. Edit ...Gui.html:
    1. Update the title.
    2. Since this tool takes no input and no configuration, the fieldset section can be emptied.
  5. Edit ...Engine.py:
    1. The tool will not take any input, so all the methods of the class IncomingInterface can safely be left as “pass”.
    2. The main modifications will happen to the class AyxPlugin:
      1. __init__:
        1. Keep the #default properties.
        2. Drop the #custom properties.
        3. Add the text to be written to the output.
      2. pi_init: receives the configuration from the GUI. In this case, there is no configuration, so I only need to initialize the output. If the tool had any configuration, this is where the information would be collected from the XML string.
      3. pi_add_incoming_connection: nothing going on here since the tool does not require/accept upstream connections.
      4. pi_add_outgoing_connection: return True to accept the output connection.
      5. pi_push_all_records: this is the part that will push the records to the output stream.
        1. I am going to follow the approach in the Input sample and use a helper method to provide the layout of the output, that is, an empty table with the column definitions.
            1. The layout is a RecordInfo object. I am maintaining the naming convention and calling it “record_info_out”.
            2. Each field will be a new column.
            3. Here, I call the new column “NewText” and set it as a string of size 254.
            4. If I had multiple columns, each of them should be defined here.
        2. With the helper finalized, I can now go to the main method and get the layout.
        3. Then, initialize the output.
        4. And start creating the records. In this case there is a single record (row) with a single field (column). In the more general case, each record is prepared by populating all of its fields. Once all the values are ready, the record must be finalized and then pushed out to the downstream anchor. Once the first record is pushed, the second can be processed.
        5. Once all the records have been streamed out, we can send a message to the Alteryx interface with a summary.
        6. And, finally, close the output connections.
  6. Create an icon and save it as ...Icon.png
  7. Copy the whole folder into the Alteryx folder containing the HTML tools (e.g. C:\Program Files\Alteryx\bin\HtmlPlugins\).
  8. Close Alteryx and launch it again… Magic! The tool should now be available.
  9. Create a package (.yxi) to distribute the tool (instructions here and here for more detail).
    1. Download the installer for this tool and install it by double-clicking on it; it will appear in the “Laboratory” tools. Unzip the .yxi to explore the files.
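
Step 5 can be sketched end to end as follows. This is not the code shipped in the installer: the `Fake*` classes are hypothetical stand-ins that let the sketch run outside Alteryx, and the output text, the helper name `build_record_info_out` and the message wording are made up. Inside Alteryx the engine, the anchor manager and the RecordInfo object come from the SDK, so the exact arguments of calls such as `add_field` and `push_record` should be taken from the Input sample rather than from this sketch.

```python
# Hypothetical stand-ins so the sketch runs outside Alteryx.
class FakeField:
    def __init__(self, name, size):
        self.name, self.size = name, size
    def set_from_string(self, record_creator, value):
        record_creator.values[self.name] = value[:self.size]

class FakeRecordCreator:
    def __init__(self):
        self.values = {}
    def finalize_record(self):
        return dict(self.values)  # copy, so reset() does not alter pushed rows
    def reset(self):
        self.values.clear()

class FakeRecordInfo:
    def __init__(self):
        self.fields = []
    def add_field(self, name, field_type, size):
        self.fields.append(FakeField(name, size))
    def __getitem__(self, index):
        return self.fields[index]
    def construct_record_creator(self):
        return FakeRecordCreator()

class FakeOutputAnchor:
    def __init__(self):
        self.layout, self.records, self.closed = None, [], False
    def init(self, record_info):
        self.layout = record_info
    def push_record(self, record):
        self.records.append(record)
    def close(self):
        self.closed = True

class FakeAnchorManager:
    def __init__(self):
        self.anchor = FakeOutputAnchor()
    def get_output_anchor(self, name):
        return self.anchor

class FakeEngine:
    def __init__(self):
        self.messages = []
    def output_message(self, tool_id, message_type, text):
        self.messages.append(text)


class AyxPlugin:
    def __init__(self, n_tool_id, alteryx_engine, output_anchor_mgr):
        # Default properties
        self.n_tool_id = n_tool_id
        self.alteryx_engine = alteryx_engine
        self.output_anchor_mgr = output_anchor_mgr
        self.output_anchor = None
        # Text to be written to the output (made-up value)
        self.output_text = "Hello from Python"

    def pi_init(self, str_xml):
        # No configuration to read, so only the output anchor is initialized
        self.output_anchor = self.output_anchor_mgr.get_output_anchor("Output")

    def pi_add_incoming_connection(self, str_type, str_name):
        return None  # the tool accepts no upstream connections

    def pi_add_outgoing_connection(self, str_name):
        return True  # accept the output connection

    def build_record_info_out(self):
        # Layout of the output: one string column of size 254
        record_info_out = FakeRecordInfo()
        record_info_out.add_field("NewText", "string", 254)
        return record_info_out

    def pi_push_all_records(self, n_record_limit):
        record_info_out = self.build_record_info_out()
        self.output_anchor.init(record_info_out)
        record_creator = record_info_out.construct_record_creator()
        # Populate the single field, finalize the record and push it downstream
        record_info_out[0].set_from_string(record_creator, self.output_text)
        self.output_anchor.push_record(record_creator.finalize_record())
        record_creator.reset()  # ready for the next record, if there were one
        self.alteryx_engine.output_message(
            self.n_tool_id, "info", "1 record was written")
        self.output_anchor.close()
        return True


# Drive the plugin the way the engine would, in the order described above
engine, manager = FakeEngine(), FakeAnchorManager()
plugin = AyxPlugin(1, engine, manager)
plugin.pi_init("<Configuration />")
plugin.pi_add_outgoing_connection("Output")
plugin.pi_push_all_records(0)
print(manager.anchor.records)
```

Keeping the layout in its own helper means that adding a second column only touches `build_record_info_out` and the lines that populate the fields.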

 

This was a first approach to the Alteryx Python SDK. The tool itself is pretty useless, but it helped me understand the structure of the files and the AyxPlugin class.

In the next blog post, I will start adding some basic functionality (a GUI, more records, more fields, etc.).

 
