Yes, the point of this article is in the title. While it is not a particularly useful neural network, it does technically work. You can find the packaged workflow containing the neural network here. Before I continue, however, I feel compelled to include a short disclaimer.

There are many ways to implement artificial neural networks in Alteryx. Alteryx’s predictive tools suite includes a Neural Network tool that can apply a simple neural network. Alteryx also provides SDKs for developing complex ANNs that fit into Alteryx workflows: the Python and R SDKs give access to state-of-the-art machine learning libraries for building neural networks, such as PyTorch and TensorFlow. You can also use Alteryx-integrated machine learning tools such as H2O.ai to implement intelligent learning.

I mention these methods to highlight how redundant it is to build a neural network from scratch using core Alteryx functionality. Alteryx tools are not meant for building neural networks. Alteryx deals with tabular data, whereas building a neural network efficiently requires efficient handling of vectors and linear algebra, along with support for optimized algorithms. This is not a condemnation of Alteryx by any means – Alteryx is simply built for other tasks. As I have highlighted, Alteryx is more than capable of running and utilising neural networks. It is not, however, designed for building them from the ground up.

So Why Have I Done This?

Clearly there must be a catch. After all, even though I have outlined why you should not build an ANN in Alteryx, I have done just that myself. Essentially, the neural network I have built is a toy network. It runs extremely slowly compared to any other ANN tool and it does not perform particularly well. I built it as an exercise in two skills. Firstly, to cement my knowledge of the basics of neural networks (which I have been reading about). Secondly, as a challenge to stretch Alteryx’s capabilities and build something that Alteryx itself is not designed to build. As far as meeting these two goals, the neural network performs well (and I had fun building it).

How Does it Work?

This blog will not go into detail on how neural networks operate in general. I will use terms like “gradient descent” and “backpropagation” without fully explaining them. However, if you are interested in learning more about the inner workings of ANNs, I highly recommend this series of YouTube videos that explains them in simple terms. Other resources on ANNs are abundant online – a quick Google search will return a huge number of accessible articles.

What I will explain here is the purpose of my neural network and the ways I dealt with some of the challenges of building it in Alteryx. The aim of the neural network is the same as the one described in the above video: given images of hand-written digits, the job of the ANN is to classify which digit each image shows. This is the MNIST dataset – often described as image recognition 101 – and it is the most common image recognition dataset for beginners in machine learning.
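To make the task concrete, here is a minimal sketch of the classification problem in Python. Since the workflow later converts 8×8 images (rather than full 28×28 MNIST images), I am assuming a downsampled MNIST-style set like scikit-learn’s `load_digits` – that dataset choice is my illustration, not necessarily the workflow’s actual data source.

```python
# Load a small MNIST-style digits dataset: 8x8 grayscale images of the
# digits 0-9, which is the shape of task the Alteryx workflow tackles.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()
X, y = digits.data, digits.target   # X is already flattened: 64 features per image

# Hold out a test set so accuracy can be measured on unseen digits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
print(X.shape)   # each row is one 8x8 image flattened to 64 pixel values
```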

Challenges in Alteryx

My Alteryx approach was reasonably simple. Most of the complexity of building a neural network lies in the training step. To handle it, I wrote an iterative macro that applies the forward pass, backpropagation and gradient descent steps in each iteration. In doing this, every operation that would normally be computed with linear algebra was built from combinations of Join, Formula and Summarize tools. Below is a screenshot of the inside of the macro, for a sense of the complexity of this step.
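For readers more comfortable with code than with tool canvases, here is a rough numpy sketch of what one pass of that iterative macro computes: a forward pass, backpropagation, and a gradient-descent update. The layer sizes and the sigmoid/softmax choices are illustrative assumptions – the blog does not specify the exact architecture – and in the workflow each matrix product below is assembled from Join, Formula and Summarize tools instead.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((32, 64))              # a batch of flattened 8x8 images
y = rng.integers(0, 10, size=32)      # digit labels 0-9
T = np.eye(10)[y]                     # one-hot targets

W1 = rng.normal(0, 0.1, (64, 16))     # input -> hidden weights (assumed sizes)
W2 = rng.normal(0, 0.1, (16, 10))     # hidden -> output weights
lr = 0.01                             # the learning rate quoted later in the post

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_loss(X, T, W1, W2):
    """Forward pass: hidden activations, softmax outputs, cross-entropy loss."""
    H = sigmoid(X @ W1)
    Z = H @ W2
    P = np.exp(Z - Z.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)
    loss = -np.mean(np.sum(T * np.log(P + 1e-12), axis=1))
    return loss, H, P

before, H, P = forward_loss(X, T, W1, W2)

# Backpropagation: gradients of the loss with respect to each weight matrix.
dZ = (P - T) / len(X)
gW2 = H.T @ dZ
dH = dZ @ W2.T * H * (1 - H)          # sigmoid derivative applied elementwise
gW1 = X.T @ dH

# Gradient descent: step each weight matrix downhill.
W2 -= lr * gW2
W1 -= lr * gW1

after, _, _ = forward_loss(X, T, W1, W2)
print(before > after)   # one small step should reduce the loss
```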

Configuring the iterative macro was also a challenge. Since each iteration must train the weights produced by the previous iteration, I had to combine all of the macro’s inputs into a single format that it could pass through its iterative output. I did this using two macros, called “encode” and “decode”, that convert the data to and from a format compatible with the iterative macro.
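The idea behind those two macros can be sketched in a few lines: pack all the weight matrices into one flat record so they fit through a single iterative output, then split them back apart at the start of the next iteration. The function names and layout below are my own illustration, not the macros’ actual schema.

```python
import numpy as np

def encode(W1, W2):
    """Pack both weight matrices into one flat vector (one 'table')."""
    return np.concatenate([W1.ravel(), W2.ravel()])

def decode(flat, shape1=(64, 16), shape2=(16, 10)):
    """Split the flat vector back into the two weight matrices."""
    n1 = shape1[0] * shape1[1]
    W1 = flat[:n1].reshape(shape1)
    W2 = flat[n1:].reshape(shape2)
    return W1, W2

rng = np.random.default_rng(1)
W1, W2 = rng.random((64, 16)), rng.random((16, 10))
R1, R2 = decode(encode(W1, W2))
print(np.array_equal(W1, R1) and np.array_equal(W2, R2))  # the round trip is lossless
```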

This training stage was most of the work. I also created a “validation” macro that runs the forward pass of the ANN to generate predictions on a test set. Several other macros handle smaller tasks, such as creating epochs, initializing weights, and iteratively converting each 8×8 image into a 64-row column.
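That last flattening step is trivial in numpy but, in the workflow, has to be done iteratively with Alteryx tools. For reference, this is all it amounts to:

```python
import numpy as np

# Flatten an 8x8 pixel grid into a single 64-row column, one row per pixel.
image = np.arange(64).reshape(8, 8)   # a stand-in 8x8 image
column = image.reshape(64, 1)         # 64 rows, 1 column
print(column.shape)  # (64, 1)
```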

How Does it Perform?

This is the million-dollar question. Unfortunately, the answer is: very badly. Firstly, the neural network takes about 30 minutes to train for 5 epochs. For reference, Keras (a popular ANN API) can train the same ANN architecture for 100 epochs in a matter of seconds. It is a bleak comparison, but I knew at the outset of the project that the workflow would run slowly. Creating an optimized ANN was never the goal.

As far as accuracy goes, the neural network still performs poorly. With a learning rate of 0.01, trained over 5 epochs, it reaches a test-set accuracy of about 54%. For comparison, far simpler models such as logistic regression can easily exceed 90% accuracy on the same problem. However, I am still optimistic about this result. A random classifier would on average achieve 10% accuracy on this problem: if I guessed every prediction at random, I would still get about one in ten correct, since there are 10 possible digits. If a learning algorithm achieves significantly more than 10% accuracy, it can be said to be learning. And by that measure the network is a huge success – 54% is much higher than 10%! The ANN is learning; it just runs into several pitfalls along the way. This is also no surprise: it is the most basic model possible, with none of the features modern ANNs typically include to improve their effectiveness.
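The 10% baseline is easy to sanity-check with a quick simulation; the numbers below are simulated, not drawn from the workflow:

```python
import random

# Guessing uniformly among 10 digits should be right about one time in ten.
random.seed(0)
n = 100_000
labels = [random.randrange(10) for _ in range(n)]
guesses = [random.randrange(10) for _ in range(n)]
accuracy = sum(g == l for g, l in zip(guesses, labels)) / n
print(accuracy)   # close to 0.10
```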

Running the ANN for more epochs, which might be expected to help, unfortunately fails to improve the score. The validation accuracy actually decreases after a number of epochs, which suggests that gradient descent is overfitting the training set and perhaps settling into a poor local minimum. This could be improved by finding a better learning rate schedule or by using a more effective gradient descent algorithm. In fact, there are a myriad of methods that could improve how the neural network performs, but they are technical and difficult to implement in Alteryx. For now I will be content with having created a working neural network in Alteryx, even if it is not the most accurate.