Today I want to show you how I used R to one-hot encode a dataset in Alteryx.
Before I kick off, I’ll briefly explain one-hot encoding. One-hot encoding is a useful preparation step for categorical data: each category of a variable becomes its own column holding a 1 or a 0. After encoding, predictive analytics can be performed or, more importantly, machine learning can begin!
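
To make the idea concrete, here is a minimal sketch in base R on a tiny made-up table (the column names and values are invented for illustration and are not from the dataset used in this post). It uses model.matrix() rather than the mltools approach described later, simply because it needs no extra packages.

```r
# Toy data: one categorical column with three distinct levels (made-up values).
toy <- data.frame(
  id = 1:4,
  education = factor(c("Bachelors", "Masters", "HighSchool", "Masters"))
)

# "- 1" drops the intercept, so every level gets its own 0/1 indicator column.
indicators <- model.matrix(~ education - 1, data = toy)

# Put the id column back next to the indicator columns.
encoded <- cbind(toy["id"], as.data.frame(indicators))
print(encoded)
```

Each row ends up with exactly one 1 across the new education columns, which is where the name "one-hot" comes from.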

To do one-hot encoding in Alteryx, you will need a combination of a few workflows, which can be tedious with multiple categorical variables. However, there is a way to do one-hot encoding in one go, via R!

First, you will need the R tool in Alteryx.
Code and description:
df <- read.Alteryx("#1", mode = "data.frame")
This reads the data arriving at input #1 of the R tool into a data frame, which can then be used in R.

library(mltools)
library(data.table)
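These are the R packages that need to be loaded. mltools is the machine learning tools package, which provides the one_hot() function. data.table is not part of base R; it is a separate CRAN package that mltools depends on, so it is installed along with mltools if it is not already present. If you are using mltools for the first time, you will need to install it from CRAN.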

install.packages("mltools", repos = "http://cran.rstudio.com")
Run this once to install mltools. Afterwards, delete the line or comment it out; otherwise the package will be reinstalled, overwriting the existing copy, every time the workflow runs.
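
If you would rather not delete the install line by hand, one option is to install only when the package is missing. The sketch below assumes a small helper, ensure_installed(), which is a hypothetical name introduced here for illustration and is not part of Alteryx or mltools.

```r
# Hypothetical helper (not part of Alteryx or mltools):
# install a package only when it is not already available.
ensure_installed <- function(pkg, repos = "http://cran.rstudio.com") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # Report whether the package can be loaded now.
  invisible(requireNamespace(pkg, quietly = TRUE))
}

ensure_installed("mltools")
```

With this guard in place, the line can stay in the R tool: on later runs requireNamespace() finds the package and the install step is skipped.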

df1 <- one_hot(as.data.table(df))
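This one-hot encodes all the categorical variables in one call. Note that, by default, one_hot() only encodes columns stored as unordered factors, so any categorical field that arrives as plain text needs to be converted to a factor first.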
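
Whether your string fields arrive as factors or as character vectors is worth checking with str(df). The sketch below assumes they arrive as character vectors and converts them before encoding; the small data frame is made up to stand in for what read.Alteryx() would return, so it can be tried outside Alteryx.

```r
library(mltools)
library(data.table)

# Made-up stand-in for the data frame that read.Alteryx() would return.
df <- data.frame(
  age = c(39, 50, 38),
  education = c("Bachelors", "Masters", "Bachelors"),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor so one_hot() will encode it.
char_cols <- names(df)[sapply(df, is.character)]
df[char_cols] <- lapply(df[char_cols], factor)

# Encode all unordered factor columns; numeric columns are left as they are.
df1 <- one_hot(as.data.table(df))
print(names(df1))
```

The new columns are named after the original field and the level they represent, which makes it easy to trace each indicator back to its source field.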

write.Alteryx(df1, 1)
After one-hot encoding, this sends the result to output anchor “1” of the R tool.

Data Before One Hot Encoded

Data After One Hot Encoded

Notice that the number of fields has increased from 15 to 67. As an example, we can see that Education has been one-hot encoded, with one new field for each of its categories. I hope this has helped you learn how to use R for one-hot encoding!

Author: Anthony Wong