Today I want to show you how I used R to complete a one-hot encoding cleaning process.
Before I kick off, I'll briefly explain one-hot encoding. One-hot encoding is a useful cleaning step for categorical data: each category in a column becomes its own column of 0s and 1s, so the data ends up fully numeric. After encoding, predictive analytics can be performed or, more importantly, Machine Learning can begin!
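To make the idea concrete, here is a minimal sketch in plain R (no extra packages) of what one-hot encoding does to a single column. The toy data and the column names colour and price are made up for illustration and are not from the workflow in this post.

```r
# Hypothetical toy data: one categorical column and one numeric column.
df <- data.frame(
  colour = factor(c("red", "green", "red", "blue")),
  price  = c(10, 12, 9, 15)
)

# Build one 0/1 indicator column per level of the factor.
indicators <- sapply(levels(df$colour), function(lv) as.integer(df$colour == lv))
colnames(indicators) <- paste0("colour_", levels(df$colour))

# Replace the original categorical column with its indicator columns.
encoded <- cbind(df["price"], indicators)
print(encoded)
```

Each row has exactly one 1 across the colour_ columns, which is where the name "one-hot" comes from.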
To do one-hot encoding in Alteryx, you will need a combination of a few workflows, which can be tedious with multiple categorical variables. However, there is a way to do one-hot encoding in one go, via R!
First, you will need the R tool in Alteryx.
Code, followed by a description of each step:
df <- read.Alteryx("#1", mode = "data.frame")
This reads the data from input #1 into a data frame, which can then be used in R.
library(mltools)
library(data.table)
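These are the R packages that need to be loaded. mltools is the machine learning tools package, which provides one_hot(), while data.table provides the data.table structure that one_hot() works on. data.table is a separate CRAN package rather than part of base R, although your R installation may already include it. If you are using mltools for the first time, you will need to install it from CRAN.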
install.packages("mltools", repos = "http://cran.rstudio.com")
Run this once and it will install mltools. After that, delete the line; otherwise the package will be reinstalled over the existing copy every time the workflow runs.
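If you would rather not edit the script after the first run, one option is to guard the install so that it only happens when the package is missing. This is a sketch, not part of the original workflow: ensure_package is a hypothetical helper name, and it assumes the R tool is allowed to write to the package library and reach CRAN.

```r
# Hypothetical helper: install a package only if it is not already available.
ensure_package <- function(pkg, repos = "http://cran.rstudio.com") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # Report (invisibly) whether the package can now be loaded.
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Installs mltools on the first run and does nothing on later runs.
ensure_package("mltools")
```

With a guard like this, the install line can stay in the script without triggering a reinstall each time.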
df1 <- one_hot(as.data.table(df))
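This one-hot encodes the categorical variables. By default, one_hot() acts on the unordered factor columns of the data.table, so categorical columns need to be factors for them to be encoded.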
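Outside Alteryx, the same step can be tried on a small table. The sketch below assumes mltools and data.table are installed, and uses made-up data with hypothetical column names. It also converts character columns to factors first, in case the categorical columns did not arrive as factors.

```r
library(data.table)
library(mltools)

# Hypothetical toy data with a character (not yet factor) categorical column.
dt <- data.table(
  colour = c("red", "green", "red"),
  price  = c(10, 12, 9)
)

# Convert every character column to a factor so one_hot() will pick it up.
char_cols <- names(dt)[sapply(dt, is.character)]
dt[, (char_cols) := lapply(.SD, as.factor), .SDcols = char_cols]

# One-hot encode all unordered factor columns; numeric columns are left alone.
encoded <- one_hot(dt)
print(encoded)
```

The numeric price column passes through unchanged, while colour is replaced by one indicator column per level.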
write.Alteryx(df1, 1)
Once the data has been one-hot encoded, this writes the result to output 1 of the R tool.
Data Before One-Hot Encoding