In the last blog post, we looked at customer review sentiment. Here is another question.

When we analyse customer reviews, how do we classify them? Reviews tend to fall into different categories, and knowing which category each review belongs to makes any later analysis more thorough.

In this blog post, I would like to show another Natural Language Processing tool in Alteryx: the Topic Modelling tool. To get better topics, the Text Pre-processing tool should be used first to filter out stop words and other unnecessary words, which makes the topics more precise.

Here are the two tools.

The configuration of the Text Pre-processing tool is like this.
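To make the idea of stop-word filtering concrete outside Alteryx, here is a minimal Python sketch. It is not what the Text Pre-processing tool runs internally; the stop-word list and the `preprocess` function are hypothetical and kept tiny on purpose.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# a real list would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "was", "is", "to", "of", "in", "at", "i", "it", "me", "very"}

def preprocess(review):
    """Lowercase the review, keep alphabetic tokens only, and drop stop words."""
    tokens = re.findall(r"[a-z]+", review.lower())
    return [t for t in tokens if t not in STOP_WORDS]

print(preprocess("The staff was very friendly, and the manager smiled at me!"))
```

Only the words that carry meaning are left, and those are what the topic model gets to work with.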

After that, when configuring the Topic Modelling tool, we need to choose how many topics we want. This setting can rarely be settled in one go.

We need to run the workflow several times, look at the report output from the R anchor each time, and check the collection of highly relevant words for each topic, until the topics look suitable.
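The check on each run boils down to reading the top words of every topic. A rough Python sketch of that step is below, assuming the word weights per topic are available as a dictionary; the weights and the `top_words` helper are invented for illustration and do not come from the Alteryx report.

```python
# Hypothetical word weights per topic, invented for illustration only.
topic_word_weights = {
    1: {"service": 0.09, "friendly": 0.07, "manager": 0.05, "bag": 0.01},
    2: {"experience": 0.08, "pay": 0.06, "machine": 0.05, "smile": 0.01},
}

def top_words(weights, n=3):
    """Return the n highest-weighted words, heaviest first."""
    return sorted(weights, key=weights.get, reverse=True)[:n]

# One short list of words per topic, ready to be judged by eye.
summary = {topic: top_words(w) for topic, w in topic_word_weights.items()}
for topic, words in summary.items():
    print(f"Topic {topic}: {', '.join(words)}")
```

If the top words of two topics look alike, or one topic mixes unrelated themes, that is a hint to try a different number of topics and run the workflow again.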

Here is an example of how I built the topic model.

This chart is the intertopic distance map. Each bubble on the chart is a topic.

That’s how we decide which topic each comment should go to. As we can see, topic 1 contains words such as service, people, friendly, tell, careful, look, manager, serve, and smile, which relate to the staff and people in the store. So, we label this topic Customer Service.

Topic 2 contains words such as experience, bad, machine, bag, check, cost, pay, music, replacement, and order. They are highly related to the environment in the store, so we label this topic Customer Experience.

Topic 3 contains words such as fruit, food, watermelon, tree, expensive, vegetable, plastic, pineapple, and Panadol, so it’s easy to label this topic Product Quality.
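Once the topics have names, attaching a name to each review is a simple lookup. The Python sketch below assumes every review comes with one score per topic and takes the highest-scoring topic; the review IDs, the scores, and the `label_review` function are all made up for illustration.

```python
# Labels chosen by reading the top words of each topic, as described above.
TOPIC_LABELS = {1: "Customer Service", 2: "Customer Experience", 3: "Product Quality"}

def label_review(topic_scores):
    """Return the label of the topic with the highest score for one review."""
    best_topic = max(topic_scores, key=topic_scores.get)
    return TOPIC_LABELS[best_topic]

# Hypothetical per-review topic scores (assumed input shape, invented values).
reviews = [
    {"review_id": "r1", "scores": {1: 0.71, 2: 0.19, 3: 0.10}},
    {"review_id": "r2", "scores": {1: 0.12, 2: 0.23, 3: 0.65}},
]

labelled = {r["review_id"]: label_review(r["scores"]) for r in reviews}
print(labelled)
```

Picking the single highest score is one possible design choice. A review that scores almost equally on two topics could instead be tagged with both.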

All right, that’s pretty much the approach to topic modelling. After it, we have several well-structured datasets that are ready for future visualisation.

Author: Chuck Wang