So, a word cloud is basically a way to show text data where the size of each word indicates how often it appears in the text. This technique is especially useful for free-form text, such as tweets, reviews, and open-ended survey responses.

Here are a few reasons why word clouds are dope:

  1. Visual representation: Word clouds turn text data into a picture. Because the most common words appear in a bigger font, it’s way easier to spot the main topics or themes within the text.
  2. Data exploration: Word clouds can be used as a tool for exploring data, making it easy to see patterns, trends, and relationships within your data. This can be especially useful for identifying key phrases or words that come up a lot in a big chunk of text.
  3. Communication: Word clouds can be used to communicate important info in a concise and visually appealing way. For example, a word cloud can be used in a presentation or report to sum up the main points or themes of a document.
  4. Marketing: Word clouds can be used in marketing research to figure out the most common words or phrases associated with a certain brand, product, or service. This can be helpful in making marketing strategies or campaigns that will really speak to your target audience.

Let’s use the Superstore data and Tableau to see which sub-category is ordered most often. To do that, follow these steps:

  • Add your text field to Text on the Marks card in Tableau. We are using Sub-Category for this example.
  • Add the fields that will define the size and the color of the text to Size and Color. We are using the count of Sub-Category, but other numerical fields such as Sales or Profit work too. If Tableau draws a treemap at this point, change the mark type from Automatic to Text.
  • You are done! Your word cloud is ready!
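
To sanity-check the numbers outside Tableau, the count that drives word size can be sketched in a few lines of Python. This is only an illustration: the sample rows are made up rather than taken from the Superstore data, the `font_size` helper and its 10 to 60 point range are assumptions, and Tableau's own size scaling may not be linear like this.

```python
from collections import Counter

# Hypothetical sample of Sub-Category values, one per order line
# (illustrative only, not real Superstore rows).
sub_categories = [
    "Binders", "Paper", "Binders", "Phones", "Paper",
    "Binders", "Chairs", "Phones", "Binders", "Paper",
]

# Count how often each sub-category appears. This plays the role of
# the count of Sub-Category placed on Size in Tableau.
counts = Counter(sub_categories)


def font_size(count, min_count, max_count, min_pt=10, max_pt=60):
    """Linearly scale a count to a font size in points (assumed range)."""
    if max_count == min_count:
        return max_pt
    share = (count - min_count) / (max_count - min_count)
    return round(min_pt + share * (max_pt - min_pt))


lowest, highest = min(counts.values()), max(counts.values())
sizes = {word: font_size(n, lowest, highest) for word, n in counts.items()}

# Print the words from most to least frequent with their sizes.
for word, n in counts.most_common():
    print(f"{word}: count={n}, size={sizes[word]}pt")
```

The most frequent sub-category gets the largest font and the least frequent gets the smallest, which is the same idea the word cloud shows visually.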

Please note that text data usually needs cleaning before it goes into a word cloud. If we were working with tweets, for example, we would clean them first. Here is a short list of steps to keep in mind while working with strings and word clouds:

  • applying lowercase
  • removing the numbers
  • removing links
  • removing punctuation
  • removing stop words
  • lemmatization (this allows us to combine similar words, for example ‘changing’ and ‘changed’ would both be replaced with ‘change’)
  • tokenization (in data analytics, tokenization is the process of breaking down a text document or sentence into individual units, or tokens, such as words or phrases)
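
The cleaning steps above can be sketched with nothing but the Python standard library. Treat this as a rough sketch under stated assumptions: `clean_text`, the tiny `STOP_WORDS` set, and the `LEMMAS` lookup are hypothetical stand-ins, and a real project would more likely lean on a proper stop-word list and lemmatizer from an NLP library.

```python
import re
import string

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "to", "of", "in", "we", "this"}

# Hand-written lookup standing in for a real lemmatizer (hypothetical).
LEMMAS = {"changing": "change", "changed": "change", "changes": "change"}


def clean_text(text):
    """Return a list of cleaned tokens ready to be counted for a word cloud."""
    text = text.lower()                                  # apply lowercase
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # remove links
    text = re.sub(r"\d+", " ", text)                     # remove numbers
    # Remove ASCII punctuation only; curly quotes and similar would need extra handling.
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()                                # tokenization on whitespace
    tokens = [t for t in tokens if t not in STOP_WORDS]  # remove stop words
    return [LEMMAS.get(t, t) for t in tokens]            # lookup-based lemmatization


tweet = "The plans are changing! We changed 3 prices, see https://example.com"
print(clean_text(tweet))
```

Links are removed before punctuation on purpose: once the dots and slashes are stripped, a URL no longer looks like a URL and would leak into the tokens.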

All in all, word clouds are a dope tool for analyzing and communicating text data in a way that’s easy on the eyes.

Author: Veronika Varaksina

Meet Veronika, a dynamic and adaptable individual with a diverse background in economics, accounting, finance, and data analytics. Veronika pursued a Bachelor’s degree in Economics and gained valuable experience in financial analysis, budgeting, and forecasting while working for five years in accounting and finance. However, she soon realized her passion for data analytics and decided to pursue a postgraduate degree in Analytics at Victoria University. Throughout her academic journey, Veronika honed her skills in data visualization, statistical modeling, and machine learning. Her expertise earned her a spot in the highly competitive Data School program, where she continues to expand her skills in data analysis.