In week 3 of training at The Data School, one of the topics we learned was APIs. Through APIs we can retrieve even more data than before! We dived into an API of Disney characters thinking it would be a straightforward retrieval. An in-and-out 10 minute adventure…. Like most things involving data wrangling that was not the case.

In this Part I blog post we will go through the data retrieving process.

 

What is an API?

Before we delve into the process, let me review what an API is for those not familiar with them.

An API is a standard code allows different programs/websites/apps to communicate with one another. Kind of like how English is an international standard for communications these days. Currently, the more common method of communication is JSON (pronounced like “Jason”). The API looks like normal HTML coding. But a deeper look and we can see it contains the data we need.

Unedited Disney character API

Depending on the API we may require a key/token to access that data. In the case of the Disney API, we did not require any keys or tokens so the data was ripe for the taking!

Expectations for parsing the JSON API

With the API URL in hand, we could begin taking that sweet, sweet data. The general steps for retrieving data via APIs involves 8 tools in Alteryx: Input, Download, JSON Parse, Text to Columns, Filter, Select, Crosstab, and Output.

Every API is different and we may need to order the tools differently or add tools to the workflow. And we quickly found that the Disney API was a unique beast.

 

Disney API and pages

Hidden within the Disney character API were URLs indicating that there were multiple pages – 149 of them. We now had to factor this into the workflow and paginate the URLs to feed the Download tool. We achieved this with two tools: Generate Rows and Formula.

The Generate Rows tool set at RowCount<=149

Formula adding the RowCount number to the end of the URL

 

 

 

 

 

 

 

 

Connecting the Download and JSON parse tools

After sorting that out, we can now download our Disney character data. Here are the steps:

  • Connect the Download  tool and JSON Parse tool to your workflow.
  • Adjust the settings of the Download tool to the “URL” column and the JSON Parse tool to “DownloadData” column.
  • You can hit Run to run the workflow and download the data.
  • Alternatively, when right clicking on the JSON Parse tool we can select an option to “Cache and Run Workflow”. (See picture below for further clarification)

In Part II, we will go through some other tools required in the workflow to manipulate the data into a more usable format.

Andrew Ho
Author: Andrew Ho

Andrew joins The Data School with more than 10 years of extensive customer service experience in hospitality and administration fields, even owning and operating his own coffee shop at one point. Having identified a newfound interest in Tableau, Andrew is now taking his skills into data analytics. Outside of work, Andrew enjoys spending his time pampering his dog and two cats. He enjoys and appreciates all kinds of food, so you can be rest assured that no food will go to waste.