Introduction

To get more practice of connecting to API’s and a common use case of how you will manipulate your data for analysis we will now try to connect to The Movie Database (TMDb)

Example 2: The Movie Database (TMDb)

API documentation: https://developers.themoviedb.org/3

Most websites have a requirement to request an API key and this website is no different. I needed to make an account and after applying for an API, I received the following:

Let’s look at getting the top movies in terms of popularity. I will explain what is happening within the API documentation.

As the endpoint is /movie/popular, the URL to retrieve this info is https://api.themoviedb.org/3/movie/popular?api_key=c03435da915125deba6d01b33424969a

Note that we need to put the api_key after the “?”. “?” refers to the queries within the query string. Therefore, if we wanted to see another page of the data, the url will be:
https://api.themoviedb.org/3/movie/popular?api_key=c03435da915125deba6d01b33424969a&page=5

Finally, if the response we get from the API is 200, we will get a json format response with the following tables of data.

After json parsing the data, we will receive data in this form:

We must use a text to columns to split the “JSON_Name” by the delimiter “.” The number after “results.” Represents each individual movie ID

We will have now have to use a cross tab with the following settings

Here, you may notice that there are multiple genre fields and this will be difficult for tableau to do an analysis on. It is recommended to de-normalize the data and create a table separately for genre.

To do this, we add a simple genre filter in a step before the cross tabbing as such:

Here we notice the value string is a number and this number is a code for different genres. We need to call upon the genre api connection.

The link we need to download from is: https://api.themoviedb.org/3/genre/movie/list?api_key=c03435da915125deba6d01b33424969a

After a similar process of text to columns and cross-tabbing, we get the following result for our genre ID’s:

The Final Result

The final workflow is shown below:

After joining our genre ID’s with the movie list, we produce the following Genre’s Table:

And The movies table looks as such:

This is a common problem that occurs where API’s call from multiple endpoints. You can now import these 2 tables for your analysis.

Thank you for reading 😊

Nam Nguyen
Author: Nam Nguyen