The Data

The challenge for Day 3 is to access the Star Wars API and create a dashboard.

Data Preparation

The workflow is a pretty standard API workflow. As the API returns URLs instead of names for a number of fields, I just had to use a few Find and Replace tools to fill in all the blanks.
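For readers who prefer code to a visual workflow, the same URL-to-name replacement can be sketched in Python. This is only an illustration, not the workflow I built: the helper names `build_lookup` and `resolve_urls` are made up for this example, and the sample rows are hand-written to resemble the API's responses rather than fetched live.

```python
def build_lookup(records, key="url", value="name"):
    """Map each record's URL to its display name."""
    return {r[key]: r[value] for r in records}


def resolve_urls(record, field, lookup):
    """Return a copy of `record` with the URL(s) in `field` replaced by names.

    Handles both a single URL string and a list of URLs. URLs that are
    missing from the lookup are left unchanged.
    """
    resolved = dict(record)
    raw = record.get(field)
    if isinstance(raw, list):
        resolved[field] = [lookup.get(u, u) for u in raw]
    elif raw is not None:
        resolved[field] = lookup.get(raw, raw)
    return resolved


# Illustrative sample rows (assumed shape, not live API output).
planets = [{"name": "Tatooine", "url": "https://swapi.dev/api/planets/1/"}]
species = [{"name": "Droid", "url": "https://swapi.dev/api/species/2/"}]
people = [
    {
        "name": "C-3PO",
        "homeworld": "https://swapi.dev/api/planets/1/",
        "species": ["https://swapi.dev/api/species/2/"],
    }
]

planet_names = build_lookup(planets)
species_names = build_lookup(species)

# Apply one "find and replace" pass per URL-valued field.
resolved_people = [
    resolve_urls(resolve_urls(p, "homeworld", planet_names), "species", species_names)
    for p in people
]
```

Each call to `resolve_urls` plays the role of one Find and Replace step: one lookup table per resource, applied to one field at a time.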

After poking around the API, I decided to use the People, Species and Planets resources for my dashboard. Both People and Species can be joined to Planets via the ‘Homeworld’ field, so I figured that would give me a fair bit of data to play around with. The cleanliness of the data certainly leaves much to be desired, so I had to do a fair bit of cleaning before the data was usable. Besides all the spelling mistakes, the lack of title case, and all the same-but-slightly-different entries, I also had to get rid of all the strings showing up in numerical columns. The Multi-Field Formula tool was a lifesaver for quickly purging all the “n/a’s” and “unknowns” in the dataset.
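The purge of text values from numerical columns can be sketched in Python as well. Again, this is a rough equivalent under my own assumptions rather than the actual formula I used: the set of placeholder strings, the helper names `to_number` and `clean_numeric_fields`, and the sample rows are all illustrative.

```python
# Placeholder strings treated as missing (an assumed list, extend as needed).
MISSING = {"", "n/a", "na", "none", "unknown"}


def to_number(value):
    """Convert a raw value to float, or None if it is missing or not numeric."""
    if value is None:
        return None
    # Normalise case and whitespace, and drop thousands separators like "1,358".
    text = str(value).strip().lower().replace(",", "")
    if text in MISSING:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def clean_numeric_fields(row, fields):
    """Return a copy of `row` with every listed field passed through to_number."""
    cleaned = dict(row)
    for field in fields:
        if field in cleaned:
            cleaned[field] = to_number(cleaned[field])
    return cleaned


# Illustrative rows (made up for this example).
rows = [
    {"name": "Example A", "height": "172", "mass": "77"},
    {"name": "Example B", "height": "unknown", "mass": "1,358"},
    {"name": "Example C", "height": "n/a", "mass": "N/A"},
]

cleaned_rows = [clean_numeric_fields(r, ["height", "mass"]) for r in rows]
```

Applying one rule across a list of fields is the same idea as pointing a single formula at several columns at once, which is what made the cleanup quick.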

The Dashboard

The biggest challenge I found was gathering all the custom shapes necessary to create the dashboard, then mapping them onto the corresponding dimension. Getting the custom shapes was fairly easy once I found out how to extract shapes from an existing workbook. Matching all 60 planets, on the other hand… I will concede that I gave up fairly quickly on that front.

Working on this dashboard revealed a lot of missing data in the API, such as the species of the Skywalker family or the height of C-3PO, much of which is readily available (with citations) on sites like Wookieepedia. The API has also not been updated with material from the new Star Wars canon, such as the sequel trilogy, standalone movies, and TV shows, although that is understandably a massive undertaking to parse through. Nonetheless, it’s still fairly interesting to explore what’s available, and there are many other details that could enrich each character’s fact sheet.

The Data School
Author: The Data School