Humans are utterly fascinated by the idea of dying in a plane crash or accident. We consume a lot of non-fiction television on the topic, including most famously the show Air Crash Investigation, or Mayday in the United States and Canada. It also crops up constantly in fiction, such as with the Final Destination movie franchise.

Today, my Data School team was tasked with web scraping a database of plane crashes and then creating a dashboard from the information. This was quite a different ballgame from the previous two days, and not just because of the dark subject matter. I consider web scraping one of my weaker areas, which, coupled with the fact that even easy web scraping is an involved process compared to simply uploading Excel data, made for a difficult time in Alteryx. Nonetheless, I kept trying until I got exactly the dataset that I wanted.

All of those RegEx tools were necessary to parse the raw web scraped data… and to then parse the parsed data. For instance, the location information took forms such as “Berlin, Germany” or “45 miles west, Manila, Philippines”, which meant multiple extra steps to identify the format of each location string and then parse out the state/city and country correctly.
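For anyone who wants to see the location logic outside Alteryx, here is a rough Python sketch of the same idea. It is not my actual workflow: the function name `parse_location`, the `OFFSET` pattern, and the assumption that the country is always the last comma-separated part are all made up for illustration, and the real data has more formats than the two shown here.

```python
import re

# Assumed pattern for a leading distance/direction part such as "45 miles west".
# The units covered here (miles, km) are a guess, not taken from the real data.
OFFSET = re.compile(r"^\d+(?:\.\d+)?\s*(?:miles?|km)\b.*$", re.IGNORECASE)

def parse_location(raw):
    """Split a location string into offset, place (state/city) and country."""
    # Break on commas and drop empty pieces (e.g. from a trailing comma).
    parts = [p.strip() for p in raw.split(",") if p.strip()]

    # Step 1: identify the format. Does the string start with an offset?
    offset = None
    if parts and OFFSET.match(parts[0]):
        offset = parts.pop(0)

    # Step 2: assume the last remaining part is the country...
    country = parts.pop() if parts else None
    # ...and whatever is left in between is the state/city.
    place = ", ".join(parts) if parts else None

    return {"offset": offset, "place": place, "country": country}

print(parse_location("Berlin, Germany"))
print(parse_location("45 miles west, Manila, Philippines"))
```

The point is the two-step shape: first work out which format a string is in, then pull the pieces out by position. That is roughly what the chain of RegEx tools was doing.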

Once I got all the data I needed, I set to work producing the Tableau dashboard. Today, I took inspiration for my dashboard from a well-known YouTube video, The Fallen of World War II.

This video, which is effectively a series of animated graphs such as the one above, is also notable for having an interactive clone on the creator’s website. Seeing this video, I figured — why not create a dashboard with the same premise, with some interactivity but mainly focused on animation?

Ultimately, I came up with this dashboard:

To create it, I made use of the Pages card, which Data Schoolers typically ignore. When you press the play button as instructed, the dashboard automatically steps through all 100+ years of recorded air accidents, updating every graph and figure for each year.

You can play around with it here: Air Accident Fatalities 1908-2021 | Tableau Public

Overall, I’d consider today a strong success, although the running total graphs took me longer than I’d care to admit. It was one problem after another — getting the running totals to calculate properly was difficult on its own, but once they were working, I realized that the running totals of operators would disappear in years when the operator in question had no fatalities. I had to go back to Alteryx, cross-tab the data, fill out all the no-fatality years with “0” for every operator, then transpose the data back into its original format. I then had to repeat this process for countries. That was rough. But I’m pleased to announce that I’m still alive and kicking.
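If it helps to picture the zero-filling fix, here is a small Python sketch of the same idea. I did this in Alteryx with cross-tab and transpose, so this is an equivalent illustration rather than what I actually built; the function name `running_totals` and the `(operator, year, fatalities)` row layout are assumptions for the example.

```python
from itertools import accumulate

def running_totals(rows, first_year, last_year):
    """Return {operator: [(year, running_total), ...]} with no missing years.

    rows is an iterable of (operator, year, fatalities) tuples, and every
    year is assumed to fall between first_year and last_year inclusive.
    """
    years = range(first_year, last_year + 1)
    per_operator = {}
    for operator, year, fatalities in rows:
        # Give each operator a row for every year, starting at 0, so that
        # years without fatalities still exist in the data.
        per_year = per_operator.setdefault(operator, dict.fromkeys(years, 0))
        per_year[year] += fatalities

    # Dicts keep insertion order, so the values are already in year order.
    return {
        operator: list(zip(years, accumulate(per_year.values())))
        for operator, per_year in per_operator.items()
    }

# Made-up sample rows, purely for illustration.
sample = [("A", 2000, 5), ("A", 2002, 3), ("B", 2001, 2)]
for operator, series in running_totals(sample, 2000, 2002).items():
    print(operator, series)
```

Because every operator now has a row for every year, the running total simply carries forward through the quiet years instead of vanishing from the chart. The same approach works for countries by swapping the first field.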

Author: The Data School