Planes dropping out of the sky! Today’s challenge was to web scrape the database at http://www.planecrashinfo.com/database.htm, which spans many pages of fatal crashes across the history of aviation. This pleasant topic surely won’t induce any phobias. All jokes aside, let’s get into the dashboard-making process.

Coming Up With An Angle

After scraping and parsing the HTML from the webpages using Alteryx, I looked across the website at other pages of data, including things like celebrity aircraft deaths. One that caught my eye was the sabotage and hijacking category.

My initial approach was to web scrape this data and join it to the main crash data for the dashboard. I spent a lot of time parsing and joining to the main dataset, but I ran into issues that I didn’t have time to work through within the challenge. So I reluctantly changed tack and went with another idea.

My idea was to create a dashboard exploring the manufacturer data using a menu selection interface. Once a manufacturer is selected, a series of big numbers (BANs) and charts showing fatality statistics filter to that manufacturer.
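
The dashboard itself was built in Tableau, but the filtering logic behind the BANs can be sketched in a few lines of Python. This is only an illustration: the records, the function name bans_for and the three statistics chosen are assumptions, not the actual fields or figures from the dashboard.

```python
from statistics import mean

# Made-up records for illustration only: (manufacturer, fatalities in one crash).
CRASHES = [
    ("Boeing", 120),
    ("Boeing", 80),
    ("Douglas", 30),
    ("Douglas", 50),
    ("Douglas", 10),
]


def bans_for(manufacturer, crashes):
    """Return the headline numbers for a single selected manufacturer."""
    # Keep only the crashes that match the menu selection.
    deaths = [fatalities for name, fatalities in crashes if name == manufacturer]
    return {
        "crashes": len(deaths),
        "total_fatalities": sum(deaths),
        # Guard against an empty selection, where a mean is undefined.
        "avg_fatalities_per_crash": mean(deaths) if deaths else 0,
    }


if __name__ == "__main__":
    print(bans_for("Boeing", CRASHES))
```

Selecting a different manufacturer simply means calling bans_for with another name, which mirrors what the menu does when it filters every chart at once.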

Data Preparation

Using the parsed data from the web database, I extracted the manufacturer from the aircraft type column. When I brought this into Tableau, spelling errors in the dataset caused problems, so I grouped the manufacturers in Tableau.
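
I did these steps in Alteryx and Tableau, but for anyone who prefers code, here is a rough Python equivalent. It assumes the manufacturer is the first word of the aircraft type, which will not hold for every row. The alias table, the sample rows and the function names are all made up for illustration and are not taken from the real dataset.

```python
from collections import defaultdict

# Hypothetical alias table: maps misspelled or variant first words to one
# canonical manufacturer name. These entries are illustrative only.
ALIASES = {
    "boeing": "Boeing",
    "boing": "Boeing",
    "douglas": "Douglas",
    "douglass": "Douglas",
    "mcdonnell": "McDonnell Douglas",
}


def extract_manufacturer(aircraft_type):
    """Assume the first word of the aircraft type is the manufacturer."""
    words = aircraft_type.strip().split()
    if not words:
        return "Unknown"
    first = words[0]
    # Fall back to the raw first word when no alias is known.
    return ALIASES.get(first.lower(), first)


def fatalities_by_manufacturer(rows):
    """Sum fatalities per grouped manufacturer from (aircraft_type, fatalities) rows."""
    totals = defaultdict(int)
    for aircraft_type, fatalities in rows:
        totals[extract_manufacturer(aircraft_type)] += fatalities
    return dict(totals)


if __name__ == "__main__":
    # Made-up sample rows, including one deliberate misspelling.
    sample = [("Boeing 747-121", 10), ("Boing 707", 5), ("Douglas DC-3", 3)]
    print(fatalities_by_manufacturer(sample))
```

The alias table plays the same role as the manual groups in Tableau: every variant spelling has to be spotted and mapped by hand.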

Click to view on Tableau Public

Key Findings

Boeing had the highest number of fatalities, which is not unexpected given its popularity as a maker of large airliners. Boeing planes were also involved in the 9/11 terrorist attacks. We can also see that Boeing has a higher average number of deaths than Douglas.

Author: Frank Salmon