On the second day of dashboard week, we were given data from MSI (Maritime Safety Information).

The data can be downloaded in different formats, such as HTML and XML. The HTML format comes with spatial columns but is harder to parse than CSV. Luckily, the CSV file comes with coordinates in Degrees Minutes Seconds (DMS).
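For readers who want to see what the DMS conversion involves, it is simple arithmetic: decimal degrees = degrees + minutes / 60 + seconds / 3600, negated for the southern and western hemispheres. Below is a rough Python sketch of that idea, not the workflow I actually built. The exact DMS string layout in the MSI file is not shown here, so the pattern (something like `12°30'00"N`) and the function name `dms_to_decimal` are assumptions for illustration.

```python
import re

# Assumed layout: degrees, minutes, seconds, then a hemisphere letter,
# separated by any non-digit characters (e.g. 12°30'00"N or "12 30 00 N").
_DMS_PATTERN = re.compile(
    r"(?P<deg>\d+)\D+(?P<min>\d+)\D+(?P<sec>\d+(?:\.\d+)?)\D*(?P<hem>[NSEW])",
    re.IGNORECASE,
)


def dms_to_decimal(text: str) -> float:
    """Convert one DMS coordinate string to signed decimal degrees."""
    match = _DMS_PATTERN.search(text.strip())
    if match is None:
        raise ValueError(f"Unrecognised DMS value: {text!r}")

    degrees = int(match["deg"])
    minutes = int(match["min"])
    seconds = float(match["sec"])

    value = degrees + minutes / 60 + seconds / 3600

    # South and West are negative in decimal degrees.
    if match["hem"].upper() in ("S", "W"):
        value = -value
    return value
```

Calling `dms_to_decimal("12°30'00\"N")` gives `12.5`. Once latitude and longitude are in decimal degrees, most tools can turn them into spatial points directly.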

After cleaning the data in Alteryx, I had spatial points for the location of each incident. I then noticed two columns, ‘navArea’ (Navigation Area) and ‘subreg’, and spent a considerable amount of time trying to supplement my data with Nav Area information. However, there is limited information available, and I couldn’t finish this task in two hours.

Having wasted a bit of time looking for supplementary data, I then decided to just incorporate the battle deaths dataset from prio.org.

I think this is an interesting dataset, and I originally tried to join the spatial data with the battle deaths data. However, nothing in the battle deaths data can be used to create spatial data. I spent around another two hours trying to use the supplementary data as best I could.

Due to time constraints, I decided to aggregate the battle deaths data to the year level and simply compare it with the number of piracy incidents.
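The year-level comparison boils down to two aggregations and a join on year. Here is a minimal Python sketch of that logic, assuming each dataset has already been read into a list of dictionaries. The column names `year` and `deaths` are hypothetical placeholders, not the real field names in either dataset.

```python
from collections import Counter, defaultdict


def battle_deaths_by_year(rows):
    """Sum battle deaths per year. Assumes hypothetical 'year' and 'deaths' keys."""
    totals = defaultdict(int)
    for row in rows:
        totals[int(row["year"])] += int(row["deaths"])
    return dict(totals)


def incidents_by_year(rows):
    """Count piracy incidents per year. Assumes a hypothetical 'year' key."""
    return dict(Counter(int(row["year"]) for row in rows))


def compare_by_year(deaths, incidents):
    """Line the two yearly series up side by side.

    A year missing from one series gets None rather than 0, so that
    'no data' is not mistaken for 'nothing happened'.
    """
    years = sorted(set(deaths) | set(incidents))
    return [(year, deaths.get(year), incidents.get(year)) for year in years]
```

The result is one row per year with both measures, which is the shape needed for a simple side-by-side or dual-axis chart. It says nothing about where the battles happened, which is exactly the limitation of the dataset described above.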

I then grouped some of the incidents together based on their locations, forming five regions. Presented as small-multiple density maps, the five regions are a better representation than one complete map, as the latter has too many empty spaces with no piracy activity.

The biggest lesson I learned today is not to waste too much time looking for supplementary data. This is especially true during dashboard week. It is quite easy to fall into the trap of dwelling on the idea of an amazing dashboard with perfect supplementary data. For the rest of dashboard week after Tuesday, I believe I took this lesson into consideration and was more conscious of time constraints.

Author: Junya Wang