Day four of dashboard week began with the usual presentation of the previous day's work. I presented my day two dashboard, because I had a shadow day on Wednesday and was out of the office for the entire day. After this we were given a new data set to begin it all again, this time from iNaturalist. This website logs photos of all sorts of species, uploaded by people from around the world. It was a big data set to deal with, containing over five million rows of observations.
Preparing the data
This data set did not need too much work. The challenge came from the size of the files: almost 10 GB of CSV files made joining in Alteryx very slow. So the first thing I did was perform the necessary joins and write the result out to a more convenient format, so that I could load it back into Alteryx more easily. To avoid blowing the data out to an even larger size, I chose to include only one photo for each observation. From here it was just a matter of cleaning up some of the dates and creating the spatial points from the given latitude and longitude.
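For readers without Alteryx, the same steps can be sketched in plain Python. This is only an illustration of the logic, not the workflow I actually built: the column names (`id`, `observed_on`, `latitude`, `longitude`, `observation_id`, `photo_url`), the date format, and the sample rows are all assumptions made up for the example, and the real iNaturalist export may differ.

```python
import csv
import io
from datetime import datetime

# Tiny made-up samples standing in for the real CSV files (hypothetical columns).
OBSERVATIONS_CSV = """id,observed_on,latitude,longitude
1,2019-03-14,51.5,-0.12
2,not a date,48.85,2.35
3,2019-03-15,,
"""

PHOTOS_CSV = """observation_id,photo_url
1,a.jpg
1,b.jpg
2,c.jpg
"""


def first_photo_per_observation(photo_rows):
    """Map each observation id to the first photo listed for it."""
    first = {}
    for row in photo_rows:
        # setdefault keeps the first value and ignores later photos
        first.setdefault(row["observation_id"], row["photo_url"])
    return first


def parse_date(text):
    """Return a date, or None when the text is not in YYYY-MM-DD form."""
    try:
        return datetime.strptime(text, "%Y-%m-%d").date()
    except ValueError:
        return None


def prepare(observation_rows, photo_rows):
    """Join observations to one photo each, clean dates, and build points."""
    photos = first_photo_per_observation(photo_rows)
    prepared = []
    for row in observation_rows:
        try:
            lat = float(row["latitude"])
            lon = float(row["longitude"])
        except ValueError:
            # Assumption: a row without usable coordinates cannot become a point
            continue
        prepared.append({
            "id": row["id"],
            "observed_on": parse_date(row["observed_on"]),
            "point": (lon, lat),  # (x, y) order, as spatial tools usually expect
            "photo_url": photos.get(row["id"]),
        })
    return prepared


rows = prepare(
    csv.DictReader(io.StringIO(OBSERVATIONS_CSV)),
    csv.DictReader(io.StringIO(PHOTOS_CSV)),
)
for r in rows:
    print(r)
```

Keeping only the first photo per observation means the join returns one row per observation, so the row count never grows beyond the number of observations.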
I found this data source the hardest of the week. That was not because of the complexity of the data, but because it lacks depth, since it is mostly just a collection of photos. I could have done a dashboard about the website and how much it is being used, but I wanted to focus more on the content, and produced the above dashboard.