Introduction
For our first dataset of dashboard week, we were given the dreaded survey data, this time from the American Housing Survey. As usual, we had to try to remember the training we had received months earlier on how to tackle this kind of dataset.
Because most of the column headings are coded and the responses are numerically encoded, the first step was to work out what the questions and the responses actually mean. This requires joining the value labels to the survey responses. In the interest of time, I decided to work with the household responses only.
Data Wrangling with Alteryx
The first big blocker was figuring out what the column headings meant and what the responses meant. The value labels did not fully match up with the survey responses, meaning that many response values had no definition. Using the SAS file instead provided the much-needed question descriptions. Combined with the value descriptions, this gave us what we needed to get going.
Realizing that an hour had gone by without much progress, I decided that I needed to build a story around just a few fields. I started by removing all unneeded and duplicated fields. I then browsed through the descriptions of every survey question to see if there were any interesting relationships to explore. In the end, the fields on neighbourhood crime and the surrounding area caught my attention. I decided to compare property values across as many of these factors as I could find.
Using Alteryx, I removed the columns I didn’t need, transposed some of the remaining ones and joined them to the value labels Excel file. In the end, I had all the responses for the few fields I wanted, along with the market value of the properties.
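For readers without Alteryx, the transpose-and-join step can be sketched in plain Python. This is only an illustration of the idea, not the workflow I actually built: the field names (CRIME, VANDALISM, MARKET_VALUE), the codes and the labels are all made up for the example and are not taken from the survey files.

```python
# Hypothetical coded survey rows: one row per household, codes instead of text.
responses = [
    {"household_id": "H1", "CRIME": "1", "VANDALISM": "2", "MARKET_VALUE": 250000},
    {"household_id": "H2", "CRIME": "2", "VANDALISM": "9", "MARKET_VALUE": 310000},
]

# Hypothetical value-label lookup: (field, code) -> readable label.
value_labels = {
    ("CRIME", "1"): "Yes",
    ("CRIME", "2"): "No",
    ("VANDALISM", "1"): "Yes",
    ("VANDALISM", "2"): "No",
}


def to_long(rows, id_field, keep_fields, coded_fields):
    """Transpose wide rows into one record per (household, question)."""
    long_rows = []
    for row in rows:
        for field in coded_fields:
            record = {id_field: row[id_field], "question": field, "code": row[field]}
            for keep in keep_fields:
                record[keep] = row[keep]
            long_rows.append(record)
    return long_rows


def attach_labels(long_rows, labels):
    """Left join on (question, code); unmatched codes are kept and flagged."""
    labelled = []
    for record in long_rows:
        label = labels.get((record["question"], record["code"]))
        labelled.append({**record, "label": label if label is not None else "(no label)"})
    return labelled


long_rows = to_long(responses, "household_id", ["MARKET_VALUE"], ["CRIME", "VANDALISM"])
labelled = attach_labels(long_rows, value_labels)

for record in labelled:
    print(record)
```

Keeping unmatched codes and flagging them, rather than dropping them, makes it easy to see how many responses have no definition in the value labels.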
Crafting the Viz
The next step was to craft the Tableau visualization. I had to go back to Alteryx a few times to get the data into the right format for the dashboard I wanted to create. This included un-transposing Bedrooms and Bathrooms so that I could use them as filters. With everything in place, all that was left to do was to create some comparisons between neighbourhood perceptions and property value.
The last big blocker was figuring out how to create variety in my visualization. Most of the data was very basic, and given its structure it was hard to make anything other than bar graphs. To combat this, I decided to go with familiar report-style features such as BANs, small line graphs and a fun shape to top it off.
In the end, this was what I had:
The main takeaway was that the neighbourhood influences market value. Increased vandalism, crime and roach/rat sightings were associated with lower market values. The one exception was the perceived risk of disasters and floods, which was associated with a higher market valuation. The one thing I would have done differently is to consider excluding the outliers that skewed my data significantly (as can be seen in the bathrooms graph).
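As a rough idea of what excluding outliers could look like, here is a minimal Python sketch that uses the common 1.5 x IQR rule. I did not apply this to the dashboard; the rule, the threshold and the sample market values below are all assumptions chosen for illustration.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at k * IQR beyond the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(rows, field, k=1.5):
    """Keep only the rows whose value for `field` lies within the IQR fences."""
    low, high = iqr_bounds([row[field] for row in rows], k)
    return [row for row in rows if low <= row[field] <= high]


# Made-up market values; the last one is an extreme value that would skew an average.
homes = [
    {"market_value": value}
    for value in [180000, 210000, 230000, 250000, 260000, 290000, 310000, 5000000]
]

kept = drop_outliers(homes, "market_value")
print(len(homes), "homes before,", len(kept), "after")
```

Whether trimming like this is appropriate depends on the question being asked. Showing a median instead of an average would be another way to reduce the influence of extreme values without removing any responses.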
Overall, a good start to dashboard week, and a fun challenge! You can find the dashboard here.