Introduction

Today is our last day of dashboard week. Our task was to visualise diseases in the US at tract level, a small-area level of granularity. The idea I wanted to explore was whether the characteristics of an area or state contributed to the value of the measure in question, for example smoking or alcoholism. The data set is incredibly rich, providing us with two years (2016-2017) of data. However, a year-on-year analysis wasn’t possible because no question was asked in both years; each appears in only one or the other. The data can be found here.

Preparation

Nick Hills managed to get us the spatial file of the tracts so we could join it with our data set. This was valuable as it contained all the information I needed for my analysis.

To prepare the data I had to clean out the null values in the location and value fields. Some of the states were incorrectly spelt, so those needed correcting too. I used a spatial match to find which state each tract was located in. As far as I can see there weren’t any other data quality issues. I started with 810,103 records and, after filtering out the “bad” data, ended up with 759,187.
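For anyone who prefers to see the cleaning steps as code, here is a minimal sketch in Python. The field names (`state`, `location`, `value`), the misspellings and the sample rows are all hypothetical, made up for illustration, and the spatial match itself is left out.

```python
# Hypothetical spelling fixes; the real misspellings in the data may differ.
STATE_FIXES = {"Califronia": "California", "Texs": "Texas"}


def clean_records(records):
    """Drop rows with a missing location or value, then fix known state misspellings."""
    cleaned = []
    for row in records:
        # Skip rows where either of the two key fields is null.
        if row.get("location") is None or row.get("value") is None:
            continue
        state = row["state"].strip()
        # Replace the state name only if it is a known misspelling.
        cleaned.append({**row, "state": STATE_FIXES.get(state, state)})
    return cleaned


# Made-up sample rows, not taken from the real data set.
sample = [
    {"state": "Califronia", "location": (34.05, -118.24), "value": 12.3},
    {"state": "Texas", "location": None, "value": 8.1},
    {"state": "Texs", "location": (29.76, -95.37), "value": None},
    {"state": "Ohio", "location": (39.96, -83.00), "value": 15.0},
]

cleaned = clean_records(sample)
print(f"{len(sample)} records in, {len(cleaned)} records out")
```

Comparing the record count before and after, as in the last line, is a quick way to check how much data the filter removed.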

Visualisation

I had many different ideas for insights, but the one I ended up going with was the exploratory aspect of disease affliction and race by tract. It seems very simple to show bar charts, but we had only 5 hours to clean the data, create the visualisation and write a blog about it. Some of the health measures were not bad things, for example “Checkups, dentist appointments, etc.”, so I decided not to include them, as I wanted to focus on the bad aspects of healthcare in the USA.

This viz has a lot to explore, for example how the levels of health affliction differ between races and by how much. Going forward I would probably find a way to compare the different races side by side instead of manually clicking to see the difference. In this viz I also make use of animations, and I included a viz in tooltip in case it wasn’t clear where the locations are. The viz can be viewed here. This is the final blog for dashboard week. My experience of it on the whole has been great, almost like doing a Makeover Monday every day.

TL;DR

I used the 500 Cities data, supplemented with census data at the tract level, to gain insights into different areas. I found that lack of sleep was the highest health affliction on average.
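The "highest on average" comparison can be sketched in Python as below. This is only an illustration of the logic: the measure names, the values and the excluded list are assumptions, not figures from the real data.

```python
from collections import defaultdict

# Made-up sample rows; measure names and values are for illustration only.
rows = [
    {"measure": "Sleeping less than 7 hours", "value": 36.0},
    {"measure": "Sleeping less than 7 hours", "value": 40.0},
    {"measure": "Current smoking", "value": 18.0},
    {"measure": "Current smoking", "value": 22.0},
    {"measure": "Annual checkup", "value": 70.0},
]

# Measures that are not afflictions and so are left out (assumed name).
EXCLUDED = {"Annual checkup"}


def average_by_measure(rows, excluded=frozenset()):
    """Return the mean value per measure, skipping any excluded measures."""
    totals = defaultdict(lambda: [0.0, 0])  # measure -> [running sum, count]
    for row in rows:
        if row["measure"] in excluded:
            continue
        totals[row["measure"]][0] += row["value"]
        totals[row["measure"]][1] += 1
    return {measure: total / count for measure, (total, count) in totals.items()}


averages = average_by_measure(rows, EXCLUDED)
highest = max(averages, key=averages.get)
print(highest, averages[highest])
```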