Its Day four of Dashboard week and perhaps the hardest challenge yet. Today we were given data from various solar panel topics ranging from household solar generation to large scale energy production farms. The Catch – at least one external dataset had to be incorporated and Alteryx was not allowed to be used. Ill admit as a result it took me longer than normal to restructure the data to get it in the right format for my analysis. My advice when wrangling data in tableau would be the following:
Build crosstabs and then lock in values with LODs.
The danger of using data from different sources there can be differing level of detail. For example, the solar panels data I used had energy generation data for every five minutes, however the external data I found had monthly observations. Therefore, to compare them accurately I had to ‘roll up’ the energy generation data to a monthly level. First, I would view this in a crosstab where the rows would be months. Once I am sure the aggregation is correct, I would then lock in these values by creating an LOD calculated field. The new LOD fields mean that I can now refer to these aggregated values without needing to have the month in my view allowing for a more seamless analysis between the two datasets.
Know the difference between relationships and Joins
This is crucial to successful integrate different data connections. Personally, I always prefer relationships over join because it allows a more flexible data relationship. Relationship does not join the data until it knows the context of the analysis. When aggregating data between the two tables it will automatically pick which join to use to avoid issues such as duplicating data. Joins on the other hand will physically join the tables together in the data source which can be advantageous where a hard join is needed, or one table is used as a filter for another.
Write down formulas before implementing them in the Calculated field
When reading data straight into tableau there is a good chance that many new values and aggregations will need to be computed. When these values are then needed to calculate new values, it can get very confusing very quickly. By writing and working out all the formulas before-hand, it is much easier to work out which fields are needed where to calculate new values. In addition to this new fields should be very clearly labelled. This was critical for this challenge as there are many different electricity values and conversions that needed to be considered in the data.
Following the steps above I was able to build two dashboards looking at Solar energy generation around the World. For the final products, see the following links:
That’s a wrap for day four – Almost there!