Dashboard Week #04: National Hockey League (MoneyPuck)
This blog is part of the Dashboard Week series. If you want to know more about what Dashboard Week is, have a look at the first blog in the series.

Content warning: This blog is a lot more focused on calculations than visualisations.

 

After coming up with a somewhat unorthodox dashboard for Challenge #03, I was keeping an open mind coming into this challenge, while wondering whether I had anything left in the tank. Two more days to go!

 

The Challenge

A couple of weeks ago, we were asked if we had any dataset suggestions for Dashboard Week. Without any expectation of them being used (of course), I made two suggestions. One of them was data from the wonderful MoneyPuck.com website, on the National Hockey League (NHL), partly because of how much breadth and depth there was in the data, and partly because I’m pretty sure none of us trainees knew much about ice hockey, which makes it interesting while levelling the playing field. (P.S. I actually looked for data on curling first, but couldn’t find anything good.)

 

The Data

MoneyPuck.com is very organised with the data, and there wasn’t much cleaning required at all. The main challenges were: (1) understanding the sport and what the different metrics mean, and (2) understanding the granularity of the data, because it is quite complex, both in terms of the number of tables, and the row-level of each of those tables.

Given how little time we had, I knew I couldn’t afford to employ my usual deep-dive/rabbit-hole-jumping approach when I learn about a new sport or topic. I needed something quick, and I thought focusing solely on the goalies (or “goaltenders”, as they are officially known, which makes the position sound much more prosaic than it actually is) would help streamline things a little. How wrong I was! But more on that later.

As mentioned, the data were well organised, so there was very little cleaning and processing necessary. The only data augmentation I needed to do was to add in the full team name, as the data only had the 3-letter abbreviations in all the tables.
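As a rough illustration of that augmentation step, here is a minimal sketch of a lookup from abbreviation to full team name. The field names ("team", "team_name") and the three-entry lookup are assumptions for illustration only; the real lookup would need to cover every abbreviation that appears in the data.

```python
# Hypothetical lookup table; only three teams are shown for illustration.
TEAM_NAMES = {
    "BOS": "Boston Bruins",
    "MTL": "Montreal Canadiens",
    "TOR": "Toronto Maple Leafs",
}


def add_team_name(rows, lookup=TEAM_NAMES):
    """Return copies of the rows with a 'team_name' field added.

    Abbreviations missing from the lookup fall back to the abbreviation
    itself, so unmatched teams are easy to spot afterwards.
    """
    return [
        {**row, "team_name": lookup.get(row["team"], row["team"])}
        for row in rows
    ]


# Made-up rows, not taken from the MoneyPuck data.
rows = [{"team": "TOR", "season": 2021}, {"team": "XYZ", "season": 2021}]
named = add_team_name(rows)
```

Falling back to the abbreviation (rather than raising an error) is just one possible choice; it keeps the rows intact while making gaps in the lookup visible.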

Time to start visualising……

 

Calculations

……or maybe not.

After looking at the metrics available in the dataset for goalies, I wanted to focus on two key ones, i.e., expected number of goals conceded in a game (xGoals) and actual goals conceded (Goals), and the latter expressed as a proportion of the former (Goals/xGoals, where <1 is good and >1 is not).
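The Goals/xGoals ratio can be sketched in a few lines. The function name and the example numbers below are made up for illustration and are not taken from the dataset.

```python
def goals_to_xgoals(goals, x_goals):
    """Actual goals conceded as a proportion of expected goals conceded.

    Below 1 means the goalie conceded fewer goals than expected (good);
    above 1 means more than expected. Returns None when xGoals is not
    positive, since the ratio is undefined in that case.
    """
    if x_goals <= 0:
        return None
    return goals / x_goals


# Hypothetical goalie: 45 goals conceded against 50.0 expected goals.
ratio = goals_to_xgoals(goals=45, x_goals=50.0)
```

In this made-up case the ratio is 0.9, i.e. the goalie conceded 10% fewer goals than expected.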

That got me thinking: Is xGoals enough to control for how good the defensive line in front of the goalie is? What about the intangibles, such as the confidence and trust in a good defence? As such, I decided to make some calculations to take a team’s defensive ability into account.

Team Blocked Shots Rating

I first looked at blocked shots against. I surmised that a better defence would block a higher percentage of their opponents’ shots than other teams. Besides simply reducing the number of shots that get through to the goalie, this could also frustrate the opponents and force them to take more low-quality shots, while giving the goalie more confidence and allowing a more composed performance (as opposed to a goalie who is constantly under siege).

This percentage (“Blocked Shots %”) was calculated for each team in each season, and then scaled as a proportion of the best Blocked Shots % of their respective seasons. The result (“Blocked Shots Rating”) is an indicator that starts at 100 for the best team, and theoretically could go as low as 0 for a team with 0% blocked shots.
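A minimal sketch of the scaling described above, assuming Blocked Shots % is blocked shots divided by all shot attempts against (the exact denominator is an assumption here). The field names and the two made-up teams are hypothetical.

```python
from collections import defaultdict


def blocked_shots_rating(rows):
    """Scale each team's Blocked Shots % against the season's best.

    Each row is a dict with hypothetical keys 'season', 'team',
    'blocked' and 'attempts'. The best team in a season gets 100;
    a team that blocked nothing gets 0.
    """
    pct = {
        (r["season"], r["team"]): r["blocked"] / r["attempts"] for r in rows
    }
    # Best (highest) Blocked Shots % per season.
    best = defaultdict(float)
    for (season, _team), p in pct.items():
        best[season] = max(best[season], p)
    return {
        key: 100 * (p / best[key[0]]) if best[key[0]] else 0.0
        for key, p in pct.items()
    }


# Made-up numbers: team AAA blocks 30% of attempts, team BBB blocks 24%.
rows = [
    {"season": 2021, "team": "AAA", "blocked": 300, "attempts": 1000},
    {"season": 2021, "team": "BBB", "blocked": 240, "attempts": 1000},
]
blocked_rating = blocked_shots_rating(rows)
```

With these made-up inputs, AAA rates 100 and BBB rates roughly 80, because 24% is four fifths of the season-best 30%.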

Team High-danger Shots Rating

The shots taken against any team were classified into high-, medium-, and low-danger shots in the dataset. My hypothesis is that a good defence would make it harder for their opponents to take high-danger shots and, therefore, have a lower proportion of high-danger shots against to total shots against.

Similar to the Blocked Shots Rating, I first calculated a High-danger Shots % for each team in each season, then scaled those against the best (i.e., lowest) High-danger Shots % of their respective seasons. Where it differs is that I still wanted a high rating to indicate a good defensive team, so the resultant metric is called “High-danger Shots Reduction Rating”, calculated as 100 times the reciprocal of (High-danger Shots % / Best High-danger Shots %). It also starts at 100 for the best team, but it cannot fall to 0: a team which faced 100% high-danger shots would score the best team’s High-danger Shots % (for example, 20 if the best team faced 20% high-danger shots).
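The reciprocal scaling can be sketched as follows for a single season, assuming the "best" High-danger Shots % is the lowest one and the result is multiplied by 100. The function name, team codes, and numbers are hypothetical.

```python
def high_danger_reduction_rating(high_danger_pct):
    """Rate teams by how small their share of high-danger shots against is.

    `high_danger_pct` maps team -> High-danger Shots % (as a fraction)
    for one season. The rating is 100 * best / team, i.e. 100 times the
    reciprocal of (team % / best %), where best is the lowest value.
    """
    best = min(high_danger_pct.values())
    return {team: 100 * (best / p) for team, p in high_danger_pct.items()}


# Made-up season: AAA faces 20% high-danger shots, BBB faces 25%.
hd_rating = high_danger_reduction_rating({"AAA": 0.20, "BBB": 0.25})
```

Here AAA rates 100 and BBB rates roughly 80, so a higher number still means a better defence even though the underlying percentage is "lower is better".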

Team Defence Rating
Finally, to combine the two metrics above into an ultimate defence indicator without over-complicating things, I went with the simple approach of adding the Blocked Shots Rating to the High-danger Shots Reduction Rating as an intermediate score, then expressing it as a proportion of the best intermediate score for the season. The result, again, is a rating from 0 (theoretical-worst, but unlikely to be observed) to 100 (best possible, will be observed at least once a season).
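A minimal sketch of that combination for one season, using made-up ratings for two hypothetical teams:

```python
def team_defence_rating(blocked_rating, high_danger_rating):
    """Add the two ratings per team, then scale against the best sum.

    Both arguments map team -> rating for a single season. The team
    with the highest combined score gets exactly 100.
    """
    combined = {
        team: blocked_rating[team] + high_danger_rating[team]
        for team in blocked_rating
    }
    best = max(combined.values())
    return {team: 100 * (score / best) for team, score in combined.items()}


# Made-up inputs: AAA is best at blocking, BBB is best at limiting
# high-danger shots.
defence = team_defence_rating(
    {"AAA": 100.0, "BBB": 80.0},
    {"AAA": 90.0, "BBB": 100.0},
)
```

In this example AAA's intermediate score is 190 and BBB's is 180, so AAA rates 100 and BBB rates about 94.7. Note that a simple sum weights the two components equally.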

Augmenting Goalie Stats
After sorting out the team defence ratings, I wanted to add some depth to the analysis of the goalies’ individual performances. The data on Shots was exactly what I needed: a shot-by-shot record which detailed, among other things, the distance and angle from which a shot was taken. I decided to divide the shot angles into thirds, such that each shot was taken either from the left, middle, or right third of the rink. To keep things simpler, I only divided the shot distance into two halves: close-range and long-range shots. The result is a six-cell shot matrix (three angles by two distances) that would allow an analysis of a goalie’s strengths and weaknesses in facing shots from different angles and distances.
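The binning could look something like the sketch below. The cut-off values (30 degrees and 30 feet), the sign convention for left and right, and the function name are all assumptions for illustration; the actual thresholds used for the report are not stated here.

```python
from collections import Counter

# Assumed cut-offs, not taken from the original analysis.
ANGLE_CUT = 30.0     # degrees either side of centre
DISTANCE_CUT = 30.0  # feet from the net


def shot_zone(angle_deg, distance_ft):
    """Place a shot in one of six cells: (left/middle/right, close/long).

    Assumes negative angles are to the left and positive to the right.
    """
    if angle_deg < -ANGLE_CUT:
        side = "left"
    elif angle_deg > ANGLE_CUT:
        side = "right"
    else:
        side = "middle"
    depth = "close" if distance_ft <= DISTANCE_CUT else "long"
    return side, depth


# Made-up (angle, distance) pairs for four shots.
shots = [(-45.0, 12.0), (5.0, 48.0), (60.0, 20.0), (0.0, 10.0)]
zone_counts = Counter(shot_zone(angle, dist) for angle, dist in shots)
```

Counting goals and shots per cell for each goalie would then give a save rate for each of the six zones.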

 

Visualisations

Now onto the visualisations for real!
I wanted to keep things simple, so that the main focus is on the metrics I have created. To that end, I wanted to have a white background (like the rink) with multiple line charts for the different metrics. The main filter is for switching between goalies, but I also wanted to (1) allow people to filter by team, in case they know a team but not the goalie, and (2) let people see more info about a selected goalie, other than the stats.

The result is this report:

 

 

Author: J Tay