Web scraping! This was the challenge we were given on day 4 of dashboard week. The website we needed to scrape was the CWUR world university rankings page. Our target was the table on that page, and we had to use it to build a dashboard entirely on Tableau Server.
Web scraping workflow
The table we want to scrape sits on just two lines of the page's HTML source, each of which contains half of the university records. The format and order of the attributes are the same for every university, which makes the scraping easier. Here is the start of the row for Harvard University as an example:
<td>1</td><td><a href="2021-22/Harvard-University.php">Harvard University</a></td>
Thanks to the consistent format, I was able to parse the table without relying heavily on Regex.
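To illustrate the idea, here is a minimal Python sketch that pulls the cells out of a row like the Harvard example without any Regex. It is not the workflow I actually built, just one way to do it under a few assumptions: the names `CellCollector` and `parse_cells` are made up for this example, and the sample string only contains the two cells shown above, while the real rows carry more columns.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell and the href of any link inside one."""

    def __init__(self):
        super().__init__()
        self.cells = []      # text content of each <td>, in document order
        self.links = []      # href of each <a> found inside a <td>
        self._in_td = False  # True while the parser is inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            self.cells.append("")  # start a new, empty cell
        elif tag == "a" and self._in_td:
            self.links.append(dict(attrs).get("href"))

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        # Append text to the current cell, ignoring text outside of cells
        if self._in_td:
            self.cells[-1] += data


def parse_cells(html_line):
    """Return (cell texts, link hrefs) for one line of table HTML."""
    collector = CellCollector()
    collector.feed(html_line)
    collector.close()
    return collector.cells, collector.links


# Sample row: only the two cells quoted in this post (an assumption for the demo)
sample = '<td>1</td><td><a href="2021-22/Harvard-University.php">Harvard University</a></td>'
cells, links = parse_cells(sample)
print(cells)  # ['1', 'Harvard University']
print(links)  # ['2021-22/Harvard-University.php']
```

Because every university follows the same cell order, the flat list of cells can then be cut into fixed-size groups, one group per university.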
My Dashboard and how to use it
You can use the dashboard to compare rankings across universities. The steps are as follows:
- Select countries you want to look at
- Adjust the score filter on the right to narrow down the universities
- Use the lollipop chart to observe scores, world rankings and country rankings.
- Select the universities you want to compare in the lollipop chart. A chart comparing the different types of ranking for the selected universities will then appear at the bottom.
Tableau server experience
I must admit that building dashboards entirely on Tableau Server is not a great experience. You often find that functions you rely on in Tableau Desktop are not available on the server. For example, when you right-click an axis, there is no format option. I look forward to a future in which Tableau Server has the same features as Tableau Desktop!
Click here to access the dashboard.