After four weeks of intensive training in Tableau and Alteryx, we started our first client project on 6th July. Until then, we had learnt the core concepts using synthetic data only.
Client projects, however, are a great opportunity to work with real-world data on real business-critical problems and to deliver solutions within a stringent timeline using agile methodology. At The Data School we use the Scrum framework.
On Monday morning we had a Zoom meeting with the client. We received the datasets, gathered the business requirements, and were briefed about the data. Then we formed a Scrum team and created our backlog. We kicked off the sprint with Sprint Planning, created a Scrum board and identified the sprint goal. It was an interesting project, working with survey data.
The data we received came at different levels of granularity and required cleansing before we could use it. My part of the project in particular relied on attendance data stored in Excel files with multiple sheets, which were not designed for data analysis. They had been created from multiple sources, so the data was inconsistent in several ways.
Given the complexity of the data, to analyse the content of these files we needed to find an efficient way to read them automatically and export the data to a format that we could use in Tableau. Tableau comes with an inbuilt data cleaning and preparation tool called Tableau Prep. Although Tableau Prep is a great tool, I couldn’t manage to fix the data discrepancies with it within the stipulated time. The next best option was to use Alteryx.
I have to mention here that preparing for the Alteryx Core certification helped me a lot to navigate easily through the tools during the data prep. (My recommendation is to complete your Alteryx Core Certification before your first project.)
Below is the workflow I created with Alteryx. It reads in a number of Excel sheets, then cleans the data and makes it more consistent to work with.
The main thing I want to explain is that the data is cleansed after each tool, and at the end of the workflow the separate streams are unioned to produce a nicely organised, consistent format ready to be analysed in Tableau.
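The workflow itself was built in Alteryx, but the same idea (clean each sheet on its own, then union the results) can be sketched in a few lines of Python. This is only an illustration and not the actual workflow: the sheet names, column names and sample values are hypothetical, and the sheets are represented as in-memory rows rather than read from the real Excel files.

```python
def normalise_header(name):
    """Trim a column name, lower-case it and use underscores as separators."""
    return name.strip().lower().replace("-", "_").replace(" ", "_")


def clean_sheet(rows, sheet_name):
    """Turn one sheet (a header row followed by data rows) into a list of dicts."""
    header = [normalise_header(h) for h in rows[0]]
    cleaned = []
    for row in rows[1:]:
        # Treat missing cells as empty strings and trim stray whitespace.
        values = [str(v).strip() if v is not None else "" for v in row]
        if not any(values):
            continue  # skip rows that are completely blank
        record = dict(zip(header, values))
        record["source_sheet"] = sheet_name  # remember where the row came from
        cleaned.append(record)
    return cleaned


def union_sheets(sheets):
    """Clean every sheet, then stack the rows under one shared set of columns."""
    records = []
    for name, rows in sheets.items():
        records.extend(clean_sheet(rows, name))

    # Collect all column names in the order they are first seen.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)

    # Fill columns that a sheet did not have with empty strings.
    return [{col: record.get(col, "") for col in columns} for record in records]


# Hypothetical attendance sheets with inconsistent headers and a blank row.
sheets = {
    "Term 1": [["Student Name", " Attended "], ["Ana", "Yes"], [None, None]],
    "Term 2": [["student-name", "Attended", "Notes"], [" Ben ", "No", "late"]],
}

combined = union_sheets(sheets)
for record in combined:
    print(record)
```

Each sheet is cleaned before the union, so differences in headers or blank rows are dealt with close to where they occur, and every row in the combined output ends up with the same columns.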
Key takeaway: in a real project, most of the data will lack consistency and will be spread across multiple files, sheets and documents. We need to analyse it and find a way to cleanse it without losing the critical data that will provide key insights to the client.
We had our sprint review on Friday afternoon, where we presented our findings to the client. They were impressed that we were able to use the complete data set and come up with critical data sets that gave them new insights into their survey data.
It was a hectic week with lots of learnings, not just about how to apply data analysis skills and tools, but also about how to manage time and deliver a product using agile methodology.
This week was such a great experience again.