Scientists follow a tried-and-true method for completing a project: we write an Introduction, outline our Methods, report our Results and finish with a Discussion to tie it all together.
Seems very… scientific, right?
I have already found this line of thought useful in day-to-day problem solving. Indoor plants keep dying? Look at what might be causing them to die, make some changes after a Google sesh, observe the results and think about why the changes did or didn't work.
But what about tackling a complicated data project? Luckily, the same process also works in data analytics!
Let’s take a quick look at what is involved in each step and how each one relates to data studies.
The Introduction mainly involves determining what is currently missing from the larger body of knowledge and working out exactly what our aim for the project is.
In science, we perform a “literature review” to determine what direction our potential research should take and what exactly we want to find out. We also define why this new information is important, which helps us establish the context in which it is relevant. We would also “hypothesise” a potential outcome of our research, allowing us to build an informative story.
In data analysis, we do exactly the same thing! We find out what is missing (or what we want to know), why it is important to know that, how it all fits together with the larger picture and what we might expect to find out.
In science, the Methods section basically tells the reader exactly what kind of test tube we used, what time it was when we cleaned it and even the brand of soap we used. This ensures other scientists can replicate our study perfectly.
In data analysis, we also define our methods based on what information needs to be found. We use various programs and technologies to achieve our goals, and we annotate our work in enough detail that another analyst can take up the reins and continue efficiently.
Scientists report their findings, whether or not they were successful at achieving their aims. They recite readings from machines, copy and paste output from workflows and present a graph or two (in black and white).
This is not so different in data analysis, where we also report what we found clearly and concisely. However, the major difference between science and data analysis (and the best one, in my opinion) is HOW we report our findings. It is our goal to make our results easy to interpret, attractive and informative to a variety of audiences.
An integral part of both science and data analysis is how we interpret our findings. We work out why they are important and how we can use that information to improve in the future. Once again, both fields achieve this goal, whether as a presentation or in written form.
In science, we refer to previously discovered facts and explain why ours are similar or different, why that is an important finding and what can be studied next to further this line of interest. Data analysts, meanwhile, may strive to instigate real-life changes based on interesting trends in the data, or to show how well a new initiative is working (or not working).
So what’s the point of all these comparisons? I find I work more efficiently when I take a structured approach to problem solving, and I was delighted to find that the scientific approach also applies to data analytics. Who knows, perhaps my little discovery might help other data analysts tackle new projects?