Consider the following scenario: students from two different schools complete the same exam, and the average scores for the two schools differ. But how do we know whether this difference is caused by the education standards of the schools or whether it is just natural variation? This type of problem can be tackled using a t-test.
So, what does this look like in Alteryx?
The Test of Means tool can be used to determine whether the means of two groups are statistically different. The configuration requires that the response measure (here, the exam score) and the group identifier (here, the school) be selected.
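For anyone curious about what a test like this works out behind the scenes, here is a rough Python sketch. It is only an illustration under a few assumptions: the scores are made up (they are not the data from my workflow), the function name `welch_t` is my own, and I am assuming a Welch-style t-test (one that does not assume equal variances), which may not match exactly what the tool does internally. The input mirrors the tool's configuration: each row has a group identifier and a response measure.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical exam scores in "long" format: one row per student,
# with a group identifier (school) and a response measure (score).
rows = [
    ("A", 72), ("A", 85), ("A", 78), ("A", 90), ("A", 81),
    ("B", 65), ("B", 70), ("B", 74), ("B", 68), ("B", 71),
]

def welch_t(rows):
    """Return (t statistic, approximate degrees of freedom) for two groups."""
    groups = {}
    for group, score in rows:
        groups.setdefault(group, []).append(score)
    if len(groups) != 2:
        raise ValueError("expected exactly two groups")
    a, b = groups.values()
    # Squared standard error contributed by each group: s^2 / n
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    # Difference in means measured in units of its standard error
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

t_stat, df = welch_t(rows)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

The t statistic says how many standard errors apart the two averages are. A statistics package then converts the t statistic and degrees of freedom into a p-value using the t distribution, a step this sketch leaves out.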
Looking at the animation above, we can see that there are two outputs: the R anchor, which is just a report, and the D anchor, which has a table of results. The figure we are most interested in is the p-value.
What is the p-value?
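The p-value is the chance of seeing a difference at least as large as the one observed if random variation alone were at work, i.e., if there were no real difference between the groups. A lower p-value means the observed difference is harder to explain by random chance alone, which points to some other factor being involved.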
In our example the p-value is 0.0004. This can be interpreted as follows: if there were no real difference between the two schools, natural variation alone would produce a gap in exam scores this large only about 0.04% of the time. By convention, a p-value less than 0.05 is treated as enough evidence to reject the hypothesis that random chance is the cause of the difference in means. So, in our example there is evidence that the difference in exam scores is not a result of randomness.
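One way to build intuition for the p-value is to simulate "random variation" directly. The sketch below is not what the Test of Means tool does; it is a simple permutation test that I am swapping in because it is easy to follow. The scores are made up, and the function name `permutation_p_value` and its parameters are my own. The idea is to shuffle the school labels many times and count how often the shuffled groups differ by at least as much as the real ones.

```python
import random
from statistics import mean

# Hypothetical scores, not the data behind the 0.0004 result in this post.
school_a = [72, 85, 78, 90, 81]
school_b = [65, 70, 74, 68, 71]

def permutation_p_value(a, b, n_shuffles=10_000, seed=0):
    """Estimate a two-sided p-value by shuffling the group labels."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        # Pretend the first len(a) scores came from school A.
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Share of shuffles with a gap at least as large as the real one
    return hits / n_shuffles

alpha = 0.05  # the conventional threshold mentioned above
p = permutation_p_value(school_a, school_b)
print(f"estimated p-value: {p:.4f}")
print("reject random chance" if p < alpha else "cannot rule out random chance")
```

If only a small share of the shuffles produce a gap as large as the real one, random chance becomes a poor explanation for the difference, which is the same reasoning the t-test p-value captures.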
Note that the p-value is only evidence for or against random variation; it alone is not proof that education standards differ between the schools. There could be confounding factors that influence the scores, such as age or exam conditions. This is why, in scientific experiments, scientists create testing groups that differ only in the variable they are testing.
For a more detailed case study, here is an article examining the gambling habits of Australians:
Well, that is it for my first blog. I will be writing more, so stay tuned and do not hesitate to reach out!