In: Statistics and Probability
Explain why we must verify whether or not the assumptions of an inferential statistical test are met before we calculate the statistic. Specifically, what does a failure to meet the assumptions mean in terms of the α level of our experiment? What should we do if the assumptions are not met?
In inferential statistics our main objective is to estimate unknown parameters. To do this we need a rule, that is, a mapping from the sample space to the parameter space; the function of the sample observations used to estimate an unknown parameter is the statistic T(X). Because T(X) is a function of the sample observations, it is itself a random variable and therefore has a probability distribution of its own (its sampling distribution).
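A small simulation can make this concrete. The sketch below (in Python, with an illustrative population I have assumed — Normal with mean 10 and standard deviation 2, sample size 30) repeatedly computes the statistic T(X) = sample mean and shows that its realizations themselves scatter like a random variable:

```python
import numpy as np

# Illustrative sketch: the statistic T(X) = sample mean is itself
# a random variable with its own sampling distribution.
rng = np.random.default_rng(0)

# Draw 5000 samples of size n = 30 from an assumed population
# (Normal, mean 10, sd 2 -- chosen only for illustration).
samples = rng.normal(loc=10.0, scale=2.0, size=(5000, 30))

# Compute T(X) for each sample: 5000 realizations of the statistic.
t_of_x = samples.mean(axis=1)

# The realizations vary around the true parameter value 10, with
# spread close to the theoretical standard error 2 / sqrt(30).
print(t_of_x.mean())
print(t_of_x.std())
```

The empirical mean of the 5000 realizations lands near the true parameter, and their spread matches the theoretical standard error, which is what "T(X) has a probability distribution" means in practice.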
Most statistical tests are based on a set of assumptions. If the assumptions are violated, the analysis will be misleading or completely erroneous: the results admit no valid interpretation, so the very purpose of performing the inferential statistical test is defeated. The conditions we assumed before running the test do not hold, so we cannot draw conclusions and cannot reach our objective.
Recall that α is the probability of rejecting the null hypothesis when it is in fact true; equivalently, the probability α that a random value of the statistic falls in the critical region is called the level of significance. If the assumptions are not met, the stated α no longer reflects the actual Type I error rate of the experiment: the test may reject a true null hypothesis more (or less) often than the nominal α. Since we use α to find the tabulated (critical) value of the statistic and compare it with the calculated value, the resulting comparison, and hence the conclusion, is unreliable. In short, when the assumptions fail, the test no longer has its intended meaning.
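This gap between nominal and actual α can be estimated by simulation. The sketch below (my own illustration, not from the text: a one-sample t-test applied to a skewed exponential population with a small sample size, so the test's normality assumption is violated) runs the test many times with the null hypothesis true and counts how often it rejects at nominal α = 0.05:

```python
import numpy as np
from scipy import stats

# Illustrative sketch: estimate the *actual* Type I error rate of a
# one-sample t-test when its normality assumption is violated.
# Population, sample size and α are assumed values for illustration.
rng = np.random.default_rng(1)
alpha, n, reps = 0.05, 10, 5000

rejections = 0
for _ in range(reps):
    # The null is TRUE: exponential population with mean exactly 1.0,
    # but heavily skewed, so normality fails badly at n = 10.
    x = rng.exponential(scale=1.0, size=n)
    _, p = stats.ttest_1samp(x, popmean=1.0)
    if p < alpha:
        rejections += 1

actual_alpha = rejections / reps
# The estimated rate typically differs noticeably from the nominal
# 0.05 -- exactly the danger described above.
print(f"nominal α = {alpha}, estimated actual α = {actual_alpha:.3f}")
```

Whenever the estimated rejection rate under a true null differs from the α we chose, the critical values from the tables no longer control the error rate we think they do.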
If the assumptions are not met, we turn to a non-parametric test or a distribution-free test. The two terms are not synonymous, however. Roughly speaking, a non-parametric test is one which makes no hypothesis about the value of a parameter in a statistical density function, whereas a distribution-free test is one which makes no assumptions about the precise form of the sampled population.
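As a concrete sketch of this fallback strategy (the data, sample sizes, and thresholds below are my own illustrative assumptions), one can first test the normality assumption with the Shapiro-Wilk test and, if it fails, replace the two-sample t-test with the distribution-free Mann-Whitney U test:

```python
import numpy as np
from scipy import stats

# Illustrative sketch: check an assumption first, then choose the test.
# Two skewed (lognormal) groups are simulated so the normality
# assumption of the t-test is expected to fail.
rng = np.random.default_rng(2)
group_a = rng.lognormal(mean=0.0, sigma=1.0, size=40)
group_b = rng.lognormal(mean=0.5, sigma=1.0, size=40)

alpha = 0.05
# Shapiro-Wilk tests the normality assumption required by the t-test.
normal_a = stats.shapiro(group_a).pvalue > alpha
normal_b = stats.shapiro(group_b).pvalue > alpha

if normal_a and normal_b:
    # Assumption plausible: the parametric test is appropriate.
    stat, p = stats.ttest_ind(group_a, group_b)
    used = "two-sample t-test"
else:
    # Assumption violated: use the distribution-free alternative.
    stat, p = stats.mannwhitneyu(group_a, group_b)
    used = "Mann-Whitney U test"

print(used, p)
```

Here the skewed data should cause the normality check to fail, routing the comparison through the Mann-Whitney U test, which compares the groups without assuming any particular population form.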