In: Statistics and Probability
Explain various methods of non-parametric tests (NPT), used when the population distribution is unknown, with their distinct roles and applications. Also state the formula for each of the three NPTs.
Non-parametric statistics:
To a statistician, a parameter is a measurable characteristic of a population. The population characteristics that usually interest statisticians are location and shape. Non-parametric statistics are used when the parameters of the population are not measurable or do not meet certain standards. When the data only order the observations, so that the interval between observations is unknown, neither a mean nor a variance can be meaningfully computed; in such cases, you need to use non-parametric tests. Because your sample does not have cardinal (interval) data, you cannot use it to estimate the mean or variance of the population, though you can make other inferences.

Even if your data are cardinal, the population must be normal before the shape of many sampling distributions is known. Fortunately, even if the population is not normal, such sampling distributions are usually close to the known shape if large samples are used, and in that case the usual techniques are acceptable. However, if the samples are small and the population is not normal, you have to use non-parametric statistics. As you know, "there is no such thing as a free lunch." If you want to make an inference about a population without having cardinal data, without knowing that the population is normal, or with very small samples, you have to give up something: in general, non-parametric statistics are less precise than parametric statistics. Because you know less about the population you are trying to learn about, the inferences you make are less exact.
The same non-parametric statistics are used when either (1) the population is not normal and the samples are small, or (2) the data are not cardinal. Most of these tests involve ranking the members of the sample, and most involve comparing the rankings of two or more samples. Because we cannot compute meaningful sample statistics to compare to a hypothesized standard, we end up comparing two samples.
1)
Mann-Whitney U-test:
In "The t-Test", you learned how to test whether two samples came from populations with the same mean by using the t-test. If your samples are small and you are not sure the original populations are normal, or if your data do not measure intervals, you cannot use that t-test, because the sample t-scores will not follow the sampling distribution in the t-table. Though two different data problems keep you from using the t-test, the solution to both is the same: the non-parametric Mann-Whitney U-test. The basic idea behind the test is to put the samples together, rank the members of the combined sample, and then see whether the two samples are well mixed in the common ranking.
The modules on hypothesis testing presented techniques for testing the equality of means in two independent samples. An underlying assumption for appropriate use of those tests was that the continuous outcome was approximately normally distributed or that the samples were sufficiently large (usually n1 > 30 and n2 > 30) to justify their use based on the Central Limit Theorem. When the outcome is not normally distributed and the samples are small, a non-parametric test is appropriate for comparing two independent samples.
A popular non-parametric test to compare outcomes between two independent groups is the Mann-Whitney U-test. The Mann-Whitney U-test, sometimes called the Mann-Whitney-Wilcoxon test or the Wilcoxon rank-sum test, is used to test whether two samples are likely to derive from the same population (i.e., that the two populations have the same shape). Some investigators interpret this test as comparing the medians of the two populations. Recall that the parametric test compares the means (H0: μ1 = μ2) of independent groups.
In contrast, the null and two-sided research hypotheses for the nonparametric test are stated as follows:
H0: The two populations are equal versus
H1: The two populations are not equal.
This test is often performed as a two-sided test, so the research hypothesis indicates that the populations are not equal rather than specifying a direction. A one-sided research hypothesis is used if interest lies in detecting a positive or negative shift in one population relative to the other. The procedure involves pooling the observations from the two samples into one combined sample, keeping track of which sample each observation came from, and then ranking the pooled observations from lowest to highest, 1 to n1+n2. The test statistic for group 1 is
U1=n1n2+[n1(n1+1)]/2−T1
where
T1 = the sum of the ranks of group 1
n1 = the number of members of the sample from group 1
n2 = the number of members of the sample from group 2
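As an illustrative sketch (not part of the original text), the ranking-and-U1 computation can be written in plain Python; the sample values below are invented for the example:

```python
def pooled_ranks(values):
    """Rank a pooled list from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                      # extend over a run of tied values
        avg = (i + j + 2) / 2           # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(group1, group2):
    """U1 = n1*n2 + n1*(n1+1)/2 - T1, with T1 the rank sum of group 1."""
    n1, n2 = len(group1), len(group2)
    ranks = pooled_ranks(list(group1) + list(group2))
    t1 = sum(ranks[:n1])                # T1: sum of the ranks of group 1
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - t1
    u2 = n1 * n2 - u1                   # the two statistics satisfy U1 + U2 = n1*n2
    return min(u1, u2)                  # the smaller U is compared to the table value
```

For the invented samples [7, 5, 6, 4, 12] and [3, 6, 4, 2, 1], mann_whitney_u returns U = 3.0, which would then be compared to the critical U value for n1 = n2 = 5.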
2)
One-sample Wilcoxon Signed-Rank test:
The one-sample Wilcoxon signed-rank test is the non-parametric counterpart of the one-sample t-test: it tests whether the population median m equals a hypothesized value m0 without assuming normality.
Hypotheses of the one-sample Wilcoxon signed-rank test:
For the two-tailed test: H0: m = m0 versus H1: m ≠ m0.
For the left-tailed test: H0: m = m0 versus H1: m < m0.
For the right-tailed test: H0: m = m0 versus H1: m > m0.
Assumptions of the one-sample Wilcoxon test:
The data are at least ordinal, the observations are independent and randomly sampled, and the distribution of the differences from the hypothesized median is symmetric.
Procedure to execute the one-sample Wilcoxon non-parametric hypothesis test:
Compute the difference of each observation from the hypothesized median m0 and discard zero differences. Rank the absolute differences from smallest to largest, assigning tied values the average of their ranks. Sum the ranks of the positive differences (W+) and of the negative differences (W−); the test statistic is
W = min(W+, W−)
which is compared to the critical value from the Wilcoxon signed-rank table. For larger samples, W is approximately normal and
z = [W − n(n+1)/4] / √[n(n+1)(2n+1)/24 − Σ(t³ − t)/48]
where t is the number of tied observations in each group of tied values (the sum runs over all tie groups) and n is the number of non-zero differences.
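As a hedged sketch of the one-sample Wilcoxon signed-rank procedure (rank the absolute non-zero differences from the hypothesized median, then sum the positive and negative ranks separately), in plain Python with invented data:

```python
def wilcoxon_signed_rank(sample, m0):
    """One-sample Wilcoxon signed-rank statistic W = min(W+, W-)."""
    diffs = [x - m0 for x in sample if x != m0]   # zero differences are discarded
    abs_d = [abs(d) for d in diffs]
    order = sorted(range(len(abs_d)), key=lambda i: abs_d[i])
    ranks = [0.0] * len(abs_d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs_d[order[j + 1]] == abs_d[order[i]]:
            j += 1                                # run of tied absolute differences
        avg = (i + j + 2) / 2                     # average rank for the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)                   # compare to the signed-rank table
```

With the invented sample [8, 12, 9, 11, 14, 10] and hypothesized median 10, the statistic is W = 5.0 (W+ = 10, W− = 5).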
3)
Kruskal-Wallis test:
The Kruskal-Wallis test is a non-parametric (distribution-free) test, used when the assumptions of one-way ANOVA are not met. Both the Kruskal-Wallis test and one-way ANOVA assess significant differences on a continuous dependent variable across a categorical independent variable (with two or more groups). The ANOVA assumes that the dependent variable is normally distributed and that variances are approximately equal across groups. The Kruskal-Wallis test requires neither assumption, so it can be used for both continuous and ordinal-level dependent variables. However, like most non-parametric tests, the Kruskal-Wallis test is not as powerful as the ANOVA.
Null hypothesis (H0): the samples (groups) come from identical populations.
Alternative hypothesis (H1): at least one of the samples (groups) comes from a population different from the others.
Example questions answered:
How do test scores differ between the different grade levels in elementary school?
Do job satisfaction scores differ by race?
The distribution of the Kruskal-Wallis test statistic approximates a chi-square distribution, with k-1 degrees of freedom, if the number of observations in each group is 5 or more. If the calculated value of the Kruskal-Wallis test is less than the critical chi-square value, then the null hypothesis cannot be rejected. If the calculated value of Kruskal-Wallis test is greater than the critical chi-square value, then we can reject the null hypothesis and say that at least one of the samples comes from a different population.
The test statistic is
H = [12 / (N(N+1))] Σ (Ti² / ni) − 3(N+1)
where:
k = the number of groups
N = the total number of observations across all groups
ni = the number of observations in group i
Ti = the sum of the ranks of group i (ranks assigned after pooling all observations)
Assumptions:
The dependent variable is at least ordinal, the groups are independent random samples, and the group distributions have similar shapes.
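As a hedged sketch, the Kruskal-Wallis statistic H = 12/(N(N+1)) · Σ(Ti²/ni) − 3(N+1) can be computed on pooled ranks in plain Python (the group scores below are invented):

```python
def kruskal_wallis_h(groups):
    """H = 12/(N(N+1)) * sum(Ti^2 / ni) - 3(N+1), computed on pooled ranks."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    order = sorted(range(n), key=lambda i: pooled[i])
    ranks = [0.0] * n
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and pooled[order[j + 1]] == pooled[order[i]]:
            j += 1                              # run of tied scores
        avg = (i + j + 2) / 2                   # average rank for the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    total, start = 0.0, 0
    for g in groups:
        ti = sum(ranks[start:start + len(g)])   # Ti: rank sum of this group
        total += ti * ti / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)
```

For three invented groups [1, 2, 3], [4, 5, 6], [7, 8, 9], the statistic is H ≈ 7.2, which would be compared to the chi-square critical value with k − 1 = 2 degrees of freedom.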