Goodness of Fit
Microhabitat factors associated with forage and bed sites of barking deer in Hainan Island, China, were examined. The 477 examined sites where the deer forage were categorized by habitat as follows:
Habitat | Woods | Cultivated grassplots | Deciduous forests | Other |
Deer forage sites | 15 | 16 | 50 | 386 |
In this region, woods make up 4.8% of the land, cultivated grassplots make up 14.7%, and deciduous forests make up 39.6%.
Do these data provide convincing evidence that barking deer prefer to forage in certain habitats over others? Answer this research question by conducting a goodness of fit test at 0.01 significance level. Follow the steps outlined in the assignment instructions.
(a) Write the null and alternative hypotheses for this test.
(b) Create a vector of observed values.
(c) Create a vector of assumed probabilities and use it to obtain and display a vector of expected values.
(d) Check that the assumptions required for this test are satisfied.
(e) Obtain the value of the test statistic using the observed and expected vectors from (b) and (c).
(f) Find the P-value of the statistic using the appropriate chi-squared distribution. Plot the distribution, a vertical line at the value of the test statistic, and show the P-value in the plot.
(g) Verify that your statistic and P-value are correct by using the chisq.test instruction (if less than 10^-10, consider both as zero).
(h) Do these data provide evidence that barking deer prefer to forage in certain habitats over others? Explain.
Solution:-
a)
State the hypotheses. The first step is to state the null hypothesis and an alternative hypothesis.
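Null hypothesis: Barking deer do not prefer to forage in certain habitats over others; forage sites are distributed across habitats in proportion to land area (4.8% woods, 14.7% cultivated grassplots, 39.6% deciduous forests, 40.9% other).
Alternative hypothesis: Barking deer prefer to forage in certain habitats over others; the distribution of forage sites differs from these proportions.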
Formulate an analysis plan. For this analysis, the significance level is 0.01, as specified in the question. Using sample data, we will conduct a chi-square goodness of fit test of the null hypothesis.
Analyze sample data. Applying the chi-square goodness of fit test to sample data, we compute the degrees of freedom, the expected frequency counts, and the chi-square test statistic. Based on the chi-square statistic and the degrees of freedom, we determine the P-value.
DF = k - 1 = 4 - 1 = 3
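b) c)
Ei = n * pi, where n is the total number of observed sites and pi is the assumed probability for habitat i. The assumed probabilities are 0.048, 0.147, and 0.396 for the three named habitats and 1 - (0.048 + 0.147 + 0.396) = 0.409 for "Other".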
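Parts (b) and (c) can be sketched as follows. This is a minimal illustration in Python rather than R, and the variable names are assumptions made for the example. Note that the tabulated counts sum to 467 rather than the 477 quoted in the question; the sketch uses the counts exactly as tabulated.

```python
# Observed forage-site counts, in the order given in the table.
habitats = ["Woods", "Cultivated grassplots", "Deciduous forests", "Other"]
observed = [15, 16, 50, 386]

# Assumed probabilities under the null hypothesis: the land-area shares.
# "Other" is whatever remains after the three named habitats.
named = [0.048, 0.147, 0.396]
probs = named + [1 - sum(named)]

# Total sample size taken from the tabulated counts (an assumption of this sketch).
n = sum(observed)

# Expected count for each habitat: Ei = n * pi.
expected = [n * p for p in probs]

for h, o, e in zip(habitats, observed, expected):
    print(f"{h:<22} observed={o:>4}  expected={e:8.3f}")
```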
d)
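Each expected count must be at least 5. The smallest expected count (woods, about 22) is well above 5, so this condition is satisfied. The test also assumes that the sampled sites are independent of one another.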
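The expected-count condition can be checked mechanically. The short Python sketch below is only an illustration; it assumes the sample size is the sum of the tabulated counts, and the variable names are hypothetical.

```python
observed = [15, 16, 50, 386]
probs = [0.048, 0.147, 0.396, 0.409]  # land-area shares; "Other" is the remainder

n = sum(observed)                     # assumption: n is the sum of the tabulated counts
expected = [n * p for p in probs]

# The chi-square approximation is usually considered reliable
# when every expected count is at least 5.
min_expected = min(expected)
condition_met = all(e >= 5 for e in expected)

print(f"smallest expected count = {min_expected:.3f}, condition met: {condition_met}")
```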
e)
X2 = Σ (Oi - Ei)^2 / Ei = 340.37
where DF is the degrees of freedom, k is the number of levels of the categorical variable, n is the number of observations in the sample, Ei is the expected frequency count for level i, Oi is the observed frequency count for level i, and X2 is the chi-square test statistic.
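As a rough check of the arithmetic, the statistic can be recomputed from the observed and expected vectors. This Python sketch is an illustration only, again assuming n is the sum of the tabulated counts; it should give roughly 340.4, in line with the value reported above up to rounding.

```python
observed = [15, 16, 50, 386]
probs = [0.048, 0.147, 0.396, 0.409]  # land-area shares; "Other" is the remainder

n = sum(observed)                     # assumption: n is the sum of the tabulated counts
expected = [n * p for p in probs]

# Contribution of each habitat: (Oi - Ei)^2 / Ei.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

chi_sq = sum(terms)                   # chi-square test statistic
df = len(observed) - 1                # degrees of freedom, k - 1

print(f"chi-square = {chi_sq:.2f} on {df} degrees of freedom")
```

The "Other" category contributes the largest term, since far more forage sites fall there than its land share would predict.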
The P-value is the probability that a chi-square statistic with 3 degrees of freedom is at least as large as 340.37.
Using the Chi-Square Distribution Calculator, P(X2 > 340.37) < 0.001.
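The tail probability can also be evaluated without a calculator. For 3 degrees of freedom the chi-square upper tail has a closed form, P(X > x) = erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2), which the Python sketch below uses. The function name chi2_sf_df3 is a hypothetical helper introduced for this example, not part of any library.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

chi_sq = 340.37   # test statistic reported in the solution
alpha = 0.01      # significance level from the question

p_value = chi2_sf_df3(chi_sq)
reject_null = p_value < alpha

print(f"p-value = {p_value:.3e}, reject H0 at alpha = {alpha}: {reject_null}")
```

The result is far below 10^-10, so it can be treated as zero for the purposes of the comparison in part (g).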
Interpret results. Since the P-value (almost 0) is less than the significance level (0.01), we reject the null hypothesis.
h) Yes. At the 0.01 significance level, we have sufficient evidence to support the claim that barking deer prefer to forage in certain habitats over others.