Question 9: Referring to the data from Question 4, comparing tree heights in two different forest areas:
a) Assume that this data was collected after a claim was made that the mean tree heights in these two forest areas are equal. Test this claim at LOC = 99%, using the critical value method.
b) Explain how the 99% confidence interval for the difference in mean tree heights from these two forest areas (as calculated in Question 4(a)) confirms the result from Part (a) of this question above.
c) Use the p-value method to determine if your decision from Part (a) above would change for any of α = 0.10, 0.05, 0.005, 0.001.
d) Assuming that the samples were truly random, do you think that the average tree heights in these two forest areas are truly different, or are the differences observed probably just attributable to random sampling error? Explain in the context of your answers above (Note: there is no single right answer to this question – but your answer needs to be consistent with the arguments supporting it).
Question 4: The table below shows random sample data of tree heights (in m), taken from two separate forest areas. Tree heights in these forests are assumed to follow an approximately normal distribution.
a) Calculate confidence intervals for the difference between the mean tree heights in these two forest areas, for: i. LOC = 95% ii. LOC = 99%
b) Comment on what the results from Part (a) suggest about any claims that might be made suggesting that the average tree heights in these two forests are about the same. Explain your answer in reference to the confidence intervals which you calculated.
Forest Area 1: 16.3 19.4 17.4 17.8 17.1 17.8 18.2 19.8 13.4 19.0 18.4 17.0 18.4 19.8 13.2 16.4 18.7 18.1 17.6 12.4
Forest Area 2: 19.1 17.8 10.5 17.0 21.8 12.4 15.2 10.9 14.0 14.5 13.9 15.3 14.8 17.0 16.4 13.8 16.5 15.5 15.5 18.4
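(a) The sample means, computed from the data in Question 4, are $\bar{x}_1 = 17.31$ and $\bar{x}_2 = 15.515$.
Also, the sample standard deviations are $s_1 \approx 2.105$ and $s_2 \approx 2.693$,
and the sample sizes are $n_1 = 20$ and $n_2 = 20$.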
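The summary statistics can be checked directly from the Question 4 data. The following is a minimal sketch using only the Python standard library; the names `area1`, `area2`, and `summarize` are illustrative and not part of the original question.

```python
from statistics import mean, stdev

# Tree heights (m) as listed in Question 4.
area1 = [16.3, 19.4, 17.4, 17.8, 17.1, 17.8, 18.2, 19.8, 13.4, 19.0,
         18.4, 17.0, 18.4, 19.8, 13.2, 16.4, 18.7, 18.1, 17.6, 12.4]
area2 = [19.1, 17.8, 10.5, 17.0, 21.8, 12.4, 15.2, 10.9, 14.0, 14.5,
         13.9, 15.3, 14.8, 17.0, 16.4, 13.8, 16.5, 15.5, 15.5, 18.4]


def summarize(sample):
    """Return (n, sample mean, sample standard deviation with n - 1 divisor)."""
    return len(sample), mean(sample), stdev(sample)


n1, xbar1, s1 = summarize(area1)
n2, xbar2, s2 = summarize(area2)
print(f"Area 1: n = {n1}, mean = {xbar1:.3f}, s = {s1:.3f}")
print(f"Area 2: n = {n2}, mean = {xbar2:.3f}, s = {s2:.3f}")
```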
(1) Null and Alternative Hypotheses
The following null and alternative hypotheses need to be tested: $H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 \neq \mu_2$.
This corresponds to a two-tailed test, for which a t-test for two population means, with two independent samples, with unknown population standard deviations will be used.
(2) Rejection Region
The critical value for this two-tailed test is $t_c = 2.712$, for $\alpha = 0.01$ and df $= n_1 + n_2 - 2 = 38$.
The rejection region for this two-tailed test is $R = \{t : |t| > 2.712\}$.
(3) Test Statistics
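Since it is assumed that the population variances are equal, the t-statistic uses the pooled variance and is computed as follows:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{17.31 - 15.515}{\sqrt{\dfrac{19(2.105)^2 + 19(2.693)^2}{38}\left(\dfrac{1}{20} + \dfrac{1}{20}\right)}} \approx \frac{1.795}{0.764} \approx 2.349$$
(4) Decision about the null hypothesis
Since it is observed that $|t| = 2.349 < t_c = 2.712$, the test statistic does not fall in the rejection region, and it is concluded that the null hypothesis is not rejected.
i.e. at the 99% level there is not enough evidence to conclude that the mean tree heights in the two forest areas differ.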
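The critical value method above can be reproduced numerically. This is a sketch under the equal-variance assumption stated in step (3); the helper name `pooled_t` is hypothetical, and the critical value 2.712 is taken from a t-table for α = 0.01 (two-tailed) and df = 38 rather than computed.

```python
from math import sqrt
from statistics import mean, variance

# Tree heights (m) as listed in Question 4.
area1 = [16.3, 19.4, 17.4, 17.8, 17.1, 17.8, 18.2, 19.8, 13.4, 19.0,
         18.4, 17.0, 18.4, 19.8, 13.2, 16.4, 18.7, 18.1, 17.6, 12.4]
area2 = [19.1, 17.8, 10.5, 17.0, 21.8, 12.4, 15.2, 10.9, 14.0, 14.5,
         13.9, 15.3, 14.8, 17.0, 16.4, 13.8, 16.5, 15.5, 15.5, 18.4]


def pooled_t(sample1, sample2):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / se, df


T_CRIT = 2.712  # table value: two-tailed, alpha = 0.01, df = 38

t_stat, df = pooled_t(area1, area2)
decision = "reject H0" if abs(t_stat) > T_CRIT else "fail to reject H0"
print(f"t = {t_stat:.3f}, df = {df}, critical value = {T_CRIT}: {decision}")
```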
(b) Confidence Interval
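The 99% confidence interval for $\mu_1 - \mu_2$ is $(\bar{x}_1 - \bar{x}_2) \pm t_c \cdot SE = 1.795 \pm 2.712(0.764)$, i.e. approximately $(-0.28, 3.87)$.
Since this confidence interval contains 0, the value of $\mu_1 - \mu_2$ under the null hypothesis, we fail to reject the null hypothesis at the 99% level, which agrees with the critical value test in Part (a).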
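The interval can be recomputed from the raw data to see whether it covers 0. The sketch below assumes equal population variances (pooled standard error) and uses the table critical value 2.712 for df = 38 at the 99% level; the function name `diff_ci` is illustrative only.

```python
from math import sqrt
from statistics import mean, variance

# Tree heights (m) as listed in Question 4.
area1 = [16.3, 19.4, 17.4, 17.8, 17.1, 17.8, 18.2, 19.8, 13.4, 19.0,
         18.4, 17.0, 18.4, 19.8, 13.2, 16.4, 18.7, 18.1, 17.6, 12.4]
area2 = [19.1, 17.8, 10.5, 17.0, 21.8, 12.4, 15.2, 10.9, 14.0, 14.5,
         13.9, 15.3, 14.8, 17.0, 16.4, 13.8, 16.5, 15.5, 15.5, 18.4]


def diff_ci(sample1, sample2, t_crit):
    """Pooled-variance confidence interval for mu1 - mu2 at the given critical value."""
    n1, n2 = len(sample1), len(sample2)
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    margin = t_crit * sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = mean(sample1) - mean(sample2)
    return diff - margin, diff + margin


lo99, hi99 = diff_ci(area1, area2, t_crit=2.712)  # 99% level, df = 38
contains_zero = lo99 < 0 < hi99
print(f"99% CI for mu1 - mu2: ({lo99:.3f}, {hi99:.3f}); contains 0: {contains_zero}")
```

An interval that contains 0 corresponds to failing to reject the null hypothesis of equal means at the matching significance level, which is why the interval and the two-tailed test always agree.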
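(c) Using the p-value approach: the two-tailed p-value for $t \approx 2.349$ with df = 38 is $p \approx 0.024$. Since $p \approx 0.024 < 0.05 < 0.10$, the null hypothesis is rejected at $\alpha = 0.10$ and $\alpha = 0.05$, so the decision from Part (a) changes at these two levels.
But since $p \approx 0.024 > 0.005 > 0.001$, the null hypothesis is not rejected at $\alpha = 0.005$ or $\alpha = 0.001$, the same decision as at $\alpha = 0.01$ in Part (a).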
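The p-value comparison can be sketched without a statistics package by integrating the Student t density numerically. Everything below is an illustration: `t_pdf` and `two_tailed_p` are hypothetical helpers, Simpson's rule stands in for a library CDF, and the pooled (equal-variance) statistic is assumed as in step (3).

```python
from math import gamma, pi, sqrt
from statistics import mean, variance

# Tree heights (m) as listed in Question 4.
area1 = [16.3, 19.4, 17.4, 17.8, 17.1, 17.8, 18.2, 19.8, 13.4, 19.0,
         18.4, 17.0, 18.4, 19.8, 13.2, 16.4, 18.7, 18.1, 17.6, 12.4]
area2 = [19.1, 17.8, 10.5, 17.0, 21.8, 12.4, 15.2, 10.9, 14.0, 14.5,
         13.9, 15.3, 14.8, 17.0, 16.4, 13.8, 16.5, 15.5, 15.5, 18.4]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|]; steps must be even."""
    b = abs(t_stat)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


# Pooled-variance t statistic (equal variances assumed).
n1, n2 = len(area1), len(area2)
df = n1 + n2 - 2
sp2 = ((n1 - 1) * variance(area1) + (n2 - 1) * variance(area2)) / df
t_stat = (mean(area1) - mean(area2)) / sqrt(sp2 * (1 / n1 + 1 / n2))

p = two_tailed_p(t_stat, df)
# True means "reject H0 at this significance level".
decisions = {alpha: p < alpha for alpha in (0.10, 0.05, 0.005, 0.001)}
print(f"t = {t_stat:.3f}, p = {p:.4f}")
for alpha, reject in decisions.items():
    print(f"alpha = {alpha}: {'reject' if reject else 'fail to reject'} H0")
```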
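(d) Assuming that the samples were truly random, the evidence that the mean tree heights differ is moderate rather than conclusive. With $p \approx 0.024$, a difference at least as large as the observed 1.795 m would arise from random sampling error alone only about 2.4% of the time if the population means were equal, so the null hypothesis is rejected at $\alpha = 0.05$ but not at $\alpha = 0.01$. It is therefore more plausible that the average heights truly differ than that the gap is pure sampling error, although the claim of equal means cannot be ruled out at the 99% level.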