Teeth and military service. In 1898, the United States and Spain fought a war over the U.S. intervention in the Cuban War of Independence. At that time, the U.S. military was concerned about the nutrition of its recruits. Many did not have a sufficient number of teeth to chew the food provided to soldiers. As a result, it was likely that they would be undernourished and unable to fulfill their duties as soldiers. The requirements at that time specified that a recruit must have "at least four sound double teeth, one above and one below on each side of the mouth, and so opposed" so that they could chew food. Of the 58,952 recruits who were under the age of 20, 68 were rejected for this reason. For the 43,786 recruits who were 40 or over, 3801 were rejected.
a) Find the proportion of rejects for each age group.
b) Find a 99% confidence interval for the difference in the proportions.
c) Use a significance test to compare the proportions. Write a short paragraph describing your results and conclusions.
d) Are the guidelines for the use of the large-sample approach satisfied for your work in parts (b) and (c)? Explain your answers.
a) p̂1 (under 20) = 68/58952 = 0.0012
p̂2 (40 or over) = 3801/43786 = 0.0868
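The two proportions can be checked with a few lines of Python. This is only a sketch of the arithmetic in part (a); the variable names are illustrative and not part of the original problem.

```python
# Counts and sample sizes taken from the problem statement.
rejected_under_20, n_under_20 = 68, 58952
rejected_40_plus, n_40_plus = 3801, 43786

# Sample proportion of rejected recruits in each age group.
p_hat_under_20 = rejected_under_20 / n_under_20
p_hat_40_plus = rejected_40_plus / n_40_plus

print(f"under 20:   {p_hat_under_20:.4f}")
print(f"40 or over: {p_hat_40_plus:.4f}")
```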
b) The pooled sample proportion P = (p̂1 * n1 + p̂2 * n2)/(n1 + n2)
= (0.0012 * 58952 + 0.0868 * 43786)/(58952 + 43786)
= 0.0377
SE = sqrt(P(1 - P)(1/n1 + 1/n2))
= sqrt(0.0377 * (1 - 0.0377) * (1/58952 + 1/43786))
= 0.0012
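The pooled SE above assumes the two population proportions are equal, which is the null hypothesis of the test in part (c). For the confidence interval, use the unpooled standard error instead:
SE_D = sqrt(p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2)
= sqrt(0.00115 * 0.99885/58952 + 0.08681 * 0.91319/43786)
= 0.00135
For a 99% confidence level, the critical value is z0.005 = 2.576.
The 99% confidence interval for the difference in population proportions is
(p̂1 - p̂2) +/- z0.005 * SE_D
= (0.00115 - 0.08681) +/- 2.576 * 0.00135
= -0.08566 +/- 0.00348
= (-0.0891, -0.0822)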
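As a cross-check on the interval, here is a minimal Python sketch of a large-sample confidence interval for a difference in proportions. It uses the unpooled standard error sqrt(p1(1 - p1)/n1 + p2(1 - p2)/n2), the usual choice for a confidence interval, so its result can differ slightly from an interval built on the pooled SE. The function name diff_proportion_ci is hypothetical; NormalDist comes from the Python standard library (3.8 or later).

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, confidence=0.99):
    """Large-sample CI for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value, e.g. about 2.576 for 99% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


# Counts from the problem: under 20 first, 40 or over second.
low, high = diff_proportion_ci(68, 58952, 3801, 43786)
print(f"99% CI for p1 - p2: ({low:.4f}, {high:.4f})")
```

The whole interval lies below zero, which says the rejection rate for recruits under 20 is lower than the rate for recruits 40 or over.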
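c) Hypotheses: H0: p1 = p2 versus Ha: p1 ≠ p2.
The test statistic z = (p̂1 - p̂2)/SE, where SE = 0.0012 is the pooled standard error computed in part (b)
= (0.0012 - 0.0868)/0.0012
= -71.33
P-value = 2 * P(Z < -71.33)
≈ 0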
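At the 5% significance level, the P-value is less than the significance level (0 < 0.05), so we reject H0: p1 = p2.
We conclude that there is a significant difference between the population proportions: recruits aged 40 or over were rejected for insufficient teeth far more often (about 8.7%) than recruits under 20 (about 0.1%).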
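The test can be reproduced with a short Python sketch of the two-proportion z test with a pooled standard error. The function name two_proportion_z_test is hypothetical, and the two-sided P-value is computed with math.erfc, which equals 2 * P(Z > |z|) for a standard normal Z. Because the sketch works with unrounded proportions, its z value can differ slightly in the second decimal from a hand calculation with rounded inputs.

```python
from math import erfc, sqrt


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z test of H0: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # erfc(|z| / sqrt(2)) is the two-sided normal tail probability.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Counts from the problem: under 20 first, 40 or over second.
z, p_value = two_proportion_z_test(68, 58952, 3801, 43786)
print(f"z = {z:.1f}, P-value = {p_value:.3g}")
```

For a z value this extreme the tail probability underflows in floating point, so the printed P-value is 0 even though the true value is merely astronomically small.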
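d) Yes, the guidelines for the large-sample approach are satisfied. Both samples are very large, and each group contains many rejected and many accepted recruits (68 and 58,884 for those under 20; 3801 and 39,985 for those 40 or over), well above the minimum counts these guidelines usually require. The trials are also independent within each sample, so it is appropriate to use the normal approximation.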
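The count check in part (d) can be sketched in Python as follows. The threshold of 10 rejected and 10 accepted recruits per group is an assumption made for illustration; textbooks state this guideline with different cutoffs (for example 5, 10, or 15), and the helper name large_sample_ok is hypothetical.

```python
# Counts taken from the problem statement: (rejected, sample size).
groups = {
    "under 20": (68, 58952),
    "40 or over": (3801, 43786),
}

MIN_COUNT = 10  # assumed threshold; check the cutoff your textbook uses


def large_sample_ok(rejected, n, min_count=MIN_COUNT):
    """True when both rejected and accepted counts reach the threshold."""
    accepted = n - rejected
    return rejected >= min_count and accepted >= min_count


for name, (rejected, n) in groups.items():
    print(name, rejected, n - rejected, large_sample_ok(rejected, n))
```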