The following table considers survey data about annual household income and whether or not a person vapes.

| | <$35,000 | $35,000-$99,999 | $100,000+ | Total |
| --- | --- | --- | --- | --- |
| Vape | 47 | 57 | 19 | 123 |
| Do Not Vape | 381 | 659 | 362 | 1402 |
| Total | 428 | 716 | 381 | 1525 |

Problems

Problem 1. Based on recent data, about 28% of Americans earn less than $35,000 annually, about 42% of Americans earn between $35,000 and $99,999 annually, and about 30% of Americans earn more than $100,000. Does it appear that the sample is representative of the population? In other words, does it appear that the total number of people in each income category matches the appropriate proportion? Conduct a chi-square goodness of fit test at the 5% significance level by completing the following steps:

a. (2 points) State the null and alternative hypotheses in words.
b. (2 points) Compute the expected frequencies for each of the three categories. Be sure to show your work.
c. (2 points) Compute the test statistic using the observed frequencies from the table and the expected frequencies you computed in part (b).
d. (2 points) State the degrees of freedom and find the critical value.
e. (2 points) Answer the question: does it appear that the sample is representative of the population? Justify using either the critical value method or p-value method.

Problem 2. We now wish to decide if “use of e-cigarettes” and “income category” are dependent. To assist with this process, the table from before has been augmented with most of the expected frequencies (listed in parentheses):

| | <$35,000 | $35,000-$99,999 | $100,000+ | Total |
| --- | --- | --- | --- | --- |
| Vape | 47 (34) | 57 (58) | 19 (31) | 123 |
| Do Not Vape | 381 (394) | 659 (???) | 362 (???) | 1402 |
| Total | 428 | 716 | 381 | 1525 |

a. (2 points) Find the missing two expected frequencies, labeled as (???). For credit, be sure to show your work.
b. (2 points) From a brief analysis of the frequencies, do you believe that there is convincing evidence of a dependence relationship between income category and whether or not a person vapes? Explain.
c. (2 points) Based on the given information, the test statistic computes to be 10.47. Based on this, what is the conclusion of the hypothesis test at the 0.05 significance level? Be sure to justify your answer by using either the p-value or critical value method.
Answer to Problem 1:
a.
Null hypothesis: H0: The proportion of the sample in each income category matches the given population proportions (28%, 42%, 30%).
Alternative hypothesis: Ha: The proportion in at least one income category does not match the given population proportion.
b.
Total number of people in the survey = 1525
Expected frequency for the category earning less than $35,000 annually = 28% of 1525 = 0.28 × 1525 = 427
Expected frequency for the category earning between $35,000 and $99,999 annually = 42% of 1525 = 0.42 × 1525 = 640.5
Expected frequency for the category earning more than $100,000 annually = 30% of 1525 = 0.30 × 1525 = 457.5
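The arithmetic in part (b) can be double-checked with a short Python sketch. The dictionary keys and variable names below are illustrative choices, not part of the original problem; the counts and percentages are taken from the table and the problem statement.

```python
# Observed column totals from the survey table.
observed = {"<$35,000": 428, "$35,000-$99,999": 716, "$100,000+": 381}
# Population proportions stated in Problem 1.
proportions = {"<$35,000": 0.28, "$35,000-$99,999": 0.42, "$100,000+": 0.30}

n = sum(observed.values())  # total sample size

# Expected frequency = population proportion x sample size.
# Rounding to 4 places only hides floating-point noise.
expected = {k: round(p * n, 4) for k, p in proportions.items()}

for category, value in expected.items():
    print(f"{category}: expected = {value}")
```

Note that the expected frequencies need not be whole numbers, and they sum to the sample size of 1525.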
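c.
Test statistic = sum over the three categories of (Observed - Expected)^2 / Expected
= (428 - 427)^2/427 + (716 - 640.5)^2/640.5 + (381 - 457.5)^2/457.5
= 0.0023 + 8.8997 + 12.7918
= 21.6938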
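The test statistic for part (c) can be reproduced with a minimal Python sketch, assuming the observed totals from the table and the expected frequencies from part (b). The variable names are illustrative.

```python
# Observed totals and expected frequencies, in the same category order.
observed = [428, 716, 381]
expected = [427.0, 640.5, 457.5]

# Each category contributes (O - E)^2 / E to the chi-square statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

print([round(c, 4) for c in contributions])
print(round(chi_square, 4))  # 21.6938
```

Almost all of the statistic comes from the two higher income categories; the lowest category is nearly exactly as expected.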
d. Degrees of freedom = Number of categories - 1 = 3 - 1 = 2
Critical value of chi-square at the 5% (α = 0.05) significance level with 2 degrees of freedom = 5.991
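The critical value is normally read from a chi-square table, but for 2 degrees of freedom it can also be checked by hand. In that special case the chi-square upper-tail probability is exp(-x/2), so the critical value is -2 ln(α). The sketch below relies on that shortcut, which does not apply to other degrees of freedom.

```python
import math

alpha = 0.05          # significance level
categories = 3
df = categories - 1   # degrees of freedom for a goodness-of-fit test

# Shortcut valid ONLY when df == 2: P(X > x) = exp(-x / 2),
# so solving exp(-x / 2) = alpha gives x = -2 * ln(alpha).
assert df == 2
critical_value = -2.0 * math.log(alpha)

print(round(critical_value, 3))  # 5.991
```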
e.
Since the test statistic (21.6938) is greater than the critical value (5.991), we reject the null hypothesis.
There is sufficient evidence at the 5% significance level to conclude that the sample is not representative of the population with respect to income category.
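The same decision can be reached with the p-value method. The sketch below assumes the test statistic of 21.6938 and again uses the fact that, for 2 degrees of freedom only, the chi-square upper-tail probability is exp(-x/2). The variable names are illustrative.

```python
import math

chi_square = 21.6938  # test statistic from part (c)
alpha = 0.05          # significance level

# Upper-tail probability for a chi-square distribution with df = 2.
# This closed form does not hold for other degrees of freedom.
p_value = math.exp(-chi_square / 2)

reject_null = p_value < alpha
print(f"p-value = {p_value:.2e}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

The p-value is far below 0.05, so both methods agree: reject the null hypothesis.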