Do male and female skiers differ in their tendency to use a ski
helmet? Ruzic and Tudor (2011) report a study in which 710 skiers
completed a survey about aspects of their skiing habits. Suppose
the results from the question on the survey about ski helmet usage
were as follows:
Gender | Never | Occasionally | Always |
Male | 243 | 71 | 180 |
Female | 105 | 33 | 78 |
Part a)
Which of the following null hypotheses could sensibly be tested by
the data presented above?
A. There is no relationship between Gender and
Helmet usage.
B. The mean number of male skiers who never wear a
ski helmet is the same as the mean number of female skiers who
never use one.
C. The proportions of male and female skiers are
equal.
D. Male and female skiers are as likely to never
use a ski helmet as always use one.
Part b)
Under the null hypothesis, what is the expected number of men in
the survey who never wear a ski helmet?
Give your answer to 2 decimal places.
Part c)
Perform a suitable test on the data above to test the null
hypothesis.
Provide the value of your test statistic to 2 decimal places.
Part d)
Under the null hypothesis, the test statistic should be an
observation from which probability distribution?
A. The F_{3,707} distribution.
B. The t distribution on two degrees of
freedom.
C. The Chi-squared distribution on five degrees of
freedom.
D. The Chi-squared distribution on four degrees of
freedom.
E. The standard Normal distribution.
F. The Chi-squared distribution on two degrees of
freedom.
Part e)
Would you reject or not reject your null hypothesis at the 5 %
significance level?
A. Not reject
B. Reject
Part a)
Which of the following null hypotheses could sensibly be tested by
the data presented above?
A. There is no relationship between Gender and Helmet usage.
Part b)
Under the null hypothesis, what is the expected number of men in
the survey who never wear a ski helmet?
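Expected number of men who never wear a helmet = 494 * 348 / 710 = 242.13
Expected Values | Never | Occasionally | Always | Total |
Male | 242.13 | 72.36 | 179.51 | 494 |
Female | 105.87 | 31.64 | 78.49 | 216 |
Total | 348 | 104 | 258 | 710 |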
Part c)
Perform a suitable test on the data above to test the null
hypothesis.
Chi-square test for independence. The test statistic is χ² = 0.099 (0.10 to 2 decimal places).
Part d)
Under the null hypothesis, the test statistic should be an
observation from which probability distribution?
F. The Chi-squared distribution on two degrees of freedom.
Part e)
Would you reject or not reject your null hypothesis at the 5 %
significance level?
A. Not reject
The following cross table has been provided. The row and column totals have been calculated and are shown below:
Observed Values | Never | Occasionally | Always | Total |
Male | 243 | 71 | 180 | 494 |
Female | 105 | 33 | 78 | 216 |
Total | 348 | 104 | 258 | 710 |
The expected values are computed from the row and column totals. The formula is E_ij = (R_i * C_j) / T, where R_i is the total of row i, C_j is the total of column j, and T is the grand total. The table below shows the calculations used to obtain the expected values:
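Expected Values | Never | Occasionally | Always | Total |
Male | 494*348/710 = 242.13 | 494*104/710 = 72.36 | 494*258/710 = 179.51 | 494 |
Female | 216*348/710 = 105.87 | 216*104/710 = 31.64 | 216*258/710 = 78.49 | 216 |
Total | 348 | 104 | 258 | 710 |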
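The expected counts follow directly from the row and column totals, so they can be checked with a few lines of Python. This is only a minimal sketch: the names `observed` and `expected_counts` are illustrative and not part of the original solution.

```python
# Observed counts from the survey table
# (rows: Male, Female; columns: Never, Occasionally, Always).
observed = [
    [243, 71, 180],
    [105, 33, 78],
]


def expected_counts(table):
    """Return E_ij = R_i * C_j / T for every cell of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for label, row in zip(["Male", "Female"], expected):
    print(label, [round(e, 2) for e in row])
```

The first printed entry is the answer to Part b), the expected number of men who never wear a helmet.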
Based on the observed and expected values, the squared distances can be computed according to the following formula: (E - O)^2/E. The table with squared distances is shown below:
Squared Distances | Never | Occasionally | Always |
Male | 0.003 | 0.026 | 0.001 |
Female | 0.007 | 0.059 | 0.003 |
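As a cross-check, the per-cell terms (E - O)^2/E and their sum can be recomputed from the observed counts. The sketch below uses unrounded expected values, which is why a hand calculation from rounded table entries may differ in the third decimal place; the variable names are illustrative.

```python
# Observed counts (rows: Male, Female; columns: Never, Occasionally, Always).
observed = [
    [243, 71, 180],
    [105, 33, 78],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

contributions = []
for row, r in zip(observed, row_totals):
    terms = []
    for o, c in zip(row, col_totals):
        e = r * c / grand_total          # expected count for this cell
        terms.append((e - o) ** 2 / e)   # squared distance for this cell
    contributions.append(terms)

# The Chi-Squared statistic is the sum over all six cells.
chi_squared = sum(sum(terms) for terms in contributions)

for label, terms in zip(["Male", "Female"], contributions):
    print(label, [round(t, 3) for t in terms])
print("chi-squared:", round(chi_squared, 3))
```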
Null and Alternative Hypotheses
The following null and alternative hypotheses need to be tested:
H_0 : The two variables are independent
H_a: The two variables are dependent
This corresponds to a Chi-Square test of independence.
Rejection Region
Based on the information provided, the significance level is α = 0.05 and the number of degrees of freedom is df = (2 - 1)*(3 - 1) = 2, so the rejection region for this Chi-Square test is R = {χ² : χ² > 5.991}.
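The critical value can be reproduced without statistical tables in this particular case. With exactly two degrees of freedom, the Chi-Squared upper tail probability is exp(-x/2), so the critical value is -2 ln(α). The sketch below relies on that special case and does not generalise to other degrees of freedom; the variable names are illustrative.

```python
import math

n_rows, n_cols = 2, 3                 # Gender (2 levels) by helmet usage (3 levels)
df = (n_rows - 1) * (n_cols - 1)      # degrees of freedom
alpha = 0.05                          # significance level

# Special case: for df == 2, P(X > x) = exp(-x / 2),
# so solving exp(-x / 2) = alpha gives x = -2 * ln(alpha).
assert df == 2, "closed form below is only valid for 2 degrees of freedom"
critical_value = -2.0 * math.log(alpha)

print(df, round(critical_value, 3))
```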
Test Statistics
The Chi-Squared statistic is computed as follows:
χ² = Σ (E_ij - O_ij)^2 / E_ij = 0.003 + 0.026 + 0.001 + 0.007 + 0.059 + 0.003 = 0.099
Decision about the null hypothesis
Since the observed statistic χ² = 0.099 is less than the critical value χ²_c = 5.991, the null hypothesis is not rejected.
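An equivalent way to reach the same decision is through the p-value. The sketch below again uses the two-degrees-of-freedom special case, where the upper tail probability is exp(-x/2). It takes the reported statistic 0.099 as input, and the p-value approach is an addition for illustration rather than part of the original solution.

```python
import math

chi_squared = 0.099   # test statistic reported in the solution
alpha = 0.05          # significance level

# Special case for 2 degrees of freedom: P(X > x) = exp(-x / 2).
p_value = math.exp(-chi_squared / 2.0)

# Reject only if the p-value falls below the significance level.
reject_null = p_value < alpha

print(round(p_value, 3), "reject" if reject_null else "do not reject")
```

The p-value is far above 0.05, which agrees with the critical-value comparison.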
Conclusion
It is concluded that the null hypothesis H_0 is not rejected. Therefore, there is not enough evidence to claim that gender and helmet usage are dependent, at the 0.05 significance level.