Use PROC FORMAT and PROC FREQ with the following data:
(a) Define an appropriate format for the gender variable.
(b) Produce a 2 X 2 table with gender as the rows and lenses as the columns.
(c) Calculate the relative risk and provide a one sentence written interpretation explicitly stating which groups are being compared and defining the outcome.
(d) Perform a chi-squared test of association between gender and needing contact lenses. What are the results of the test (i.e. do you reject the hypothesis?)
Obs | id | carrot | gender | latitude | lenses |
1 | 1 | 0 | 1 | 33 | 1 |
2 | 2 | 0 | 2 | 46 | 1 |
3 | 3 | 1 | 1 | 32 | 1 |
4 | 4 | 0 | 2 | 26 | 0 |
5 | 5 | 1 | 1 | 25 | 1 |
6 | 6 | 1 | 2 | 48 | 0 |
7 | 7 | 0 | 1 | 39 | 1 |
8 | 8 | 0 | 2 | 24 | 0 |
9 | 9 | 0 | 1 | 35 | 0 |
10 | 10 | 0 | 1 | 42 | 1 |
11 | 11 | 1 | 1 | 35 | 0 |
12 | 12 | 0 | 2 | 44 | 0 |
13 | 13 | 1 | 1 | 35 | 1 |
14 | 14 | 1 | 1 | 25 | 0 |
15 | 15 | 1 | 1 | 24 | 0 |
16 | 16 | 1 | 1 | 38 | 0 |
17 | 17 | 1 | 1 | 28 | 1 |
18 | 18 | 0 | 1 | 43 | 0 |
19 | 19 | 1 | 1 | 44 | 0 |
20 | 20 | 0 | 1 | 46 | 1 |
21 | 21 | 0 | 1 | 37 | 1 |
22 | 22 | 0 | 2 | 33 | 0 |
23 | 23 | 0 | 2 | 42 | 1 |
24 | 24 | 1 | 2 | 31 | 1 |
25 | 25 | 0 | 2 | 46 | 1 |
26 | 26 | 0 | 1 | 32 | 1 |
27 | 27 | 0 | 2 | 30 | 0 |
28 | 28 | 0 | 2 | 27 | 1 |
29 | 29 | 1 | 1 | 45 | 0 |
30 | 30 | 1 | 1 | 39 | 0 |
31 | 31 | 0 | 2 | 47 | 1 |
32 | 32 | 1 | 1 | 39 | 0 |
33 | 33 | 1 | 1 | 48 | 1 |
34 | 34 | 0 | 1 | 47 | 0 |
35 | 35 | 0 | 1 | 32 | 0 |
36 | 36 | 0 | 1 | 31 | 0 |
37 | 37 | 1 | 2 | 26 | 1 |
38 | 38 | 0 | 2 | 28 | 1 |
39 | 39 | 0 | 1 | 25 | 1 |
40 | 40 | 1 | 2 | 25 | 0 |
41 | 41 | 1 | 1 | 31 | 0 |
42 | 42 | 1 | 2 | 47 | 1 |
43 | 43 | 1 | 1 | 32 | 1 |
44 | 44 | 1 | 2 | 24 | 1 |
45 | 45 | 1 | 2 | 37 | 0 |
46 | 46 | 1 | 2 | 26 | 0 |
47 | 47 | 0 | 2 | 41 | 1 |
48 | 48 | 0 | 2 | 43 | 1 |
49 | 49 | 0 | 1 | 45 | 1 |
50 | 50 | 0 | 1 | 27 | 1 |
51 | 51 | 1 | 1 | 31 | 0 |
52 | 52 | 0 | 2 | 40 | 0 |
53 | 53 | 0 | 2 | 37 | 0 |
54 | 54 | 1 | 2 | 48 | 0 |
55 | 55 | 0 | 2 | 26 | 0 |
56 | 56 | 0 | 2 | 33 | 1 |
57 | 57 | 0 | 1 | 48 | 1 |
58 | 58 | 1 | 2 | 24 | 1 |
59 | 59 | 0 | 1 | 32 | 1 |
60 | 60 | 1 | 1 | 40 | 1 |
61 | 61 | 0 | 2 | 45 | 0 |
62 | 62 | 1 | 1 | 40 | 0 |
63 | 63 | 0 | 1 | 36 | 1 |
64 | 64 | 0 | 2 | 42 | 0 |
65 | 65 | 1 | 2 | 44 | 0 |
66 | 66 | 0 | 1 | 44 | 1 |
67 | 67 | 1 | 2 | 47 | 0 |
68 | 68 | 1 | 2 | 27 | 1 |
69 | 69 | 1 | 1 | 33 | 1 |
70 | 70 | 0 | 1 | 29 | 1 |
71 | 71 | 0 | 1 | 42 | 0 |
72 | 72 | 1 | 1 | 40 | 0 |
73 | 73 | 0 | 2 | 44 | 1 |
74 | 74 | 1 | 2 | 41 | 0 |
75 | 75 | 1 | 2 | 26 | 1 |
76 | 76 | 1 | 2 | 27 | 0 |
77 | 77 | 0 | 2 | 29 | 1 |
78 | 78 | 0 | 1 | 33 | 1 |
79 | 79 | 1 | 2 | 31 | 1 |
80 | 80 | 1 | 2 | 33 | 0 |
81 | 81 | 1 | 1 | 43 | 1 |
82 | 82 | 1 | 2 | 33 | 1 |
83 | 83 | 0 | 2 | 43 | 1 |
84 | 84 | 0 | 1 | 39 | 1 |
85 | 85 | 1 | 2 | 47 | 0 |
86 | 86 | 1 | 1 | 46 | 1 |
87 | 87 | 1 | 2 | 27 | 0 |
88 | 88 | 1 | 2 | 38 | 0 |
89 | 89 | 1 | 1 | 34 | 0 |
90 | 90 | 1 | 1 | 40 | 0 |
91 | 91 | 1 | 1 | 27 | 1 |
92 | 92 | 0 | 1 | 29 | 1 |
93 | 93 | 1 | 1 | 43 | 1 |
94 | 94 | 0 | 1 | 40 | 0 |
95 | 95 | 1 | 1 | 31 | 0 |
96 | 96 | 1 | 2 | 38 | 0 |
97 | 97 | 0 | 2 | 30 | 1 |
98 | 98 | 1 | 2 | 26 | 0 |
99 | 99 | 0 | 1 | 43 | 1 |
100 | 100 | 0 | 2 | 33 | 1 |
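(a) Gender: Gender is a nominal variable with two categories, male and female.
The value 1 represents male and the value 2 represents female, so an appropriate format (a VALUE statement in PROC FORMAT) labels 1 as "Male" and 2 as "Female".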
(b) Gender has two categories, and lenses also has two categories.
Gender: 1 - male
2 - female
Lenses: 1 - needs lenses
0 - does not need lenses
The 2x2 contingency table for gender and lenses is:
Gender | Needs lenses | Does not need lenses | Total |
Male | 30 | 22 | 52 |
Female | 23 | 25 | 48 |
Total | 53 | 47 | 100 |
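As a cross-check on the cell counts, the table can be tabulated directly from the raw data. The sketch below is in Python rather than SAS and is only an illustration: the names GENDER, LENSES and crosstab are hypothetical, and the two strings are the gender and lenses columns transcribed from the data listing in id order.

```python
from collections import Counter

# gender (1 = male, 2 = female), transcribed in id order, ten ids per chunk
GENDER = (
    "1212121211" "1211111111" "1222212211" "2111112212" "1212222211"
    "1222221211" "2112212211" "1122222122" "1221212211" "1111122212"
)
# lenses (1 = needs lenses, 0 = does not), same order as GENDER
LENSES = (
    "1110101001" "0010001001" "1011110100" "1010001110" "0111001111"
    "0000011111" "0010010111" "0010101110" "1111010000" "1110001011"
)


def crosstab(gender, lenses):
    """Count the observations in each (gender, lenses) cell."""
    return Counter(zip(gender, lenses))


counts = crosstab(GENDER, LENSES)
labels = {"1": "Male", "2": "Female"}
for g in "12":
    # columns: needs lenses, does not need lenses
    print(labels[g], counts[(g, "1")], counts[(g, "0")])
```

The printed rows should agree with the male and female rows of the contingency table.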
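(c) Relative risk
The relative risk compares the risk of the outcome (needing contact lenses) in males with the risk in females:
$$RR = \frac{A/(A+B)}{C/(C+D)}$$
where A and B are the numbers of males who do and do not need lenses, and C and D are the corresponding numbers of females. From the contingency table, A = 30, B = 22, C = 23 and D = 25.
Plugging the values into the formula:
$$RR = \frac{30/52}{23/48} = \frac{0.5769}{0.4792} \approx 1.20$$
Interpretation: Males have about 1.2 times the risk of needing contact lenses compared with females (57.7% of males versus 47.9% of females need lenses).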
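The relative risk arithmetic can be checked with a few lines of Python. This is a minimal sketch, not SAS output; the function name relative_risk is hypothetical and the four counts are the cells A, B, C and D from the contingency table.

```python
def relative_risk(a, b, c, d):
    """Risk in group 1 (a events, b non-events) divided by risk in group 2 (c, d)."""
    risk_group1 = a / (a + b)   # males: 30 / 52
    risk_group2 = c / (c + d)   # females: 23 / 48
    return risk_group1 / risk_group2


rr = relative_risk(30, 22, 23, 25)
print(f"{rr:.3f}")  # 1.204
```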
(d) Chi-squared test of association.
The null and alternative hypotheses are:
H0: There is no association between gender and needing contact lenses.
H1: There is an association between gender and needing contact lenses.
Chi-squared test statistic:
The formula for the chi-squared test statistic is:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where O denotes the observed frequencies, which are given in the contingency table,
and E denotes the expected frequencies.
The expected frequencies have to be calculated first.
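Each expected frequency is (row total × column total) / grand total. For the first cell (observed value 30), the first row total is 52, the first column total is 53, and the grand total is 100.
So the expected value for the first cell is:
$$E_{11} = \frac{52 \times 53}{100} = 27.56$$
The expected value for the first row and second column is:
$$E_{12} = \frac{52 \times 47}{100} = 24.44$$
The expected value for the second row and the first column is:
$$E_{21} = \frac{48 \times 53}{100} = 25.44$$
The expected value for the second row and second column is:
$$E_{22} = \frac{48 \times 47}{100} = 22.56$$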
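The four expected frequencies all follow the same rule, row total times column total divided by the grand total. A short Python sketch of that rule is given here as an illustration only; the function name expected_counts is hypothetical.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# rows: male, female; columns: needs lenses, does not need lenses
observed = [[30, 22], [23, 25]]
expected = expected_counts(observed)
for row in expected:
    print([round(e, 2) for e in row])
```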
The table for calculating the test statistic is:
O | E | (O-E)^2 | (O-E)^2/E |
30 | 27.56 | 5.9536 | 0.216023 |
22 | 24.44 | 5.9536 | 0.243601 |
23 | 25.44 | 5.9536 | 0.234025 |
25 | 22.56 | 5.9536 | 0.263901 |
The test statistic is the sum of the last column:
$$\chi^2 = 0.216023 + 0.243601 + 0.234025 + 0.263901 = 0.95755 \approx 0.96$$
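The whole calculation, from observed counts to the test statistic, can be reproduced in a few lines of Python. This sketch computes the ordinary Pearson chi-squared statistic without a continuity correction, matching the hand calculation; the function name pearson_chi_square is hypothetical.

```python
def pearson_chi_square(table):
    """Sum of (O - E)^2 / E over all cells, with E = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# rows: male, female; columns: needs lenses, does not need lenses
chi_square = pearson_chi_square([[30, 22], [23, 25]])
print(f"{chi_square:.2f}")  # 0.96
```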
Chi-square critical value:
To find the critical value we need the level of significance (alpha) and the degrees of freedom.
Alpha = 0.05 and
Degrees of freedom = (number of rows - 1)*(number of columns - 1) = (2 - 1)*(2 - 1) = 1
From the chi-square table, the critical value for an upper-tail area of 0.05 with 1 degree of freedom is 3.841.
Decision rule:
If the test statistic is less than the critical value, we fail to reject the null hypothesis; otherwise we reject it.
Here the test statistic (0.96) is less than the critical value (3.841), so we fail to reject the null hypothesis.
Conclusion: At the 0.05 level of significance, there is not enough evidence of an association between gender and needing contact lenses.
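The same decision can be reached through a p-value instead of a critical value. The sketch below is an illustration in Python, not SAS output: it relies on the fact that for 1 degree of freedom the upper-tail chi-square probability equals erfc(sqrt(x / 2)), and the function name chi_square_p_value_df1 is hypothetical. The statistic 0.95755 is the sum from the calculation table.

```python
import math


def chi_square_p_value_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


alpha = 0.05
p_value = chi_square_p_value_df1(0.95755)
reject_null = p_value < alpha
print(f"p-value = {p_value:.3f}, reject H0: {reject_null}")
```

The p-value is roughly 0.33, well above 0.05, which agrees with the critical-value comparison: the null hypothesis is not rejected.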