A dataset with filename ‘tv.txt’. The data arise from a study examining the time teenagers spend watching TV. A random sample of n = 100 eighth grade American high school students was obtained, and the number of minutes spent watching TV during the first week of October was recorded. A similar sample of m = 90 Canadian students was also obtained. In this study it is of interest to compare the TV watching habits of the teenagers from the two countries, specifically to determine whether Canadian students watch less TV than their American counterparts.
The arguments to this function are: x1, a vector containing the sample measurements from the first population; x2, a vector containing the sample measurements from the second population; and H1, a string variable which takes one of three possible values, “two.sided”, “less” or “greater”, specifying the alternative hypothesis. Note that x1 and x2 may have missing values, which should be removed in the function before computing the test statistic. To complete this question you will need to use the R function pnorm(), which computes standard normal probabilities.
(c) Apply your function to the TV data, computing the p-values for each of the three possible alternative hypotheses.
(d) Which of the three alternative hypotheses is relevant for the particular question being asked in this study? Comment on the results.
tv.txt
1 74 65
2 95 68
3 93 60
4 39 67
5 75 62
6 92 64
7 34 73
8 73 64
9 78 70
10 59 58
11 68 66
12 56 60
13 49 74
14 71 70
15 58 60
16 91 68
17 34 64
18 80 61
19 59 68
20 36 62
21 52 56
22 59 61
23 80 74
24 37 65
25 85 71
26 63 78
27 69 67
28 61 81
29 85 68
30 52 73
31 54 66
32 53 76
33 67 72
34 64 57
35 80 65
36 65 77
37 50 65
38 64 64
39 64 59
40 79 65
41 29 62
42 54 73
43 62 70
44 78 74
45 66 60
46 81 67
47 67 69
48 63 73
49 59 65
50 60 65
51 57 77
52 81 61
53 84 66
54 67 68
55 54 64
56 81 67
57 76 71
58 63 69
59 53 61
60 77 64
61 46 74
62 48 69
63 46 62
64 61 72
65 79 72
66 66 63
67 59 67
68 68 68
69 63 63
70 67 75
71 74 78
72 66 57
73 70 65
74 77 71
75 57 65
76 52 84
77 53 69
78 65 63
79 72 66
80 43 67
81 83 77
82 64 60
83 83 69
84 75 62
85 48 66
86 78 82
87 54 69
88 67 67
89 61 59
90 52 64
91 NA 69
92 NA 75
93 NA 55
94 NA 63
95 NA 68
96 NA 72
97 NA 71
98 NA 66
99 NA 70
100 NA 66
This is a relatively simple question involving the comparison of two means.
The American sample has size n = 100 and the Canadian sample has size m = 90.
Observation | Canadian | American |
1 | 74 | 65 |
2 | 95 | 68 |
3 | 93 | 60 |
4 | 39 | 67 |
5 | 75 | 62 |
6 | 92 | 64 |
7 | 34 | 73 |
8 | 73 | 64 |
9 | 78 | 70 |
10 | 59 | 58 |
11 | 68 | 66 |
12 | 56 | 60 |
13 | 49 | 74 |
14 | 71 | 70 |
15 | 58 | 60 |
16 | 91 | 68 |
17 | 34 | 64 |
18 | 80 | 61 |
19 | 59 | 68 |
20 | 36 | 62 |
21 | 52 | 56 |
22 | 59 | 61 |
23 | 80 | 74 |
24 | 37 | 65 |
25 | 85 | 71 |
26 | 63 | 78 |
27 | 69 | 67 |
28 | 61 | 81 |
29 | 85 | 68 |
30 | 52 | 73 |
31 | 54 | 66 |
32 | 53 | 76 |
33 | 67 | 72 |
34 | 64 | 57 |
35 | 80 | 65 |
36 | 65 | 77 |
37 | 50 | 65 |
38 | 64 | 64 |
39 | 64 | 59 |
40 | 79 | 65 |
41 | 29 | 62 |
42 | 54 | 73 |
43 | 62 | 70 |
44 | 78 | 74 |
45 | 66 | 60 |
46 | 81 | 67 |
47 | 67 | 69 |
48 | 63 | 73 |
49 | 59 | 65 |
50 | 60 | 65 |
51 | 57 | 77 |
52 | 81 | 61 |
53 | 84 | 66 |
54 | 67 | 68 |
55 | 54 | 64 |
56 | 81 | 67 |
57 | 76 | 71 |
58 | 63 | 69 |
59 | 53 | 61 |
60 | 77 | 64 |
61 | 46 | 74 |
62 | 48 | 69 |
63 | 46 | 62 |
64 | 61 | 72 |
65 | 79 | 72 |
66 | 66 | 63 |
67 | 59 | 67 |
68 | 68 | 68 |
69 | 63 | 63 |
70 | 67 | 75 |
71 | 74 | 78 |
72 | 66 | 57 |
73 | 70 | 65 |
74 | 77 | 71 |
75 | 57 | 65 |
76 | 52 | 84 |
77 | 53 | 69 |
78 | 65 | 63 |
79 | 72 | 66 |
80 | 43 | 67 |
81 | 83 | 77 |
82 | 64 | 60 |
83 | 83 | 69 |
84 | 75 | 62 |
85 | 48 | 66 |
86 | 78 | 82 |
87 | 54 | 69 |
88 | 67 | 67 |
89 | 61 | 59 |
90 | 52 | 64 |
91 | NA | 69 |
92 | NA | 75 |
93 | NA | 55 |
94 | NA | 63 |
95 | NA | 68 |
96 | NA | 72 |
97 | NA | 71 |
98 | NA | 66 |
99 | NA | 70 |
100 | NA | 66 |
Mean | 65 | 67 |
Std Dev | 14.27518 | 5.847386 |
From the data provided, the following summary statistics can be computed for the two samples (missing values excluded).
Statistic | Canadian | American |
Sample mean | 64.51111 | 67.3 |
Sample std dev | 14.27518 | 5.847386 |
n | 90 | 100 |
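A minimal sketch of how such a summary could be produced in R while ignoring the missing values. It assumes the file is laid out like the listing (no header line; observation number, Canadian value, American value). The column names and the helper `describe` are hypothetical, and only six rows are copied in so that the block runs on its own.

```r
# Six rows copied from the listing (observation number, Canadian, American).
# For the full data set, read.table("tv.txt", col.names = ...) would replace
# the text= argument, assuming the file has no header line.
tv <- read.table(text = "
1 74 65
2 95 68
3 93 60
90 52 64
91 NA 69
92 NA 75
", col.names = c("obs", "canadian", "american"))

# Mean, standard deviation and count for one column, ignoring missing values.
describe <- function(x) {
  c(mean = mean(x, na.rm = TRUE),
    sd   = sd(x, na.rm = TRUE),
    n    = sum(!is.na(x)))
}

# One column of statistics per sample.
summary_tbl <- sapply(tv[c("canadian", "american")], describe)
print(round(summary_tbl, 3))
```

Counting with `sum(!is.na(x))` rather than `length(x)` matters here, because the Canadian column is padded with NA to the length of the American one.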
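Now we need to check whether the two population means differ.
To do so we test the following hypotheses:
$$H_0: \mu_1 = \mu_2 \quad \text{versus} \quad H_1: \mu_1 < \mu_2$$
where $\mu_1$ and $\mu_2$ are the mean TV-watching times (in minutes) of Canadian and American students, respectively. The one-sided alternative reflects the study question of whether Canadian students watch less TV.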
Now we find the test statistic, here from a pooled-variance two-sample t-test.

Canadian sample:
$N_1 = 90$, $df_1 = N_1 - 1 = 89$, $M_1 = 64.51$, $SS_1 = 18136.49$
$s_1^2 = SS_1/(N_1 - 1) = 18136.49/89 = 203.78$

American sample:
$N_2 = 100$, $df_2 = N_2 - 1 = 99$, $M_2 = 67.3$, $SS_2 = 3385$
$s_2^2 = SS_2/(N_2 - 1) = 3385/99 = 34.19$

Pooled variance:
$s_p^2 = \frac{df_1}{df_1 + df_2}\, s_1^2 + \frac{df_2}{df_1 + df_2}\, s_2^2 = \frac{89}{188}(203.78) + \frac{99}{188}(34.19) = 114.48$
$s_{M_1}^2 = s_p^2/N_1 = 114.48/90 = 1.27$
$s_{M_2}^2 = s_p^2/N_2 = 114.48/100 = 1.14$

t-value:
$t = \frac{M_1 - M_2}{\sqrt{s_{M_1}^2 + s_{M_2}^2}} = \frac{-2.79}{\sqrt{2.42}} = -1.79$
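The pooled calculation can be checked in R from the summary statistics alone. This is only a sketch: the variable names are illustrative, and the inputs are the rounded means and standard deviations reported in the summary table, so the last digits may differ slightly from a calculation on the raw data.

```r
# Summary statistics as reported in the summary table.
n1 <- 90;  m1 <- 64.51111; s1 <- 14.27518   # Canadian sample
n2 <- 100; m2 <- 67.3;     s2 <- 5.847386   # American sample

df  <- n1 + n2 - 2                                  # 188 degrees of freedom
sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df     # pooled variance
se  <- sqrt(sp2 * (1 / n1 + 1 / n2))                # std. error of M1 - M2
t_stat <- (m1 - m2) / se

p_less <- pt(t_stat, df)             # one-sided: Canadian mean < American mean
p_two  <- 2 * pt(-abs(t_stat), df)   # two-sided, for comparison

cat(sprintf("t = %.4f, one-sided p = %.4f, two-sided p = %.4f\n",
            t_stat, p_less, p_two))
```

The pooled test assumes equal population variances. With sample variances of 203.78 and 34.19 that assumption is doubtful, which is one reason to also consider an unpooled statistic.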
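The t-value is -1.79398 with 188 degrees of freedom. The one-tailed p-value is .037211, so the result is significant at p < .05.
Thus we reject the null hypothesis: the data provide evidence that Canadian students watch less TV on average than their American counterparts. This corresponds to the “less” alternative (with the Canadian sample as x1), which is the relevant one for the question asked in this study. Note that the two-sided p-value, 2 × .037211 = .074422, would not be significant at the 5% level.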
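The question itself asks for a function built on pnorm(), so a large-sample z-test may be closer to what is intended than the pooled t-test above. The following is a sketch under assumptions. The function name `two_sample_z` is hypothetical, and the statistic (difference in sample means divided by the unpooled standard error) is assumed, since the part of the question that defines it is not shown. The data vectors are transcribed from the tv.txt listing.

```r
# Two-sample z-test; x1 and x2 may contain NA, H1 names the alternative.
two_sample_z <- function(x1, x2, H1 = c("two.sided", "less", "greater")) {
  H1 <- match.arg(H1)
  x1 <- x1[!is.na(x1)]              # remove missing values
  x2 <- x2[!is.na(x2)]
  # Assumed statistic: difference in means over the unpooled standard error.
  se <- sqrt(var(x1) / length(x1) + var(x2) / length(x2))
  z  <- (mean(x1) - mean(x2)) / se
  p  <- switch(H1,
               two.sided = 2 * pnorm(-abs(z)),
               less      = pnorm(z),                       # H1: mu1 < mu2
               greater   = pnorm(z, lower.tail = FALSE))   # H1: mu1 > mu2
  list(z = z, p.value = p, alternative = H1)
}

# TV data transcribed from the listing; the Canadian column ends in 10 NAs.
canadian <- c(
  74, 95, 93, 39, 75, 92, 34, 73, 78, 59,
  68, 56, 49, 71, 58, 91, 34, 80, 59, 36,
  52, 59, 80, 37, 85, 63, 69, 61, 85, 52,
  54, 53, 67, 64, 80, 65, 50, 64, 64, 79,
  29, 54, 62, 78, 66, 81, 67, 63, 59, 60,
  57, 81, 84, 67, 54, 81, 76, 63, 53, 77,
  46, 48, 46, 61, 79, 66, 59, 68, 63, 67,
  74, 66, 70, 77, 57, 52, 53, 65, 72, 43,
  83, 64, 83, 75, 48, 78, 54, 67, 61, 52,
  rep(NA, 10))
american <- c(
  65, 68, 60, 67, 62, 64, 73, 64, 70, 58,
  66, 60, 74, 70, 60, 68, 64, 61, 68, 62,
  56, 61, 74, 65, 71, 78, 67, 81, 68, 73,
  66, 76, 72, 57, 65, 77, 65, 64, 59, 65,
  62, 73, 70, 74, 60, 67, 69, 73, 65, 65,
  77, 61, 66, 68, 64, 67, 71, 69, 61, 64,
  74, 69, 62, 72, 72, 63, 67, 68, 63, 75,
  78, 57, 65, 71, 65, 84, 69, 63, 66, 67,
  77, 60, 69, 62, 66, 82, 69, 67, 59, 64,
  69, 75, 55, 63, 68, 72, 71, 66, 70, 66)

# Part (c): p-value under each of the three alternatives.
alternatives <- c("two.sided", "less", "greater")
p_values <- sapply(alternatives,
                   function(alt) two_sample_z(canadian, american, alt)$p.value)
print(round(p_values, 4))
```

With the Canadian sample passed as x1, the z statistic comes out at about -1.73. The “less” p-value is therefore below 0.05 while the two-sided one is not, the same pattern as in the pooled t-test. Unlike the pooled test, this version does not assume equal variances in the two populations.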