Question

A dataset with filename ‘tv.txt’. The data arise from a study examining the time teenagers spend watching TV. A random sample of n = 100 eighth grade American high school students was obtained, and the number of minutes spent watching TV during the first week of October was recorded. A similar sample of m = 90 Canadian students was also obtained. In this study it is of interest to compare the TV watching habits of the teenagers from the two countries, specifically to determine if Canadian students watch less TV than their American counterparts.

The arguments to this function are: x1, a vector containing the sample measurements from the first population; x2, a vector containing the sample measurements from the second population; and H1, a string variable which takes one of three possible values, “two.sided”, “less” or “greater”, specifying the alternative hypothesis. Note that x1 and x2 may have missing values, which should be removed in the function before computing the test statistic. To complete this question you will need to use the R function pnorm(), which computes standard normal probabilities.

(c) Apply your function to the TV data, computing the p-values for each of the three possible alternative hypotheses.

(d) Which of the three alternative hypotheses is relevant for the particular question being asked in this study? Comment on the results.

tv.txt

1 74 65
2 95 68
3 93 60
4 39 67
5 75 62
6 92 64
7 34 73
8 73 64
9 78 70
10 59 58
11 68 66
12 56 60
13 49 74
14 71 70
15 58 60
16 91 68
17 34 64
18 80 61
19 59 68
20 36 62
21 52 56
22 59 61
23 80 74
24 37 65
25 85 71
26 63 78
27 69 67
28 61 81
29 85 68
30 52 73
31 54 66
32 53 76
33 67 72
34 64 57
35 80 65
36 65 77
37 50 65
38 64 64
39 64 59
40 79 65
41 29 62
42 54 73
43 62 70
44 78 74
45 66 60
46 81 67
47 67 69
48 63 73
49 59 65
50 60 65
51 57 77
52 81 61
53 84 66
54 67 68
55 54 64
56 81 67
57 76 71
58 63 69
59 53 61
60 77 64
61 46 74
62 48 69
63 46 62
64 61 72
65 79 72
66 66 63
67 59 67
68 68 68
69 63 63
70 67 75
71 74 78
72 66 57
73 70 65
74 77 71
75 57 65
76 52 84
77 53 69
78 65 63
79 72 66
80 43 67
81 83 77
82 64 60
83 83 69
84 75 62
85 48 66
86 78 82
87 54 69
88 67 67
89 61 59
90 52 64
91 NA 69
92 NA 75
93 NA 55
94 NA 63
95 NA 68
96 NA 72
97 NA 71
98 NA 66
99 NA 70
100 NA 66

Expert Solution

This is a relatively simple question involving a comparison of two means.

The American sample has 100 students and the Canadian sample has 90.

Observation	Canadian	American
1	74	65
2	95	68
3	93	60
4	39	67
5	75	62
6	92	64
7	34	73
8	73	64
9	78	70
10	59	58
11	68	66
12	56	60
13	49	74
14	71	70
15	58	60
16	91	68
17	34	64
18	80	61
19	59	68
20	36	62
21	52	56
22	59	61
23	80	74
24	37	65
25	85	71
26	63	78
27	69	67
28	61	81
29	85	68
30	52	73
31	54	66
32	53	76
33	67	72
34	64	57
35	80	65
36	65	77
37	50	65
38	64	64
39	64	59
40	79	65
41	29	62
42	54	73
43	62	70
44	78	74
45	66	60
46	81	67
47	67	69
48	63	73
49	59	65
50	60	65
51	57	77
52	81	61
53	84	66
54	67	68
55	54	64
56	81	67
57	76	71
58	63	69
59	53	61
60	77	64
61	46	74
62	48	69
63	46	62
64	61	72
65	79	72
66	66	63
67	59	67
68	68	68
69	63	63
70	67	75
71	74	78
72	66	57
73	70	65
74	77	71
75	57	65
76	52	84
77	53	69
78	65	63
79	72	66
80	43	67
81	83	77
82	64	60
83	83	69
84	75	62
85	48	66
86	78	82
87	54	69
88	67	67
89	61	59
90	52	64
91	NA	69
92	NA	75
93	NA	55
94	NA	63
95	NA	68
96	NA	72
97	NA	71
98	NA	66
99	NA	70
100	NA	66
Mean	65	67
Std Dev	14.27518	5.847386

Based on the data provided, the following summary statistics can be computed for the two samples.

	Canadian	American
Mean (x̄)	64.51111	67.3
Std Dev (s)	14.27518	5.847386
n	90	100
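A minimal sketch of how these summaries could be computed in R, assuming tv.txt has no header row and three whitespace-separated columns (row index, Canadian, American), as the listing suggests. Only a five-row excerpt of the file is embedded so the snippet is self-contained; the column names and the helper `summarise_sample` are illustrative, and with the real file one would call `read.table("tv.txt", ...)` instead.

```r
# Excerpt of tv.txt (rows 1-3 and 91-92); the real file has 100 rows.
tv_text <- "1 74 65
2 95 68
3 93 60
91 NA 69
92 NA 75"

# Column names are an assumption; the file itself carries no header.
tv <- read.table(text = tv_text,
                 col.names = c("obs", "canadian", "american"))

# Summarise one sample, dropping missing values first.
summarise_sample <- function(x) {
  x <- x[!is.na(x)]
  c(n = length(x), mean = mean(x), sd = sd(x))
}

canadian_summary <- summarise_sample(tv$canadian)
american_summary <- summarise_sample(tv$american)
print(rbind(Canadian = canadian_summary, American = american_summary))
```

Dropping the NA entries before counting matters here: the Canadian column is padded with NA in rows 91 to 100, so its sample size is 90, not 100.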

Next we test whether the two population means differ, and specifically whether the Canadian mean is lower.

To do so we set up the hypotheses

$H_0: \mu_c = \mu_a$ versus $H_1: \mu_c < \mu_a$

where $\mu_c$ is the mean for Canadian students and $\mu_a$ is the mean for American students.

The test statistic is the pooled two-sample t statistic

$t = \dfrac{M_1 - M_2}{\sqrt{s_p^2/N_1 + s_p^2/N_2}}$

Difference Scores Calculations

Canadian Sample

N₁: 90
df₁ = N₁ - 1 = 90 - 1 = 89
M₁: 64.51
SS₁: 18136.49
s²₁ = SS₁/(N₁ - 1) = 18136.49/(90-1) = 203.78

American Sample

N₂: 100
df₂ = N₂ - 1 = 100 - 1 = 99
M₂: 67.3
SS₂: 3385
s²₂ = SS₂/(N₂ - 1) = 3385/(100-1) = 34.19

T-value Calculation

s²_p = ((df₁/(df₁ + df₂)) * s²₁) + ((df₂/(df₁ + df₂)) * s²₂) = ((89/188) * 203.78) + ((99/188) * 34.19) = 114.48

s²_M₁ = s²_p/N₁ = 114.48/90 = 1.27
s²_M₂ = s²_p/N₂ = 114.48/100 = 1.14

t = (M₁ - M₂)/√(s²_M₁ + s²_M₂) = -2.79/√2.42 = -1.79

The t-value is -1.79398, with df = df₁ + df₂ = 188. The one-tailed p-value is .037211. The result is significant at p < .05.
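The pooled calculation above can be reproduced from the reported summary statistics. The following is a sketch in R using `pt()` for the t distribution; the variable names are illustrative and not part of the original solution.

```r
# Summary statistics as reported in the solution.
n1 <- 90;  m1 <- 64.51111; s1 <- 14.27518   # Canadian
n2 <- 100; m2 <- 67.3;     s2 <- 5.847386   # American

# Pooled variance: df-weighted average of the two sample variances.
df  <- n1 + n2 - 2
sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df

# Standard error of the difference in means and the t statistic.
se     <- sqrt(sp2 / n1 + sp2 / n2)
t_stat <- (m1 - m2) / se

# One-tailed p-value for H1: mu_c < mu_a, and the two-sided p-value.
p_less <- pt(t_stat, df)
p_two  <- 2 * pt(-abs(t_stat), df)

cat("t =", round(t_stat, 5), " df =", df,
    " p(less) =", round(p_less, 6),
    " p(two-sided) =", round(p_two, 6), "\n")
```

Note that the pooled test assumes equal population variances, while the two sample variances here (203.78 and 34.19) are quite different, so the result is best read with that caveat in mind.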

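Since the one-tailed p-value (.037) is below .05, we reject the null hypothesis at the 5% level: the data support the claim that Canadian students watch less TV on average than American students. Under a two-sided alternative the p-value doubles to about .074, which is not significant at the 5% level, so the conclusion depends on using the one-sided alternative that matches the study question.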
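The question itself asks for an R function with arguments x1, x2 and H1 that uses pnorm(). A minimal sketch of such a function follows. The name `two_sample_z` is hypothetical, and the unpooled large-sample z statistic is an assumption, since the part of the question that defines the test statistic is not shown.

```r
# Hypothetical two-sample z-test; the form of the statistic is an assumption.
two_sample_z <- function(x1, x2, H1 = c("two.sided", "less", "greater")) {
  H1 <- match.arg(H1)

  # Remove missing values before computing anything.
  x1 <- x1[!is.na(x1)]
  x2 <- x2[!is.na(x2)]

  # Large-sample z statistic with an unpooled standard error.
  se <- sqrt(var(x1) / length(x1) + var(x2) / length(x2))
  z  <- (mean(x1) - mean(x2)) / se

  # p-value from the standard normal distribution via pnorm().
  p <- switch(H1,
    two.sided = 2 * pnorm(-abs(z)),
    less      = pnorm(z),
    greater   = pnorm(z, lower.tail = FALSE)
  )
  list(statistic = z, p.value = p, alternative = H1)
}

# Toy vectors for illustration only (not the TV data).
a <- c(5, 7, NA, 6, 8, 4)
b <- c(9, 10, 8, NA, 11, 12, 9)
res <- sapply(c("two.sided", "less", "greater"),
              function(h) two_sample_z(a, b, h)$p.value)
print(res)
```

For part (c), x1 would be the Canadian column and x2 the American column of tv.txt. For part (d), "less" is the alternative that matches the question of whether Canadian students watch less TV.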

orchestra answered 2 years ago
