A psychologist conducted an experiment analysing the relationship between student scores in an exam and the amount of attention they paid in class. The latter was measured using a type of brain monitor. The psychologist believed that scores would increase by 1 for every two-unit increase in attention. The data are listed in the Excel spreadsheet.
Estimate a linear regression between the score (Y) and the measure of attention (X).
(a) Write out the equation for Y in the form , but with coefficients. Show the estimated standard errors in parentheses below the coefficients. What is the R² of the regression? Calculate a 99 percent confidence interval for β. [5 pts]
(b) What are the mean and the estimated standard deviation of the estimated residuals? [2 pts]
Hint: the first answer is definitional and the second answer is easily seen from the output.
(c) Test the hypothesis that there is no relationship between the variables at the 90 percent significance level. [3 pts]
(d) Test the hypothesis that the coefficient β=0.5 at the 99% significance level. [3 pts]
(e) The psychologist concluded from the experiment that test scores increase significantly if students pay attention in class. In one word, how would you describe the results of this experiment based on the data you have? [2 pts]
DATA:
Regression data for Psychology Experiment | |||
Attention | Score | ||
18 | 80 | ||
35 | 90 | ||
86 | 80 | ||
22 | 50 | ||
72 | 76 | ||
102 | 74 | ||
86 | 75 | ||
30 | 80 | ||
35 | 85 | ||
94 | 82 | ||
16 | 80 | ||
42 | 41 | ||
50 | 50 | ||
96 | 96 | ||
60 | 80 | ||
106 | 70 | ||
80 | 65 | ||
14 | 14 | ||
11 | 14 | ||
80 | 85 | ||
12 | 14 | ||
37 | 43 | ||
26 | 80 | ||
86 | 70 | ||
5 | 20 | ||
17 | 20 | ||
35 | 80 | ||
76 | 68 | ||
50 | 70 | ||
15 | 16 | ||
90 | 86 | ||
96 | 80 | ||
7 | 16 | ||
10 | 14 | ||
35 | 65 | ||
88 | 88 | ||
20 | 32 | ||
22 | 70 | ||
50 | 65 | ||
22 | 62 | ||
35 | 50 | ||
64 | 92 | ||
68 | 84 | ||
13 | 15 | ||
102 | 102 | ||
86 | 85 | ||
18 | 24 | ||
78 | 64 | ||
98 | 78 | ||
70 | 80 | ||
60 | 70 | ||
98 | 98 | ||
9 | 14 | ||
50 | 90 | ||
104 | 72 | ||
35 | 45 | ||
60 | 60 | ||
74 | 72 | ||
88 | 88 | ||
80 | 95 | ||
22 | 58 | ||
8 | 14 | ||
86 | 110 | ||
60 | 75 | ||
92 | 84 | ||
60 | 100 | ||
80 | 75 | ||
86 | 95 | ||
16 | 18 | ||
86 | 90 | ||
35 | 75 | ||
35 | 60 | ||
80 | 60 | ||
80 | 70 | ||
104 | 104 | ||
80 | 100 | ||
60 | 90 | ||
86 | 100 | ||
62 | 96 | ||
60 | 65 | ||
39 | 41 | ||
50 | 80 | ||
50 | 75 | ||
6 | 18 | ||
60 | 95 | ||
22 | 54 | ||
21 | 40 | ||
100 | 100 | ||
94 | 94 | ||
80 | 90 | ||
48 | 41 | ||
106 | 106 | ||
50 | 43 | ||
46 | 41 | ||
90 | 90 | ||
60 | 85 | ||
92 | 92 | ||
22 | 80 | ||
35 | 70 | ||
66 | 88 | ||
80 | 60 | ||
50 | 60 | ||
80 | 80 | ||
100 | 76 | ||
50 | 45 | ||
86 | 65 | ||
19 | 28 | ||
50 | 85 | ||
22 | 75 | ||
86 | 105 |
a)
Quantity | ΣX | ΣY | SSxx = Σ(x-x̄)² | SSyy = Σ(y-ȳ)² | SSxy = Σ(x-x̄)(y-ȳ)
Total sum | 6262 | 7445 | 100985.4182 | 74369.9 | 63083.45
Mean | 56.93 | 67.68 | | |
Sample size: n = 110
x̄ = Σx / n = 56.93
ȳ = Σy / n = 67.68
SSxx = Σ(x-x̄)² = 100985.4182
SSxy = Σ(x-x̄)(y-ȳ) = 63083.45
Estimated slope: β1 = SSxy / SSxx = 63083.45 / 100985.4182 = 0.6247
Intercept: β0 = ȳ - β1 · x̄ = 32.1206
So the regression line is
Ŷ = 32.1206 + 0.6247 · X
    (3.65)    (0.057)
SSE = (SSxx · SSyy - SSxy²) / SSxx = 34962.964
Standard error of the regression: Se = √(SSE / (n-2)) = 17.993
Correlation coefficient: r = SSxy / √(SSxx · SSyy) = 0.7279
R² = SSxy² / (SSxx · SSyy) = 0.5299
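The arithmetic above can be reproduced with a short Python sketch. The function name `simple_ols_from_sums` and the dictionary keys are illustrative choices, not part of the original answer, and the total sum of squares 74369.86 is taken from the ANOVA output.

```python
import math

def simple_ols_from_sums(n, sum_x, sum_y, ss_xx, ss_yy, ss_xy):
    """Simple linear regression estimates from summary sums (hypothetical helper)."""
    x_bar = sum_x / n
    y_bar = sum_y / n
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    sse = ss_yy - ss_xy ** 2 / ss_xx           # residual sum of squares
    s_e = math.sqrt(sse / (n - 2))             # standard error of the regression
    se_slope = s_e / math.sqrt(ss_xx)
    se_intercept = s_e * math.sqrt(1 / n + x_bar ** 2 / ss_xx)
    r_squared = ss_xy ** 2 / (ss_xx * ss_yy)
    return {
        "slope": slope,
        "intercept": intercept,
        "sse": sse,
        "s_e": s_e,
        "se_slope": se_slope,
        "se_intercept": se_intercept,
        "r_squared": r_squared,
    }

# Summary sums as reported in the answer.
fit = simple_ols_from_sums(110, 6262, 7445, 100985.4182, 74369.86, 63083.45)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```

The printed values should agree with the hand calculation up to rounding.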
α = 0.01
t critical value = t(α/2, df = 108) = 2.622 [Excel function: =T.INV.2T(α, df)]
Estimated standard error of slope = Se / √SSxx = 17.99253 / √100985.42 = 0.0566
Margin of error: E = t · standard error = 2.622 · 0.0566 = 0.1485
Estimated slope: β̂ = 0.6247
Lower confidence limit = estimated slope - margin of error = 0.6247 - 0.1485 = 0.4762
Upper confidence limit = estimated slope + margin of error = 0.6247 + 0.1485 = 0.7732
99% confidence interval for β: (0.4762, 0.7732)
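As a rough check of the interval arithmetic, here is a minimal Python sketch. The helper name `slope_confidence_interval` is hypothetical, and the critical value 2.622 is an assumption read from a t table for a two-sided 99 percent interval with 108 degrees of freedom (a 95 percent interval would use about 1.982 instead).

```python
def slope_confidence_interval(slope, se_slope, t_crit):
    """Two-sided confidence interval: slope ± t_crit * se_slope."""
    margin = t_crit * se_slope
    return slope - margin, slope + margin

# Assumed two-sided 99% critical value for df = 108, taken from a t table.
T_CRIT_99_DF108 = 2.622

low, high = slope_confidence_interval(0.6247, 0.0566, T_CRIT_99_DF108)
print(f"99% CI for the slope: ({low:.4f}, {high:.4f})")
```

Passing the critical value in as a parameter keeps the sketch free of external libraries; with a statistics package available, the value could be computed rather than looked up.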
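b)
ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 39406.9 | 39406.9 | 121.7272 | 2.08E-19
Residual | 108 | 34962.96 | 323.7311 | |
Total | 109 | 74369.86 | | |
The mean of the estimated residuals is 0 by definition, because least squares with an intercept forces the residuals to sum to zero. Their estimated standard deviation is the standard error of the regression, √(MS Residual) = √323.7311 = 17.993.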
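The two facts asked for in part (b) can be illustrated with a small Python sketch. The helper `ols_residuals` is a hypothetical name, and the fit uses only the first five rows of the data table purely to show that least-squares residuals average to zero; the standard deviation is then computed from the Residual row of the ANOVA table.

```python
import math

def ols_residuals(xs, ys):
    """Residuals from a least-squares line with an intercept (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# First five (Attention, Score) rows of the data table, for illustration only.
xs = [18, 35, 86, 22, 72]
ys = [80, 90, 80, 50, 76]
residuals = ols_residuals(xs, ys)
mean_residual = sum(residuals) / len(residuals)   # zero up to floating-point error

# Estimated residual standard deviation from the ANOVA Residual row: sqrt(SS / df).
residual_sd = math.sqrt(34962.96 / 108)

print(f"mean residual: {mean_residual:.2e}")
print(f"residual standard deviation: {residual_sd:.3f}")
```

The zero mean holds for any data set fitted with an intercept, which is why the hint calls the first answer definitional.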
c)
Slope hypothesis test (two-tailed)
H0: β1 = 0
H1: β1 ≠ 0
n = 110
α = 0.1
Estimated standard error of slope: Se(β1) = Se / √SSxx = 17.993 / √100985.42 = 0.0566
t statistic = estimated slope / standard error = β1 / Se(β1) = 0.6247 / 0.0566 = 11.0330
Degrees of freedom: df = n - 2 = 108
t critical value = 1.6591 [Excel function: =T.INV.2T(α, df)]
p-value < 0.0001
Decision: p-value < α, so reject H0.
Conclusion: Reject H0 and conclude that the slope is significantly different from zero.
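The test above follows a general pattern that can be sketched in Python. The function names and the `beta_null` parameter are illustrative assumptions; the inputs are the rounded slope, standard error, and critical value from the answer, so the statistic differs slightly from the 11.0330 obtained with unrounded values.

```python
def slope_t_statistic(slope, se_slope, beta_null=0.0):
    """t statistic for H0: beta = beta_null."""
    return (slope - beta_null) / se_slope

def reject_two_sided(t_stat, t_crit):
    """Reject H0 in a two-sided test when |t| exceeds the critical value."""
    return abs(t_stat) > t_crit

# Part (c): H0: beta = 0 at alpha = 0.10, critical value 1.6591 for df = 108.
t_c = slope_t_statistic(0.6247, 0.0566)
reject_c = reject_two_sided(t_c, 1.6591)
print(f"t = {t_c:.3f}, reject H0: {reject_c}")
```

Calling `slope_t_statistic` with `beta_null=0.5` gives the statistic for the null hypothesis in part (d), which would then be compared against the critical value for the chosen significance level.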
e)
The relationship is a moderate positive one. Scores do increase with attention, though not exactly at the rate expected: each one-unit increase in attention is associated with an estimated 0.6247-point increase in score, compared with the 0.5 the psychologist believed.