Problem 1
The following table lists data on the participation scores and midterm exam scores for 23 students who took the Accounting Theory class in Spring 2019.
Student Number | Participation Score | Midterm Score |
1 | 5 | 75 |
2 | 6.5 | 84 |
3 | 6.5 | 73 |
4 | 7 | 96 |
5 | 7.5 | 83 |
6 | 7.5 | 88 |
7 | 7.5 | 75 |
8 | 8 | 75 |
9 | 8.5 | 82.5 |
10 | 8.5 | 89 |
11 | 8.5 | 90 |
12 | 8.5 | 91 |
13 | 9 | 92 |
14 | 9 | 81.5 |
15 | 9 | 95 |
16 | 9 | 88 |
17 | 9 | 89.5 |
18 | 9 | 93 |
19 | 9.5 | 90 |
20 | 10 | 99 |
21 | 10 | 87.5 |
22 | 10 | 88 |
23 | 10 | 81 |
Solution
Part (1)
NOTE: Final answers are given below. Back-up Theory and Details of calculations follow at the end.
Regression model:
Mid-term score (y) = 62.6070 + 2.8292 × Participation score (x) Answer 1
Part (2)
Substituting x = 5, 7 and 8 in the above regression equation we get:
Expected midterm score for a student whose participation score is 5: 76.75 Answer 2
Expected midterm score for a student whose participation score is 7: 82.41 Answer 3
Expected midterm score for a student whose participation score is 8: 85.24 Answer 4
Part (3)
Since the participation scores of students 1, 4 and 8 are 5, 7 and 8 respectively, their expected mid-term scores are respectively:
76.75, 82.41 and 85.24 Answer 5
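The substitutions in Parts (2) and (3) can be checked with a few lines of Python. This is a minimal sketch that takes the rounded coefficients from Answer 1 as given; the function name `expected_midterm` is hypothetical.

```python
# Fitted coefficients taken from Answer 1 (rounded to four decimals).
INTERCEPT = 62.6070
SLOPE = 2.8292

def expected_midterm(participation_score):
    """Predicted midterm score from the fitted line y = b0 + b1 * x."""
    return INTERCEPT + SLOPE * participation_score

# Participation scores 5, 7 and 8 (also the scores of students 1, 4 and 8).
predictions = {x: round(expected_midterm(x), 2) for x in (5, 7, 8)}
print(predictions)
```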
Part (4)
Comparison
Student # | Expected score (E) | Actual score (A) | Difference (E – A) |
1 | 76.75 | 75 | 1.75 |
4 | 82.41 | 96 | -13.59 |
8 | 85.24 | 75 | 10.24 |
In two cases, E > A and in one case A > E.
The difference is large for students 4 and 8 (13.59 and 10.24 points in absolute value) but small for student 1 (1.75 points).
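The comparison can be reproduced as follows. This is a small sketch using the expected and actual scores quoted in this part; the dictionary layout and variable names are illustrative only.

```python
# (expected, actual) midterm scores for students 1, 4 and 8.
scores = {1: (76.75, 75), 4: (82.41, 96), 8: (85.24, 75)}

# Difference E - A: positive means the model over-predicted the student's score.
differences = {s: round(e - a, 2) for s, (e, a) in scores.items()}
over_predicted = [s for s, d in differences.items() if d > 0]
under_predicted = [s for s, d in differences.items() if d < 0]

print(differences)
print("E > A for students", over_predicted, "and A > E for students", under_predicted)
```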
Possible explanations for the difference
1. The assumed model (simple linear) may not be adequate for prediction. This is quite possible since the correlation coefficient is only 0.51, so participation score explains only about 26% (r² ≈ 0.26) of the variation in midterm scores.
2. The assumptions of the linear model, such as the error terms being iid N(0, σ²) and independent of x, may not be fulfilled.
3. The data may not have been recorded reliably.
Answer 6
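To put rough numbers on explanation 1, the correlation coefficient and two related quantities can be computed from the summary statistics given in the back-up calculations. The residual standard error is not part of the original solution; it is added here, as an assumption about what is useful, only to give a scale for how large a prediction error of 10 to 14 points is.

```python
import math

# Summary statistics taken from the back-up calculations.
n = 23
s_xx, s_yy, s_xy = 36.9783, 1127.2174, 104.6196

r = s_xy / math.sqrt(s_xx * s_yy)   # correlation coefficient
r_squared = r ** 2                  # share of variation in y explained by x

# Residual sum of squares and residual standard error (added for context).
sse = s_yy - s_xy ** 2 / s_xx
residual_se = math.sqrt(sse / (n - 2))

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}, residual SE = {residual_se:.2f}")
```

Under these figures the residual standard error is a little over 6 points, so the differences for students 4 and 8 are roughly 1.6 to 2.2 times that scale.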
Back-up Theory and Details of calculations
The linear regression model: $Y = \beta_0 + \beta_1 X + \varepsilon$ (1)
where $\varepsilon$ is the error term, which is assumed to be Normally distributed with mean 0 and variance $\sigma^2$.
Estimated regression of Y on X is given by: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$ (2)
$\hat{\beta}_1 = S_{xy} / S_{xx}$ and $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$ (3)
where
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i$ ; $\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} y_i$ ; $S_{xx} = \sum_{i=1}^{n} (x_i - \bar{X})^2$ ;
$S_{yy} = \sum_{i=1}^{n} (y_i - \bar{Y})^2$ ; $S_{xy} = \sum_{i=1}^{n} (x_i - \bar{X})(y_i - \bar{Y})$ (4)
n | 23 |
$\bar{X}$ | 8.3913 |
$\bar{Y}$ | 86.3478 |
$S_{xx}$ | 36.9783 |
$S_{yy}$ | 1127.2174 |
$S_{xy}$ | 104.6196 |
$\hat{\beta}_1$ | 2.8292 |
$\hat{\beta}_0$ | 62.6070 |
r | 0.5124 |
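The figures in this table can be recomputed directly from formulas (3) and (4). The following is a minimal sketch in plain Python; the variable names are chosen here for illustration and the data lists repeat the 23 rows of the problem table.

```python
import math

# Participation (x) and midterm (y) scores for students 1..23.
x = [5, 6.5, 6.5, 7, 7.5, 7.5, 7.5, 8, 8.5, 8.5, 8.5, 8.5,
     9, 9, 9, 9, 9, 9, 9.5, 10, 10, 10, 10]
y = [75, 84, 73, 96, 83, 88, 75, 75, 82.5, 89, 90, 91,
     92, 81.5, 95, 88, 89.5, 93, 90, 99, 87.5, 88, 81]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means, as in (4).
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates, as in (3), and the correlation coefficient.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
r = s_xy / math.sqrt(s_xx * s_yy)

for name, value in [("n", n), ("Xbar", x_bar), ("Ybar", y_bar), ("Sxx", s_xx),
                    ("Syy", s_yy), ("Sxy", s_xy), ("b1", b1), ("b0", b0), ("r", r)]:
    print(f"{name}: {value:.4f}")
```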