A psychology instructor wants to find out a suitable predictor of the Final examination marks of his students. He thinks that the Assignment marks or the Mid-term test marks can be used for this purpose. However he is not sure which of those is more suitable. The following table shows the Assignment marks (out of 20), Mid-term test marks (out of 20) and the Final examination marks (out of 40) of 5 randomly selected students of his psychology class last year. The data in a given row are related to the same student.
Student number | Assignment marks | Mid-term test marks | Final exam marks |
1 | 14 | 11 | 23 |
2 | 17 | 15 | 40 |
3 | 20 | 20 | 40 |
4 | 10 | 11 | 29 |
5 | 16 | 13 | 35 |
Assuming that there is a linear relationship between the Assignment marks and the Final examination marks, calculate the Pearson’s correlation coefficient. Round your answer to 3 decimal places.
Assuming that there is a linear relationship between the Mid-term test marks and the Final examination marks,
i. Derive the least squares prediction line to predict the Final examination marks based on the Mid-term test marks. (9 points)
ii. Calculate the coefficient of determination for the least squares prediction line. (3 points)
iii. Interpret the value of the coefficient of determination in relation to this situation. (2 points)
Out of the two variables ‘Assignment marks’ and ‘Mid-term test marks’, which variable is more suitable to use as the independent variable in a least squares prediction line to predict the Final examination marks? Explain the reason for your answer. (
Assignment marks (x) and Final examination marks (y):
x | y | (x-x̅)² | (y-ȳ)² | (x-x̅)(y-ȳ) |
14 | 23 | 1.96 | 108.16 | 14.56 |
17 | 40 | 2.56 | 43.56 | 10.56 |
20 | 40 | 21.16 | 43.56 | 30.36 |
10 | 29 | 29.16 | 19.36 | 23.76 |
16 | 35 | 0.36 | 2.56 | 0.96 |
statistic | x | y | SSxx = Σ(x-x̅)² | SSyy = Σ(y-ȳ)² | SSxy = Σ(x-x̅)(y-ȳ) |
total sum | 77.00 | 167.00 | 55.20 | 217.20 | 80.20 |
mean | 15.40 | 33.40 | | | |
correlation coefficient, r = SSxy/√(SSxx·SSyy) = 80.20/√(55.20 × 217.20) = 0.732
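The hand calculation above can be cross-checked with a short script. This is a minimal sketch in plain Python; the function name `pearson_r` and the variable names are illustrative choices, not part of the original solution.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation computed as SSxy / sqrt(SSxx * SSyy)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squared deviations and cross-products, as in the table.
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return ss_xy / sqrt(ss_xx * ss_yy)

assignment = [14, 17, 20, 10, 16]   # Assignment marks (out of 20)
final = [23, 40, 40, 29, 35]        # Final examination marks (out of 40)

r = pearson_r(assignment, final)
print(round(r, 3))  # 0.732
```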
====================
Mid-term test marks (x) and Final examination marks (y):
x | y | (x-x̅)² | (y-ȳ)² | (x-x̅)(y-ȳ) |
11 | 23 | 9.00 | 108.16 | 31.20 |
15 | 40 | 1.00 | 43.56 | 6.60 |
20 | 40 | 36.00 | 43.56 | 39.60 |
11 | 29 | 9.00 | 19.36 | 13.20 |
13 | 35 | 1.00 | 2.56 | -1.60 |
statistic | x | y | SSxx = Σ(x-x̅)² | SSyy = Σ(y-ȳ)² | SSxy = Σ(x-x̅)(y-ȳ) |
total sum | 70.00 | 167.00 | 56.00 | 217.20 | 89.00 |
mean | 14.00 | 33.40 | | | |
i)
sample size , n = 5
here, x̅ = Σx / n= 14.000 ,
ȳ = Σy/n = 33.400
SSxx = Σ(x-x̅)² = 56.0000
SSxy= Σ(x-x̅)(y-ȳ) = 89.0
estimated slope, β1 = SSxy/SSxx = 89.0/56.000 = 1.58929
intercept, β0 = ȳ - β1*x̅ = 33.400 - 1.58929*14.000 = 11.15000
so, the regression line is Ŷ = 11.15 + 1.59*x
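As a check on the slope and intercept, the same least squares formulas can be evaluated in a few lines of Python. This is only a sketch; the helper name `least_squares_line` is a hypothetical choice made for illustration.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = ss_xy / ss_xx                 # b1 = SSxy / SSxx
    intercept = y_bar - slope * x_bar     # b0 = y_bar - b1 * x_bar
    return intercept, slope

midterm = [11, 15, 20, 11, 13]   # Mid-term test marks (out of 20)
final = [23, 40, 40, 29, 35]     # Final examination marks (out of 40)

b0, b1 = least_squares_line(midterm, final)
print(round(b0, 2), round(b1, 5))  # 11.15 1.58929
```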
ii) R² = (SSxy)²/(SSxx·SSyy) = (89.0)²/(56.0 × 217.2) = 0.6512
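iii) About 65.12% of the variation in the Final examination marks is explained by the linear relationship with the Mid-term test marks; the remaining 34.88% or so is not explained by the prediction line.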
iv) ‘Mid-term test marks’ is the more suitable independent variable for a least squares prediction line to predict the Final examination marks, because its R² (0.6512) is higher than that of ‘Assignment marks’ (r² = 0.732² ≈ 0.536), so it explains more of the variation in the Final examination marks.
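The comparison of the two candidate predictors can be reproduced with the sketch below, which computes R² for each simple linear regression from the same sums of squares. The function name `r_squared` is an assumption introduced here for illustration.

```python
def r_squared(x, y):
    """Coefficient of determination for simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return ss_xy ** 2 / (ss_xx * ss_yy)   # R^2 = SSxy^2 / (SSxx * SSyy)

assignment = [14, 17, 20, 10, 16]
midterm = [11, 15, 20, 11, 13]
final = [23, 40, 40, 29, 35]

r2_assignment = r_squared(assignment, final)
r2_midterm = r_squared(midterm, final)
print(round(r2_assignment, 4), round(r2_midterm, 4))  # 0.5365 0.6512

# The predictor with the larger R^2 explains more of the variation in y.
better = "Mid-term test marks" if r2_midterm > r2_assignment else "Assignment marks"
print(better)  # Mid-term test marks
```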