The following data represent the number of days absent (x) and the final grade (y) for a sample of college students in a statistics course at a large university.
Number of absences (x): 0 1 2 3 4 5 6 7 8 9
Final grade (y): 89.2 86.4 83.5 81.1 78.2 73.9 64.3 71.8 65.5 66.2
a) Draw a scatter diagram on graph paper. The number of days absent is the explanatory variable and the final grade is the response variable.
b) Compute the linear correlation coefficient, r. Interpret your answer.
c) Compare |r| to the critical r to determine if a linear relation exists between x and y.
d) Find an equation, the least squares regression line, that best fits your data.
e) Accurately draw the least squares regression line on your scatter diagram from part (a). Be sure to identify your plotted points.
f) Interpret the slope and the y-intercept.
g) Compute the final grade using your least squares regression equation for a student who misses 5 classes. Now find the residual.
h) Use your graph to predict the final grade of a student who misses 10 classes.
i) Determine and interpret the coefficient of determination, R².
X | Y | XY | X² | Y² |
0 | 89.2 | 0 | 0 | 7956.64 |
1 | 86.4 | 86.4 | 1 | 7464.96 |
2 | 83.5 | 167 | 4 | 6972.25 |
3 | 81.1 | 243.3 | 9 | 6577.21 |
4 | 78.2 | 312.8 | 16 | 6115.24 |
5 | 73.9 | 369.5 | 25 | 5461.21 |
6 | 64.3 | 385.8 | 36 | 4134.49 |
7 | 71.8 | 502.6 | 49 | 5155.24 |
8 | 65.5 | 524 | 64 | 4290.25 |
9 | 66.2 | 595.8 | 81 | 4382.44 |
Σx = 45
Σy = 760.1
Σxy = 3187.2
Σx² = 285
Σy² = 58509.93
Sample size, n = 10
x̄ = Σx/n = 45/10 = 4.5
ȳ = Σy/n = 760.1/10 = 76.01
SSxx = Σx² - (Σx)²/n = 285 - (45)²/10 = 82.5
SSyy = Σy² - (Σy)²/n = 58509.93 - (760.1)²/10 = 734.729
SSxy = Σxy - (Σx)(Σy)/n = 3187.2 - (45)(760.1)/10 = -233.25
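The column sums and the three SS quantities can be double-checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names (sum_xy, ss_xx, and so on) are chosen here for illustration and are not part of the original solution.

```python
# Data from the problem statement
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [89.2, 86.4, 83.5, 81.1, 78.2, 73.9, 64.3, 71.8, 65.5, 66.2]
n = len(x)

# Column totals from the table
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

# Corrected sums of squares and cross-products
ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n

print(round(ss_xx, 3), round(ss_yy, 3), round(ss_xy, 3))
```

Rounded to three decimals, the printed values should agree with SSxx = 82.5, SSyy = 734.729 and SSxy = -233.25.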
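a) Scatter plot: plot the ten (x, y) pairs with the number of absences on the horizontal axis and the final grade on the vertical axis. The points trend downward from left to right.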
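b) Correlation coefficient, r = SSxy/√(SSxx*SSyy) = -233.25/√(82.5*734.729) = -0.9474
Because r is close to -1, there is a strong negative linear relation between the number of absences and the final grade.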
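c) df = n - 2 = 8
Critical r = 0.632 (table value for df = 8 at the 0.05 significance level)
As |r| = 0.9474 > 0.632, we reject the null hypothesis of no linear relation.
There is a significant linear relation between x and y.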
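The calculation in parts (b) and (c) can be sketched in Python as follows. The SS values are taken from the working above, and 0.632 is the table value quoted in part (c); the variable names are illustrative only.

```python
import math

# Values computed earlier in the solution
ss_xx, ss_yy, ss_xy = 82.5, 734.729, -233.25

# Linear correlation coefficient
r = ss_xy / math.sqrt(ss_xx * ss_yy)

# Critical value quoted in part (c) for n = 10 (df = 8)
r_critical = 0.632

# A linear relation is supported when |r| exceeds the critical value
linear_relation = abs(r) > r_critical

print(round(r, 4), linear_relation)  # prints: -0.9474 True
```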
d)
Slope, b = SSxy/SSxx = -233.25/82.5 = -2.82727
y-intercept, a = y̅ -b* x̅ = 76.01 - (-2.82727)*4.5 = 88.73273
Regression equation:
ŷ = 88.7327 - 2.8273x
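As a quick check of part (d), the slope and intercept can be recomputed from the SS values and the means. This is a small sketch under the assumption that the earlier values are correct; the names b and a follow the solution's notation.

```python
# Values computed earlier in the solution
ss_xx, ss_xy = 82.5, -233.25
x_bar, y_bar = 4.5, 76.01

# Least squares slope and intercept
b = ss_xy / ss_xx
a = y_bar - b * x_bar

print(round(b, 5), round(a, 5))  # prints: -2.82727 88.73273
```

The least squares line always passes through the point of means, so a + b * x_bar reproduces y_bar = 76.01.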
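e) To draw the line, plot two points that satisfy the regression equation, for example (0, 88.73) and (9, 63.29), and connect them with a straight line on the scatter diagram from part (a).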
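f) Slope:
For each additional class missed, the final grade decreases by about 2.83 points, on average.
y-intercept:
A student with no absences (x = 0) is predicted to have a final grade of about 88.73. Since x = 0 is within the observed data, this interpretation is meaningful.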
g) Predicted value of y at x = 5
ŷ = 88.7327 + (-2.8273) * 5 = 74.5964
Residual = y - ŷ = 73.9 - 74.5964 = -0.6964
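h) Predicted value of y at x = 10
ŷ = 88.7327 + (-2.8273) * 10 = 60.46
Note that x = 10 lies outside the observed range of absences (0 to 9), so this prediction is an extrapolation and should be treated with caution.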
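The predictions in parts (g) and (h) and the residual in part (g) can be reproduced with a short sketch. The helper name predict is hypothetical, and the coefficients are recomputed at full precision rather than taken from the rounded equation, which is why the results match the values shown to four decimals.

```python
# Values computed earlier in the solution
ss_xx, ss_xy = 82.5, -233.25
x_bar, y_bar = 4.5, 76.01

b = ss_xy / ss_xx          # slope
a = y_bar - b * x_bar      # y-intercept

def predict(absences):
    """Predicted final grade from the least squares line."""
    return a + b * absences

# Part (g): prediction and residual at x = 5 (observed grade 73.9)
predicted_at_5 = predict(5)
residual_at_5 = 73.9 - predicted_at_5

# Part (h): prediction at x = 10 (outside the observed range 0 to 9)
predicted_at_10 = predict(10)

print(round(predicted_at_5, 4), round(residual_at_5, 4), round(predicted_at_10, 2))
# prints: 74.5964 -0.6964 60.46
```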
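i) Coefficient of determination, r² = (SSxy)²/(SSxx*SSyy) = (-233.25)²/(82.5*734.729) = 0.8976
About 89.76% of the variation in final grades is explained by the least squares regression line, that is, by the linear relation with the number of absences.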
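The coefficient of determination can also be checked from the residuals, as 1 - SSE/SSyy, where SSE is the sum of squared residuals. The sketch below computes it both ways from the raw data; the variable names are illustrative, and the deviation-from-mean formulas used here are algebraically equivalent to the computational formulas in the solution.

```python
# Data from the problem statement
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [89.2, 86.4, 83.5, 81.1, 78.2, 73.9, 64.3, 71.8, 65.5, 66.2]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares in deviation form
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares line
b = ss_xy / ss_xx
a = y_bar - b * x_bar

# Sum of squared residuals (unexplained variation)
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

r2_from_formula = ss_xy ** 2 / (ss_xx * ss_yy)
r2_from_residuals = 1 - sse / ss_yy

print(round(r2_from_formula, 4), round(r2_from_residuals, 4))  # prints: 0.8976 0.8976
```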