These are the final grades and numbers of absences for a set of students. Regress grade on absences. Use a 95% confidence level. Give the equation of estimation. Interpret the equation. According to the regression, how much does a tardy (= 1/2 an absence) change your grade? Evaluate the model. What evaluation criterion did you use? Could this be a case of reverse causality? If so, give an example of how the causation could run in the opposite direction.
Student | Grade | Absences | |
---|---|---|---|
1 | 57 | 6 | |
2 | 87.2 | 0 | |
3 | 87.6 | 2.5 | |
4 | 66.2 | 6 | |
5 | 94.2 | 1 | |
6 | 96.1 | 0 | |
7 | 74.8 | 2.5 | |
8 | 86.6 | 0 | |
9 | 74.6 | 4.5 | |
10 | 90.7 | 1 | |
11 | 85.5 | 1 | |
12 | 83.4 | 2.5 | |
13 | 92.8 | 1 | |
14 | 76.7 | 4 | |
15 | 78.9 | 1.5 | |
16 | 84.6 | 0 | |
17 | 84.7 | 1.5 | |
18 | 86.3 | 2.5 | |
19 | 95.7 | 0 | |
20 | 95.3 | 2 | |
21 | 87.9 | 0 | |
22 | 84.7 | 0 | |
23 | 81.6 | 2 | |
24 | 70.5 | 5.5 | |
25 | 76.7 | 1 | |
26 | 90.1 | 0 | |
27 | 95.1 | 1 | |
28 | 98.2 | 0 | |
29 | 66.5 | 4 | |
30 | 87.1 | 0 | |
31 | 69.8 | 4.5 | |
32 | 77.2 | 2 | |
33 | 81 | 0.5 | |
34 | 76.6 | 0 | |
35 | 84.2 | 0 | |
36 | 79.1 | 1.5 | |
37 | 84.5 | 3.5 | |
38 | 71.4 | 2.5 | |
39 | 68.3 | 5 | |
40 | 92.2 | 0 | |
41 | 69.2 | 5 | |
Regressing Grade on Absences in Excel (go to Data tab -> Data Analysis -> Regression, and choose Grade column as Y-values and Absences column as X-values, Confidence level as 95%) gives us the following output:
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
---|---|---|---|---|---|---|
Intercept | 89.7151986 | 1.36692369 | 65.63292396 | 1.53875E-41 | 86.95033447 | 92.48006274 |
Absences | -3.968040552 | 0.511113307 | -7.763524244 | 1.96325E-09 | -5.001864797 | -2.934216306 |
Equation of estimation obtained using the coefficients above:
Grade = -3.968 * Absences + 89.715
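As a cross-check on the Excel output, the same least-squares line can be computed directly from the data. The following is a minimal sketch in plain Python; the helper name `fit_line` and the list `data` are illustrative choices, not part of the original answer, and the pairs are transcribed from the table in the question.

```python
# (absences, grade) pairs transcribed from the table in the question
data = [
    (6, 57), (0, 87.2), (2.5, 87.6), (6, 66.2), (1, 94.2), (0, 96.1),
    (2.5, 74.8), (0, 86.6), (4.5, 74.6), (1, 90.7), (1, 85.5), (2.5, 83.4),
    (1, 92.8), (4, 76.7), (1.5, 78.9), (0, 84.6), (1.5, 84.7), (2.5, 86.3),
    (0, 95.7), (2, 95.3), (0, 87.9), (0, 84.7), (2, 81.6), (5.5, 70.5),
    (1, 76.7), (0, 90.1), (1, 95.1), (0, 98.2), (4, 66.5), (0, 87.1),
    (4.5, 69.8), (2, 77.2), (0.5, 81), (0, 76.6), (0, 84.2), (1.5, 79.1),
    (3.5, 84.5), (2.5, 71.4), (5, 68.3), (0, 92.2), (5, 69.2),
]


def fit_line(pairs):
    """Ordinary least squares for y = slope * x + intercept (hypothetical helper)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sums of squares and cross-products around the means
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(data)
print(f"Grade = {slope:.3f} * Absences + {intercept:.3f}")
# prints: Grade = -3.968 * Absences + 89.715
```

The sketch only reproduces the two coefficients; the standard errors, t statistics and confidence intervals still come from the Excel output.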
The intercept says that a student with no absences (Absences = 0) has a predicted Grade of 89.715.
The slope says that each additional absence lowers the predicted Grade by 3.968 points, so Grade and Absences are negatively related.
A tardy (1/2 an absence) lowers the predicted grade by 3.968 * 0.5 = 1.984 points.
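The tardy arithmetic can be sketched in a few lines of Python. The sketch also scales the 95% bounds on the slope from the Excel output, which gives a rough range for the tardy effect. It assumes, as the question does, that a tardy counts as exactly half an absence and that the effect is linear; the function name `grade_change` is illustrative.

```python
# Slope estimate and its 95% bounds, copied from the Excel regression output
SLOPE = -3.968040552
SLOPE_LOWER_95 = -5.001864797
SLOPE_UPPER_95 = -2.934216306


def grade_change(absences, slope=SLOPE):
    """Predicted change in grade for a given number of absences (hypothetical helper)."""
    return slope * absences


tardy = 0.5  # assumption from the question: one tardy = 1/2 an absence
point = grade_change(tardy)
low = grade_change(tardy, SLOPE_LOWER_95)
high = grade_change(tardy, SLOPE_UPPER_95)
print(f"{point:.3f} points (95% range {low:.3f} to {high:.3f})")
# prints: -1.984 points (95% range -2.501 to -1.467)
```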
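The p-value of the Absences coefficient is 1.96325E-09, which is far below 0.05 (and below 0.01), so Absences is a statistically significant predictor of Grade at the requested 95% confidence level, and at the 99% level as well. The 95% confidence interval for the slope, -5.002 to -2.934, excludes zero, which supports the same conclusion. Hence, the model captures a real linear relationship between Grade and Absences.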
The evaluation criterion used above is the p-value of the slope (Absences) coefficient. Writing β for that slope, the p-value is much smaller than the significance level, so we can reject the null hypothesis that Grade has no linear relationship with Absences (β = 0), at the 99% confidence level, in favour of the alternative hypothesis (β ≠ 0).
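The p-value shows that the slope is significant, but it does not say how much of the variation in Grade the line explains. As a complementary check, R-squared can be recovered from the reported t statistic, because in a regression with a single predictor R^2 = t^2 / (t^2 + n - 2). The sketch below uses only the t statistic from the Excel output and the sample size n = 41; R-squared is not part of the original answer and is offered only as an additional criterion.

```python
# t statistic of the Absences coefficient, copied from the Excel output
T_STAT = -7.763524244
N = 41  # number of students in the table

# Residual degrees of freedom for a one-predictor regression
df = N - 2

# Identity valid for simple linear regression: R^2 = t^2 / (t^2 + df)
r_squared = T_STAT**2 / (T_STAT**2 + df)
print(f"R^2 = {r_squared:.3f}")
# prints: R^2 = 0.607
```

On this reading, absences account for roughly 61% of the variation in grades, which leaves a substantial share to other factors.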
This could also be a case of reverse causality, where lower grades lead to more absences: for example, students who scored poorly on earlier exams may lose interest and attend fewer classes.