Exposure | Change | Exposure | Change |
---|---|---|---|
0.89 | -6.1 | 0.33 | 1.8 |
0.05 | 0.0 | 0.11 | 1.9 |
0.35 | 0.2 | 0.36 | 2.2 |
0.37 | -1.2 | 0.08 | 2.3 |
0.11 | 6.3 | 0.10 | -3.0 |
0.19 | -4.6 | 0.06 | 8.4 |
0.10 | 1.1 | 0.34 | -2.7 |
0.45 | 1.3 | 0.12 | 3.0 |
0.23 | 6.9 | 0.14 | -2.2 |
0.46 | 1.7 | 0.21 | -1.4 |
0.10 | -3.6 | 0.20 | 0.7 |
Regressing white matter change (Y) on subconcussion exposure (X) (in Excel, go to Data -> Data Analysis -> Regression and select Exposure as the X variable and Change as the Y variable), we obtain the following regression output:
Regression Statistics | Value |
---|---|
Multiple R | 0.368868063 |
R Square | 0.136063648 |
Adjusted R Square | 0.09286683 |
Standard Error | 3.467466463 |
Observations | 22 |

ANOVA | df | SS | MS | F | Significance F |
---|---|---|---|---|---|
Regression | 1 | 37.87170833 | 37.87170833 | 3.149853515 | 0.091157028 |
Residual | 20 | 240.4664735 | 12.02332367 | | |
Total | 21 | 278.3381818 | | | |

Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
---|---|---|---|---|---|---|---|---|
Intercept | 2.272015257 | 1.201555964 | 1.890894244 | 0.073214557 | -0.234386564 | 4.778417078 | -0.234386564 | 4.778417078 |
Exposure | -6.912959936 | 3.895102237 | -1.774782667 | 0.091157028 | -15.03800083 | 1.212080954 | -15.03800083 | 1.212080954 |
Hence, the least-squares regression line is: Change (y) = 2.272 - 6.913 * Exposure (x)
Regression standard error (from the output above) = 3.467
The p-value of the Exposure coefficient is 0.0912, which is below 0.10 but above 0.05. Hence, at the 10% significance level (90% confidence level), the null hypothesis of no linear relationship between subconcussion exposure and white matter change can be rejected in favor of the alternative hypothesis, and Exposure is a statistically significant predictor of white matter Change at that level. The same conclusion does not hold at the 5% significance level, which is consistent with the 95% confidence interval for the slope (-15.04 to 1.21) containing zero.
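The Excel figures above can be cross-checked by hand. The following is a minimal Python sketch, not the method used in the answer: it assumes the 22 (Exposure, Change) pairs are read from the two column pairs of the data table, and the names `DATA`, `fit_line`, `T_CRIT_10` and `T_CRIT_05` are illustrative. The two critical values are taken from a standard Student's t table for 20 degrees of freedom rather than computed.

```python
import math

# (Exposure, Change) pairs transcribed from the data table
# (left column pair first, then the right column pair).
DATA = [
    (0.89, -6.1), (0.05, 0.0), (0.35, 0.2), (0.37, -1.2), (0.11, 6.3),
    (0.19, -4.6), (0.10, 1.1), (0.45, 1.3), (0.23, 6.9), (0.46, 1.7),
    (0.10, -3.6), (0.33, 1.8), (0.11, 1.9), (0.36, 2.2), (0.08, 2.3),
    (0.10, -3.0), (0.06, 8.4), (0.34, -2.7), (0.12, 3.0), (0.14, -2.2),
    (0.21, -1.4), (0.20, 0.7),
]


def fit_line(points):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and the regression ("Standard Error") value.
    sse = sum((y - intercept - slope * x) ** 2 for x, y in points)
    resid_se = math.sqrt(sse / (n - 2))
    slope_se = resid_se / math.sqrt(sxx)
    return {
        "n": n,
        "slope": slope,
        "intercept": intercept,
        "resid_se": resid_se,
        "slope_se": slope_se,
        "t_slope": slope / slope_se,
    }


fit = fit_line(DATA)

# Two-sided critical values of Student's t with n - 2 = 20 degrees of
# freedom, taken from a standard t table (assumed, not computed here).
T_CRIT_10 = 1.725  # 10% significance level
T_CRIT_05 = 2.086  # 5% significance level

significant_at_10 = abs(fit["t_slope"]) > T_CRIT_10
significant_at_05 = abs(fit["t_slope"]) > T_CRIT_05

print(f"Change = {fit['intercept']:.3f} + ({fit['slope']:.3f}) * Exposure")
print(f"regression standard error = {fit['resid_se']:.3f}")
print(f"t statistic for slope = {fit['t_slope']:.3f}")
print(f"significant at 10%: {significant_at_10}, at 5%: {significant_at_05}")
```

Comparing the t statistic with a tabulated critical value is equivalent to comparing the p-value with the significance level, and it avoids needing a t-distribution routine outside the standard library.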
Now, removing the player with an Exposure of 0.89, we obtain the following regression output:
Regression Statistics | Value |
---|---|
Multiple R | 0.104318 |
R Square | 0.010882 |
Adjusted R Square | -0.04118 |
Standard Error | 3.47108 |
Observations | 21 |

ANOVA | df | SS | MS | F | Significance F |
---|---|---|---|---|---|
Regression | 1 | 2.518586244 | 2.518586244 | 0.209039146 | 0.652706126 |
Residual | 19 | 228.919509 | 12.04839521 | | |
Total | 20 | 231.4380952 | | | |

Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
---|---|---|---|---|---|---|---|---|
Intercept | 1.475869 | 1.451936943 | 1.016482993 | 0.322169391 | -1.563069738 | 4.514808 | -1.56307 | 4.514808 |
Exposure | -2.66665 | 5.832463125 | -0.45720799 | 0.652706126 | -14.87413436 | 9.540837 | -14.8741 | 9.540837 |
The regression standard error stays nearly the same, at 3.47, as seen in the output above.
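With a high p-value of 0.65 for the slope coefficient, Exposure is no longer a significant predictor of Change in white matter at any conventional significance level. The slope is also estimated less precisely: its standard error rises from 3.90 to 5.83, because removing the high-exposure data point narrows the range of Exposure values.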
Also, the Exposure coefficient has shrunk in magnitude from -6.91 to -2.67, and R Square has dropped from 0.136 to 0.011, which suggests that white matter Change does not fall as steeply with Exposure once the outlying data point is removed. Hence, the outlier inflated the apparent strength of the linear relationship between the two variables well beyond what the remaining data support.
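The influence of the single high-exposure player can be made concrete by refitting without that point and computing its leverage. The sketch below is illustrative only: the names `DATA`, `fit_line`, `OUTLIER` and `leverage_cutoff` are assumptions, and the 2p/n cutoff (with p = 2 fitted parameters) is a common rule of thumb for flagging high-leverage points, not something stated in the answer.

```python
import math

# (Exposure, Change) pairs transcribed from the data table.
DATA = [
    (0.89, -6.1), (0.05, 0.0), (0.35, 0.2), (0.37, -1.2), (0.11, 6.3),
    (0.19, -4.6), (0.10, 1.1), (0.45, 1.3), (0.23, 6.9), (0.46, 1.7),
    (0.10, -3.6), (0.33, 1.8), (0.11, 1.9), (0.36, 2.2), (0.08, 2.3),
    (0.10, -3.0), (0.06, 8.4), (0.34, -2.7), (0.12, 3.0), (0.14, -2.2),
    (0.21, -1.4), (0.20, 0.7),
]
OUTLIER = (0.89, -6.1)  # the player with the largest exposure


def fit_line(points):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - intercept - slope * x) ** 2 for x, y in points)
    slope_se = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    return {"n": n, "x_bar": x_bar, "sxx": sxx, "slope": slope,
            "intercept": intercept, "slope_se": slope_se}


full = fit_line(DATA)
reduced = fit_line([p for p in DATA if p != OUTLIER])

# Leverage of the outlier in the full fit: h = 1/n + (x - x_bar)^2 / Sxx.
leverage = 1 / full["n"] + (OUTLIER[0] - full["x_bar"]) ** 2 / full["sxx"]
# Rule-of-thumb cutoff 2p/n with p = 2 parameters (an assumption).
leverage_cutoff = 2 * 2 / full["n"]

print(f"slope with outlier:    {full['slope']:.3f} (SE {full['slope_se']:.3f})")
print(f"slope without outlier: {reduced['slope']:.3f} (SE {reduced['slope_se']:.3f})")
print(f"leverage of outlier:   {leverage:.3f} (cutoff {leverage_cutoff:.3f})")
```

A leverage far above the cutoff indicates that the point's Exposure value lies well away from the rest, so it pulls the fitted slope toward itself; this matches the large shift in the coefficient seen when the point is dropped.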