A sample of 31 people took a written driver’s license exam. Two variables were measured on them: The result of the exam (0 = fail, 1 = pass), and how much time (in hours) the person studied for the exam. Using the data, fit an appropriate regression model to determine whether time spent studying is a useful predictor of the chance of passing the exam. Formally assess the overall fit of the model. Formally assess whether time spent studying is a useful predictor (as always, providing numerical justification (test statistic and P-value) for your conclusion). Carefully interpret what the estimated model tells you about how the chance of passing the exam changes as the time spent studying changes. A prospective examinee named Matthew spent 3.0 hours studying for his written exam. Estimate (with a point estimate and with a 90% interval) his probability of passing the exam. Based on this estimate, predict whether he will pass or fail.
SAS code:
DATA three;
  INPUT result hours;  /* result=0 is fail; result=1 is pass */
  cards;
0 0.8
0 1.6
0 1.4
1 2.3
1 1.4
1 3.2
0 0.3
1 1.7
0 1.8
1 2.7
0 0.6
0 1.1
1 2.1
1 2.8
1 3.4
1 3.6
0 1.7
1 0.9
1 2.2
1 3.1
0 1.4
1 1.9
0 0.4
0 1.6
1 2.5
1 3.2
1 1.7
1 1.9
0 2.2
0 1.3
1 1.5
;
run;
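Because the response (pass/fail) is binary, a binary logistic regression model is the appropriate choice here. Note that the Excel SUMMARY OUTPUT tables that follow are ordinary least-squares fits with study time as the response and exam result as the predictor (for example, the first intercept, 1.246154, is the mean study time of the 13 examinees who failed), so they compare mean study time between the two groups rather than model the probability of passing.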
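As a rough illustration of what a logistic fit of these data involves, the sketch below estimates logit(p) = b0 + b1*hours by Newton-Raphson using only the Python standard library. It is a stand-in for a statistics package, not the method used to produce the Excel output in this answer; the function names (prob, score_and_info, fit_logistic, log_lik) are made up for this example. It reports a likelihood-ratio chi-square test for overall fit (1 df), a Wald z test for the slope, and the odds ratio per extra hour of study.

```python
import math

# (result, hours) pairs copied from the SAS DATA step; result 1 = pass, 0 = fail.
DATA = [
    (0, 0.8), (0, 1.6), (0, 1.4), (1, 2.3), (1, 1.4), (1, 3.2), (0, 0.3),
    (1, 1.7), (0, 1.8), (1, 2.7), (0, 0.6), (0, 1.1), (1, 2.1), (1, 2.8),
    (1, 3.4), (1, 3.6), (0, 1.7), (1, 0.9), (1, 2.2), (1, 3.1), (0, 1.4),
    (1, 1.9), (0, 0.4), (0, 1.6), (1, 2.5), (1, 3.2), (1, 1.7), (1, 1.9),
    (0, 2.2), (0, 1.3), (1, 1.5),
]

def prob(b0, b1, x):
    """Pass probability under logit(p) = b0 + b1 * x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def score_and_info(b0, b1, data):
    """Gradient of the log-likelihood and the 2x2 Fisher information."""
    g0 = g1 = i00 = i01 = i11 = 0.0
    for y, x in data:
        p = prob(b0, b1, x)
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        i00 += w
        i01 += w * x
        i11 += w * x * x
    return (g0, g1), (i00, i01, i11)

def fit_logistic(data, tol=1e-10, max_iter=50):
    """Newton-Raphson MLE; returns b0, b1 and (var_b0, cov_b0_b1, var_b1)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        (g0, g1), (i00, i01, i11) = score_and_info(b0, b1, data)
        det = i00 * i11 - i01 * i01
        d0 = (i11 * g0 - i01 * g1) / det   # Newton step: inverse(info) * score
        d1 = (i00 * g1 - i01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    # Covariance matrix = inverse information at the final estimate.
    _, (i00, i01, i11) = score_and_info(b0, b1, data)
    det = i00 * i11 - i01 * i01
    return b0, b1, (i11 / det, -i01 / det, i00 / det)

def log_lik(b0, b1, data):
    """Bernoulli log-likelihood of the data at (b0, b1)."""
    return sum(math.log(prob(b0, b1, x) if y else 1.0 - prob(b0, b1, x))
               for y, x in data)

b0, b1, (v00, v01, v11) = fit_logistic(DATA)

# Overall fit: likelihood-ratio test against the intercept-only model.
p_bar = sum(y for y, _ in DATA) / len(DATA)
ll_null = log_lik(math.log(p_bar / (1.0 - p_bar)), 0.0, DATA)
lr_stat = 2.0 * (log_lik(b0, b1, DATA) - ll_null)
lr_p = math.erfc(math.sqrt(lr_stat / 2.0))        # chi-square upper tail, 1 df

# Slope: Wald z test and odds ratio per additional hour of study.
wald_z = b1 / math.sqrt(v11)
wald_p = math.erfc(abs(wald_z) / math.sqrt(2.0))  # two-sided normal p-value
odds_ratio = math.exp(b1)

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, odds ratio per hour = {odds_ratio:.3f}")
print(f"LR chi-square = {lr_stat:.3f}, p = {lr_p:.5f}")
print(f"Wald z = {wald_z:.3f}, p = {wald_p:.5f}")
```

Under this model, exp(b1) is the factor by which the odds of passing are multiplied for each additional hour of study, which is the quantity the question asks to be interpreted.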
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.625077
R Square | 0.390721
Adjusted R Square | 0.369711
Standard Error | 0.696174
Observations | 31
ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 9.013302 | 9.013302 | 18.59724 | 0.00017
Residual | 29 | 14.05509 | 0.484658
Total | 30 | 23.06839
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | 1.246154 | 0.193084 | 6.45395 | 4.62E-07 | 0.851253 | 1.641055
X Variable 1 | 1.092735 | 0.253391 | 4.312451 | 0.00017 | 0.574493 | 1.610977
The regression statistics with log10(hours) as the response are shown below.
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.600745
R Square | 0.360895
Adjusted R Square | 0.338857
Standard Error | 0.210159
Observations | 31
ANOVA
Source | df | SS | MS | F | Significance F
Regression | 1 | 0.723275 | 0.723275 | 16.37595 | 0.000352
Residual | 29 | 1.28084 | 0.044167
Total | 30 | 2.004115
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | 0.034185 | 0.058288 | 0.586479 | 0.562091 | -0.08503 | 0.153396
X Variable 1 | 0.309546 | 0.076493 | 4.046721 | 0.000352 | 0.1531 | 0.465991
Upper 90% confidence limits for the coefficients of the log10(hours) fit:
Term | Upper 90.0%
Intercept | 0.133223
X Variable 1 | 0.439517
So, taking 1.7 hours as the minimum study time needed, Matthew, who studied 3.0 hours, is predicted to pass at the 90% level.
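The question also asks for a point estimate and a 90% interval for Matthew's probability of passing. The sketch below shows one common way to get them from a logistic fit: compute the fitted logit at 3.0 hours, build a Wald interval on the logit scale, and back-transform both ends. It assumes a Newton-Raphson fit in plain Python (the names fit_logistic and expit are illustrative), a normal-approximation interval, and a 0.5 cutoff for the pass/fail call; none of these choices comes from the Excel output above.

```python
import math
from statistics import NormalDist

# (result, hours) pairs copied from the SAS DATA step; result 1 = pass, 0 = fail.
DATA = [
    (0, 0.8), (0, 1.6), (0, 1.4), (1, 2.3), (1, 1.4), (1, 3.2), (0, 0.3),
    (1, 1.7), (0, 1.8), (1, 2.7), (0, 0.6), (0, 1.1), (1, 2.1), (1, 2.8),
    (1, 3.4), (1, 3.6), (0, 1.7), (1, 0.9), (1, 2.2), (1, 3.1), (0, 1.4),
    (1, 1.9), (0, 0.4), (0, 1.6), (1, 2.5), (1, 3.2), (1, 1.7), (1, 1.9),
    (0, 2.2), (0, 1.3), (1, 1.5),
]

def expit(t):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-t))

def fit_logistic(data, iters=30):
    """Newton-Raphson MLE for logit(p) = b0 + b1*hours.

    Returns b0, b1 and (var_b0, cov_b0_b1, var_b1).
    """
    b0 = b1 = 0.0
    for step in range(iters + 1):
        g0 = g1 = i00 = i01 = i11 = 0.0
        for y, x in data:
            p = expit(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        cov = (i11 / det, -i01 / det, i00 / det)  # inverse Fisher information
        if step == iters:
            break  # the final pass only refreshes cov at the final estimate
        b0 += cov[0] * g0 + cov[1] * g1
        b1 += cov[1] * g0 + cov[2] * g1
    return b0, b1, cov

HOURS = 3.0          # Matthew's study time, from the question
CUTOFF = 0.5         # assumed classification threshold

b0, b1, (v00, v01, v11) = fit_logistic(DATA)

# Fitted log-odds at 3.0 hours and its standard error (delta method).
eta = b0 + b1 * HOURS
se_eta = math.sqrt(v00 + 2.0 * HOURS * v01 + HOURS ** 2 * v11)

# 90% two-sided interval uses the 95th percentile of the standard normal.
z90 = NormalDist().inv_cdf(0.95)
p_hat = expit(eta)
lo, hi = expit(eta - z90 * se_eta), expit(eta + z90 * se_eta)
prediction = "pass" if p_hat >= CUTOFF else "fail"

print(f"Estimated P(pass | 3.0 h) = {p_hat:.4f}")
print(f"90% interval: ({lo:.4f}, {hi:.4f})")
print(f"Prediction: {prediction}")
```

Building the interval on the logit scale and then back-transforming keeps both limits between 0 and 1, which a symmetric interval built directly on the probability scale would not guarantee.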