Question

A sample of 31 people took a written driver’s license exam. Two variables were measured on them: the result of the exam (0 = fail, 1 = pass) and how much time (in hours) the person studied for the exam.

Using the data, fit an appropriate regression model to determine whether time spent studying is a useful predictor of the chance of passing the exam. Formally assess the overall fit of the model. Formally assess whether time spent studying is a useful predictor; as always, provide numerical justification (test statistic and P-value) for your conclusion. Carefully interpret what the estimated model tells you about how the chance of passing the exam changes as the time spent studying changes.

A prospective examinee named Matthew spent 3.0 hours studying for his written exam. Estimate (with a point estimate and with a 90% interval) his probability of passing the exam. Based on this estimate, predict whether he will pass or fail.

SAS code:

DATA three;
INPUT result hours;
/* result=0 is fail; result=1 is pass */
cards;
0 0.8
0 1.6
0 1.4
1 2.3
1 1.4
1 3.2
0 0.3
1 1.7
0 1.8
1 2.7
0 0.6
0 1.1
1 2.1
1 2.8
1 3.4
1 3.6
0 1.7
1 0.9
1 2.2
1 3.1
0 1.4
1 1.9
0 0.4
0 1.6
1 2.5
1 3.2
1 1.7
1 1.9
0 2.2
0 1.3
1 1.5
;
run;

Solutions

Expert Solution

Because the response is binary (0 = fail, 1 = pass), a binary logistic regression model is the appropriate model here.
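A minimal sketch of such a logistic fit, written in Python (standard library only) rather than the SAS used in the question. It is an illustration under stated assumptions, not output from the original solution. The helper names `sigmoid`, `log_lik` and `fit_logistic` are hypothetical. The model logit(P(pass)) = b0 + b1 * hours is fitted by Newton-Raphson. Overall fit is assessed here with a likelihood-ratio test against the intercept-only model (a goodness-of-fit test such as Hosmer-Lemeshow is another option and is not shown), and the usefulness of hours with a Wald test on the slope.

```python
import math

# (result, hours) pairs copied from the DATA step in the question.
DATA = [
    (0, 0.8), (0, 1.6), (0, 1.4), (1, 2.3), (1, 1.4), (1, 3.2), (0, 0.3),
    (1, 1.7), (0, 1.8), (1, 2.7), (0, 0.6), (0, 1.1), (1, 2.1), (1, 2.8),
    (1, 3.4), (1, 3.6), (0, 1.7), (1, 0.9), (1, 2.2), (1, 3.1), (0, 1.4),
    (1, 1.9), (0, 0.4), (0, 1.6), (1, 2.5), (1, 3.2), (1, 1.7), (1, 1.9),
    (0, 2.2), (0, 1.3), (1, 1.5),
]


def sigmoid(t):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-t))


def log_lik(b0, b1, data):
    """Bernoulli log-likelihood of logit(p) = b0 + b1 * hours."""
    total = 0.0
    for y, x in data:
        p = sigmoid(b0 + b1 * x)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def fit_logistic(data, iters=25):
    """Newton-Raphson for a one-predictor logistic regression.

    Returns (b0, b1, cov). cov is the inverse information matrix from the
    last iteration; once the iteration has converged it is effectively
    the covariance matrix at the final estimates.
    """
    b0 = b1 = 0.0
    cov = None
    for _ in range(iters):
        g0 = g1 = i00 = i01 = i11 = 0.0
        for y, x in data:
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the slope
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        cov = [[i11 / det, -i01 / det], [-i01 / det, i00 / det]]
        b0 += cov[0][0] * g0 + cov[0][1] * g1
        b1 += cov[1][0] * g0 + cov[1][1] * g1
    return b0, b1, cov


b0, b1, cov = fit_logistic(DATA)

# Overall fit: likelihood-ratio test against the intercept-only model (1 df).
n = len(DATA)
n_pass = sum(y for y, _ in DATA)
ll_null = n_pass * math.log(n_pass / n) + (n - n_pass) * math.log(1 - n_pass / n)
lr_chi2 = 2.0 * (log_lik(b0, b1, DATA) - ll_null)
lr_p = math.erfc(math.sqrt(lr_chi2 / 2.0))        # chi-square upper tail, 1 df

# Usefulness of hours: Wald test of H0: b1 = 0.
se_b1 = math.sqrt(cov[1][1])
wald_z = b1 / se_b1
wald_p = math.erfc(abs(wald_z) / math.sqrt(2.0))  # two-sided normal P-value

# Interpretation: each extra hour multiplies the odds of passing by exp(b1).
odds_ratio = math.exp(b1)

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, SE(b1) = {se_b1:.4f}")
print(f"LR chi-square = {lr_chi2:.3f}, P = {lr_p:.5f}")
print(f"Wald z = {wald_z:.3f}, P = {wald_p:.5f}")
print(f"odds ratio per extra hour = {odds_ratio:.3f}")
```

In SAS, the corresponding analysis would normally be run with PROC LOGISTIC on the data set `three`. The hand-rolled Newton-Raphson above only makes the computation explicit.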

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.625077
R Square            0.390721
Adjusted R Square   0.369711
Standard Error      0.696174
Observations        31

ANOVA
             df   SS         MS         F          Significance F
Regression    1   9.013302   9.013302   18.59724   0.00017
Residual     29   14.05509   0.484658
Total        30   23.06839

               Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept      1.246154       0.193084         6.45395    4.62E-07   0.851253    1.641055
X Variable 1   1.092735       0.253391         4.312451   0.00017    0.574493    1.610977

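Repeating the least-squares regression with log10(hours) in place of hours as the response (exam result is still the X variable) gives: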

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.600745
R Square            0.360895
Adjusted R Square   0.338857
Standard Error      0.210159
Observations        31

ANOVA
             df   SS         MS         F          Significance F
Regression    1   0.723275   0.723275   16.37595   0.000352
Residual     29   1.28084    0.044167
Total        30   2.004115

               Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept      0.034185       0.058288         0.586479   0.562091   -0.08503    0.153396
X Variable 1   0.309546       0.076493         4.046721   0.000352   0.1531      0.465991

Upper 90% confidence limits for the coefficients of the log10(hours) fit:

               Upper 90.0%
Intercept      0.133223
X Variable 1   0.439517

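From these limits the solution takes about 1.7 hours as the minimum study time for Matthew, so with 3.0 hours of study he is predicted to pass at the 90% level. Note that both summary outputs are ordinary least-squares fits with study time (hours, then log10 hours) as the response and exam result as the predictor. They are not logistic-regression output, so they do not supply the test statistics or the 90% interval for the probability of passing that the question asks for.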
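The question asks for a point estimate and a 90% interval for Matthew's probability of passing after 3.0 hours of study. One hedged way to obtain these from a logistic fit is sketched here in Python (standard library only): build a Wald interval for the linear predictor on the logit scale, then transform its endpoints to probabilities. The helper names `sigmoid` and `fit_logistic`, the Wald construction and the 0.5 classification cutoff are assumptions of this sketch, not part of the original solution.

```python
import math
from statistics import NormalDist

# (result, hours) pairs copied from the DATA step in the question.
DATA = [
    (0, 0.8), (0, 1.6), (0, 1.4), (1, 2.3), (1, 1.4), (1, 3.2), (0, 0.3),
    (1, 1.7), (0, 1.8), (1, 2.7), (0, 0.6), (0, 1.1), (1, 2.1), (1, 2.8),
    (1, 3.4), (1, 3.6), (0, 1.7), (1, 0.9), (1, 2.2), (1, 3.1), (0, 1.4),
    (1, 1.9), (0, 0.4), (0, 1.6), (1, 2.5), (1, 3.2), (1, 1.7), (1, 1.9),
    (0, 2.2), (0, 1.3), (1, 1.5),
]


def sigmoid(t):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-t))


def fit_logistic(data, iters=25):
    """Newton-Raphson fit of logit(p) = b0 + b1 * hours.

    Returns (b0, b1, cov), with cov the inverse information matrix from
    the last iteration (effectively the covariance at convergence).
    """
    b0 = b1 = 0.0
    cov = None
    for _ in range(iters):
        g0 = g1 = i00 = i01 = i11 = 0.0
        for y, x in data:
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        cov = [[i11 / det, -i01 / det], [-i01 / det, i00 / det]]
        b0 += cov[0][0] * g0 + cov[0][1] * g1
        b1 += cov[1][0] * g0 + cov[1][1] * g1
    return b0, b1, cov


b0, b1, cov = fit_logistic(DATA)

hours_new = 3.0                       # Matthew's study time
eta = b0 + b1 * hours_new             # estimated log-odds of passing
# Var(b0 + x*b1) = Var(b0) + 2x Cov(b0, b1) + x^2 Var(b1)
var_eta = cov[0][0] + 2.0 * hours_new * cov[0][1] + hours_new ** 2 * cov[1][1]
se_eta = math.sqrt(var_eta)

z = NormalDist().inv_cdf(0.95)        # two-sided 90% interval
p_hat = sigmoid(eta)
lower = sigmoid(eta - z * se_eta)
upper = sigmoid(eta + z * se_eta)

# Assumed decision rule: predict "pass" when the estimated probability >= 0.5.
prediction = "pass" if p_hat >= 0.5 else "fail"

print(f"P(pass | 3.0 hours) = {p_hat:.4f}")
print(f"90% interval: ({lower:.4f}, {upper:.4f})")
print(f"prediction: {prediction}")
```

Transforming the endpoints from the logit scale keeps the interval inside (0, 1), which an interval computed directly on the probability scale would not guarantee.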

