Question

In a study of the role of young drivers in automobile accidents, data on the percentage of licensed drivers under the age of 21 and the number of fatal accidents per 1000 licenses were collected for 32 cities. The data are stored in Table B. The first column contains the city code, the second column contains the percentage of drivers who are under 21, and the third column contains the number of fatal accidents per 1000 drivers. The primary interest is whether the number of fatal accidents depends on the proportion of licensed drivers who are under 21.

City Number % of drivers under 21 # of fatal accidents per 1000 drivers
1 5 0
2 5 0
3 5 0
4 13 2.029
5 17 5.12
6 7 0.468
7 13 1.463
8 14 3.412
9 8 2.104
10 15 3.146
11 11 2.081
12 14 3.612
13 11 2.117
14 12 2.758
15 9 1.819
16 8 1.483
17 14 3.211
18 10 1.157
19 10 0.871
20 9 1.34
21 15 2.751
22 6 0
23 9 0.712
24 12 1.93
25 17 3.899
26 4 0
27 14 2.992
28 9 0.577
29 8 1.819
30 11 2.218
31 9 1.075
32 15 2.105

Regression analysis, where one variable depends on another, can be used to predict levels of a dependent variable for specified levels of an independent variable. Use the EXCEL REGRESSION command to calculate the intercept and slope of the least‑squares line, as well as the analysis of variance associated with that line. Fill in the following table and use the results to answer the next few questions. Carefully choose your independent and dependent variables and input them correctly using EXCEL’s regression command. In this example, the percentage of drivers under the age of 21 is the independent variable (x) and the number of Fatals/1000 licenses is the dependent variable (y).

The regression equation (least‑squares line) is

Fatals/1000 licenses = ________ (intercept) + ________ (slope) × % under 21

Analysis of variance

Source                         DF          SS                 MS          F               P

Regression                   1         ________   _______  ________   _______                      

Residual (Error)           30        ________   _______

10. What is the estimated increase in the number of fatal accidents per 1000 licenses due to a one percentage point increase in the percentage of drivers under 21 (i.e. the slope)?

11. What is the standard deviation of the estimated slope?

12. What is the estimated number of fatal accidents per 1000 licenses if there were no drivers under the age of 21 (i.e. the y intercept)?

13. What percentage of the variation in accident fatalities can be explained by the linear relationship with drivers under 21 (i.e. 100 × the unadjusted coefficient of determination)?

Solutions

Expert Solution

x y (x-x̅)² (y-ȳ)² (x-x̅)(y-ȳ)
5 0 31.29 3.32 10.19
5 0 31.29 3.32 10.19
5 0 31.29 3.32 10.19
13 2.029 5.79 0.04 0.50
17 5.12 41.04 10.88 21.13
7 0.468 12.92 1.83 4.86
13 1.463 5.79 0.13 -0.86
14 3.412 11.60 2.53 5.42
8 2.104 6.73 0.08 -0.73
15 3.146 19.42 1.76 5.84
11 2.081 0.17 0.07 0.11
14 3.612 11.60 3.21 6.10
11 2.117 0.17 0.09 0.12
12 2.758 1.98 0.88 1.32
9 1.819 2.54 0.00 0.00
8 1.483 6.73 0.11 0.88
14 3.211 11.60 1.93 4.74
10 1.157 0.35 0.44 0.39
10 0.871 0.35 0.90 0.56
9 1.34 2.54 0.23 0.77
15 2.751 19.42 0.87 4.10
6 0 21.10 3.32 8.36
9 0.712 2.54 1.23 1.77
12 1.93 1.98 0.01 0.15
17 3.899 41.04 4.32 13.31
4 0 43.48 3.32 12.01
14 2.992 11.60 1.37 3.99
9 0.577 2.54 1.55 1.98
8 1.819 6.73 0.00 0.00
11 2.218 0.17 0.16 0.16
9 1.075 2.54 0.56 1.19
15 2.105 19.42 0.08 1.25
Sum 339 58.27 407.72 51.83 129.98
Mean 10.59 1.82

The last three column sums are SSxx = 407.72, SSyy = 51.83 and SSxy = 129.98.

Sample size: n = 32
Means: x̅ = Σx / n = 10.594, ȳ = Σy / n = 1.821

SSxx = Σ(x-x̅)² = 407.7188
SSxy = Σ(x-x̅)(y-ȳ) = 129.98

Estimated slope: β1 = SSxy / SSxx = 129.98 / 407.7188 = 0.3188

Intercept: β0 = ȳ - β1·x̅ = 1.821 - 0.3188 × 10.594 = -1.5564

So the regression line is Ŷ = -1.5564 + 0.3188·x
==================

Fatals/1000 licenses = -1.5564 + 0.3188 × % under 21
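
As a cross-check on the hand calculation, the same slope and intercept can be reproduced outside Excel. The following is a minimal Python sketch, not the method the question asks for. The function name `fit_line` and the variable names are illustrative assumptions, and the data are copied from the table in the question.

```python
# Data from the question: x = % of drivers under 21,
# y = fatal accidents per 1000 licenses, one pair per city (32 cities).
x = [5, 5, 5, 13, 17, 7, 13, 14, 8, 15, 11, 14, 11, 12, 9, 8,
     14, 10, 10, 9, 15, 6, 9, 12, 17, 4, 14, 9, 8, 11, 9, 15]
y = [0, 0, 0, 2.029, 5.12, 0.468, 1.463, 3.412, 2.104, 3.146, 2.081,
     3.612, 2.117, 2.758, 1.819, 1.483, 3.211, 1.157, 0.871, 1.34,
     2.751, 0, 0.712, 1.93, 3.899, 0, 2.992, 0.577, 1.819, 2.218,
     1.075, 2.105]


def fit_line(x, y):
    """Least-squares slope and intercept via SSxy / SSxx (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept, ss_xx, ss_xy


slope, intercept, ss_xx, ss_xy = fit_line(x, y)
print(f"SSxx = {ss_xx:.4f}, SSxy = {ss_xy:.2f}")
print(f"Fatals/1000 licenses = {intercept:.4f} + {slope:.4f} * (% under 21)")
```

The printed coefficients should agree with the values above up to rounding.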

Anova table
Source SS df MS F-stat p-value
Regression 41.439 1 41.4392 119.6079 < 0.0001
Error 10.394 30 0.3465
Total 51.833 31
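
The entries of the ANOVA table follow from three sums of squares: SST = SSyy, SSR = (SSxy)²/SSxx and SSE = SST - SSR. The sketch below recomputes them in plain Python under the same assumptions as the solution (simple linear regression, 1 and n - 2 = 30 degrees of freedom). The variable names are illustrative, and the p-value is left out because it needs an F-distribution CDF that the standard library does not provide.

```python
# Data from the question: x = % under 21, y = fatals per 1000 licenses.
x = [5, 5, 5, 13, 17, 7, 13, 14, 8, 15, 11, 14, 11, 12, 9, 8,
     14, 10, 10, 9, 15, 6, 9, 12, 17, 4, 14, 9, 8, 11, 9, 15]
y = [0, 0, 0, 2.029, 5.12, 0.468, 1.463, 3.412, 2.104, 3.146, 2.081,
     3.612, 2.117, 2.758, 1.819, 1.483, 3.211, 1.157, 0.871, 1.34,
     2.751, 0, 0.712, 1.93, 3.899, 0, 2.992, 0.577, 1.819, 2.218,
     1.075, 2.105]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)            # total sum of squares
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

ss_reg = ss_xy ** 2 / ss_xx        # regression sum of squares, df = 1
ss_err = ss_yy - ss_reg            # residual (error) sum of squares, df = n - 2
df_reg, df_err = 1, n - 2

ms_reg = ss_reg / df_reg
ms_err = ss_err / df_err
f_stat = ms_reg / ms_err

print(f"Regression  SS={ss_reg:.3f}  df={df_reg}  MS={ms_reg:.4f}  F={f_stat:.2f}")
print(f"Error       SS={ss_err:.3f}  df={df_err}  MS={ms_err:.4f}")
print(f"Total       SS={ss_yy:.3f}  df={n - 1}")
```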

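10) Slope = 0.3188, i.e. about 0.3188 more fatal accidents per 1000 licenses for each additional percentage point of drivers under 21.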

11) Estimated standard error of the slope: SE(β1) = Se/√SSxx = 0.589 / √407.72 = 0.0292, where Se = √MSE = √0.3465 = 0.589.

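12) Intercept = -1.5564. (A negative accident rate is not physically meaningful; x = 0 lies outside the observed range of 4% to 17%, so this is an extrapolation.)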

13) R² = (SSxy)² / (SSxx · SSyy) = (129.98)² / (407.72 × 51.83) = 0.799, or 79.9%
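
Answers 11 and 13 can be checked the same way. This is a hedged Python sketch with illustrative variable names, assuming the usual simple-regression formulas SE(β1) = √MSE / √SSxx and R² = SSR / SST, which are equivalent to the expressions used above.

```python
import math

# Data from the question: x = % under 21, y = fatals per 1000 licenses.
x = [5, 5, 5, 13, 17, 7, 13, 14, 8, 15, 11, 14, 11, 12, 9, 8,
     14, 10, 10, 9, 15, 6, 9, 12, 17, 4, 14, 9, 8, 11, 9, 15]
y = [0, 0, 0, 2.029, 5.12, 0.468, 1.463, 3.412, 2.104, 3.146, 2.081,
     3.612, 2.117, 2.758, 1.819, 1.483, 3.211, 1.157, 0.871, 1.34,
     2.751, 0, 0.712, 1.93, 3.899, 0, 2.992, 0.577, 1.819, 2.218,
     1.075, 2.105]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

ss_err = ss_yy - ss_xy ** 2 / ss_xx       # residual sum of squares
s_e = math.sqrt(ss_err / (n - 2))         # residual standard error, sqrt(MSE)
se_slope = s_e / math.sqrt(ss_xx)         # standard error of the slope
r_squared = ss_xy ** 2 / (ss_xx * ss_yy)  # unadjusted coefficient of determination

print(f"Se = {s_e:.3f}")
print(f"SE(slope) = {se_slope:.4f}")
print(f"R^2 = {r_squared:.4f} ({100 * r_squared:.1f}% of variation explained)")
```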

