Question

As part of a study on transportation safety, the U.S. Department of Transportation collected data on the number of fatal accidents per 1000 licenses and the percentage of licensed drivers under the age of 21 in a sample of 42 cities. Data collected over a one-year period follow. These data are contained in the file named “Safety.csv”.

1- Find the sample mean and standard deviation for each variable. Round your answers to the nearest thousandth.

2- Use the function lm() in R to run a simple linear regression model on the data provided. Use the function summary() in R to generate the regression output. Use the function aov() in R to generate the corresponding ANOVA table. You ought to be able to determine which is the dependent variable and which is the independent variable in this SLR model.

Please copy your R code and the result and paste them here.

3- Write down the estimated regression function below and provide a practical interpretation of the coefficient of the independent variable.

4- Please find a 95% confidence interval for the coefficient of the independent variable and provide a practical interpretation of this interval.

5- At the 5% level of significance, is there a significant relationship between the two variables? Why or why not?

6- What is the value of the coefficient of determination for this simple linear regression model? Provide a brief interpretation of this value.

7- Use the information from the ANOVA table to compute the standard error of estimate, a.k.a. the residual standard error. This value must match the residual standard error in the regression summary.

8- What is the point estimate of the expected number of fatal accidents per 1000 licenses if 10% of the drivers in a city are under age 21?

9- Suppose we want to develop a 95% confidence interval for the average number of fatal accidents per 1000 licenses for all the cities with 10% of drivers under age 21. What is the estimate of the standard deviation for this confidence interval?

10- Suppose we want to develop a 95% confidence interval for the average number of fatal accidents per 1000 licenses for all the cities with 10% of drivers under age 21. Compute the t value and the margin of error needed for this confidence interval.

Please copy your R code and the result and paste them here.

11- Provide a 95% confidence interval for the average number of fatal accidents per 1000 licenses for all the cities with 10% of drivers under age 21 and a practical interpretation of this confidence interval.

12- Suppose we want to develop a 95% prediction interval for the number of fatal accidents per 1000 licenses for a city with 10% of drivers under age 21. What is the estimate of the standard deviation for this prediction interval?

13- Suppose we want to develop a 95% prediction interval for the number of fatal accidents per 1000 licenses for a city with 10% of drivers under age 21. Compute the margin of error needed for this prediction interval.

14- Provide a 95% prediction interval for the number of fatal accidents per 1000 licenses for a city with 10% of drivers under age 21 and a practical interpretation of this prediction interval.

Percent Under 21    Fatal Accidents per 1000
13 2.962
12 0.708
8 0.885
12 1.652
11 2.091
17 2.627
18 3.83
8 0.368
13 1.142
8 0.645
9 1.028
16 2.801
12 1.405
9 1.433
10 0.039
9 0.338
11 1.849
12 2.246
14 2.855
14 2.352
11 1.294
17 4.1
8 2.19
16 3.623
15 2.623
9 0.835
8 0.82
14 2.89
8 1.267
15 3.224
10 1.014
10 0.493
14 1.443
18 3.614
10 1.926
14 1.643
16 2.943
12 1.913
15 2.814
13 2.634
9 0.926
17 3.256

P.S. I do appreciate your help, but please do not simply copy and paste an irrelevant answer. Thanks.

Solutions

Expert Solution

The R code for Questions 1-4:

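# read.csv() already returns a data frame, so no as.data.frame() wrapper is needed
# (data1.csv is assumed to be a local copy of Safety.csv)
data = read.csv("data1.csv", header = TRUE)

# Sample means: column 1 is percent under 21, column 2 is fatal accidents per 1000
round(mean(data[,1]), 3)
# 12.262
round(mean(data[,2]), 3)
# 1.922

# Sample standard deviations
round(sd(data[,1]), 3)
# 3.132
round(sd(data[,2]), 3)
# 1.071

# Check the column names; the model below assumes they are Age and Accidents
names(data)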

# Accidents is the dependent variable, Age (percent under 21) the independent variable
model1 = lm(Accidents ~ Age, data = data)
summary(model1)
# Call:
# lm(formula = Accidents ~ Age, data = data)
#
# Residuals:
#      Min       1Q   Median       3Q      Max
# -1.23412 -0.26441  0.00772  0.44362  1.49099
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept) -1.59741    0.37167  -4.298 0.000107 ***
# Age          0.28705    0.02939   9.767 3.79e-12 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.5894 on 40 degrees of freedom
# Multiple R-squared:  0.7046, Adjusted R-squared:  0.6972
# F-statistic:  95.4 on 1 and 40 DF,  p-value: 3.794e-12

aov(model1)
# Call:
#    aov(formula = model1)
#
# Terms:
#                      Age Residuals
# Sum of Squares  33.13442  13.89335
# Deg. of Freedom        1        40
#
# Residual standard error: 0.5893503
# Estimated effects may be unbalanced
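For Questions 6 and 7, the sums of squares printed by aov() are enough to recover both the coefficient of determination and the standard error of estimate. The following is a minimal sketch that retypes the two sums of squares and the residual degrees of freedom from the output above; the variable names are illustrative and not part of the original answer.

```r
# Values copied from the aov() output above
ssr <- 33.13442    # sum of squares explained by Age
sse <- 13.89335    # residual sum of squares
df_resid <- 40     # residual degrees of freedom (n - 2 = 42 - 2)

sst <- ssr + sse             # total sum of squares
r_squared <- ssr / sst       # coefficient of determination
s <- sqrt(sse / df_resid)    # standard error of estimate

round(r_squared, 4)   # 0.7046
round(s, 4)           # 0.5894
```

Both values agree with the summary() output: about 70.5% of the variation in fatal accidents per 1000 licenses is explained by the percentage of licensed drivers under 21, and the residual standard error is about 0.589.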

confint(model1, "Age", level = 0.95)
#         2.5 %    97.5 %
# Age 0.2276542 0.3464521
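The interval returned by confint() can be reproduced by hand from the slope estimate and its standard error, and the same two numbers give the t test asked for in Question 5. This is only a sketch that retypes the rounded values from the summary() output, so the last digits may differ slightly from R's own results; the variable names are illustrative.

```r
# Values copied from the summary() output above
b1 <- 0.28705       # estimated slope for Age
se_b1 <- 0.02939    # standard error of the slope
df_resid <- 40      # residual degrees of freedom

t_stat <- b1 / se_b1                         # test statistic for H0: slope = 0
p_value <- 2 * pt(-abs(t_stat), df_resid)    # two-sided p-value
t_crit <- qt(0.975, df_resid)                # critical value for a 95% interval
ci <- b1 + c(-1, 1) * t_crit * se_b1         # 95% confidence interval for the slope

round(t_stat, 2)    # 9.77
p_value < 0.05      # TRUE
round(ci, 3)        # 0.228 0.346
```

Because the p-value is far below 0.05, the null hypothesis of a zero slope is rejected at the 5% level, which is consistent with the interval excluding 0.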

The estimated regression equation is y = -1.59741 + 0.28705 * (% under 21), where y is the predicted number of fatal accidents per 1000 licenses. The coefficient 0.28705 means that for each one-percentage-point increase in the share of licensed drivers under 21, the predicted number of fatal accidents per 1000 licenses increases by 0.28705.
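For Question 8, the point estimate at 10% of drivers under 21 follows directly from the estimated regression equation. The sketch below plugs in the rounded coefficients from the summary() output; predict_accidents is a hypothetical helper name introduced here for illustration.

```r
# Coefficients copied from the summary() output
b0 <- -1.59741    # intercept
b1 <- 0.28705     # slope for percent under 21

# Hypothetical helper: predicted fatal accidents per 1000 licenses
predict_accidents <- function(pct_under_21) {
  b0 + b1 * pct_under_21
}

predict_accidents(10)    # 1.27309
```

So a city with 10% of licensed drivers under 21 is expected to have roughly 1.27 fatal accidents per 1000 licenses.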

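The 95% confidence interval for the slope is (0.2277, 0.3465). We are 95% confident that each one-percentage-point increase in the share of licensed drivers under 21 is associated with an increase of between 0.2277 and 0.3465 fatal accidents per 1000 licenses. Because the interval does not contain 0, the slope is significantly different from zero at the 5% level.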
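Questions 9 to 14 were not covered in the answer above. One possible way to approach them is sketched here, under the assumption that the data are retyped from the table in the question rather than read from a file. The sketch computes the standard errors and margins of error by hand and then obtains the intervals with predict(); the object names (safety, fit, new_city, and so on) are illustrative.

```r
# Data retyped from the table in the question
age <- c(13, 12, 8, 12, 11, 17, 18, 8, 13, 8, 9, 16, 12, 9,
         10, 9, 11, 12, 14, 14, 11, 17, 8, 16, 15, 9, 8, 14,
         8, 15, 10, 10, 14, 18, 10, 14, 16, 12, 15, 13, 9, 17)
accidents <- c(2.962, 0.708, 0.885, 1.652, 2.091, 2.627, 3.830, 0.368,
               1.142, 0.645, 1.028, 2.801, 1.405, 1.433, 0.039, 0.338,
               1.849, 2.246, 2.855, 2.352, 1.294, 4.100, 2.190, 3.623,
               2.623, 0.835, 0.820, 2.890, 1.267, 3.224, 1.014, 0.493,
               1.443, 3.614, 1.926, 1.643, 2.943, 1.913, 2.814, 2.634,
               0.926, 3.256)

safety <- data.frame(Age = age, Accidents = accidents)
fit <- lm(Accidents ~ Age, data = safety)
new_city <- data.frame(Age = 10)

# Pieces needed for the by-hand formulas
n <- nrow(safety)
s <- summary(fit)$sigma                         # residual standard error
sxx <- sum((safety$Age - mean(safety$Age))^2)   # sum of squared deviations of Age
h <- 1 / n + (10 - mean(safety$Age))^2 / sxx
t_crit <- qt(0.975, df = n - 2)                 # t value for 95% intervals

se_mean <- s * sqrt(h)          # Q9: std. deviation estimate for the mean response
moe_mean <- t_crit * se_mean    # Q10: margin of error for the confidence interval
se_pred <- s * sqrt(1 + h)      # Q12: std. deviation estimate for a single city
moe_pred <- t_crit * se_pred    # Q13: margin of error for the prediction interval

# Q11 and Q14: intervals from predict(), columns fit / lwr / upr
conf_int <- predict(fit, new_city, interval = "confidence", level = 0.95)
pred_int <- predict(fit, new_city, interval = "prediction", level = 0.95)

round(c(se_mean = se_mean, moe_mean = moe_mean,
        se_pred = se_pred, moe_pred = moe_pred), 3)
round(conf_int, 2)
round(pred_int, 2)
```

With these data the confidence interval for the mean comes out near (1.05, 1.50) and the prediction interval near (0.06, 2.49) fatal accidents per 1000 licenses. The first describes the average over all cities with 10% of drivers under 21; the second is wider because it must also cover the variation of a single city around that average.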

