As part of a study on transportation safety, the U.S. Department of Transportation collected data on the number of fatal accidents per 1000 licenses and the percentage of licensed drivers under the age of 21 in a sample of 42 cities. Data collected over a one-year period follow. These data are contained in the file named “Safety.csv”.

1- Find the sample mean and standard deviation for each variable. Round your answers to the nearest thousandth.

2- Use the function lm() in R to run a simple linear regression model on the data provided. Use the function summary() in R to generate the regression output. Use the function aov() in R to generate the corresponding ANOVA table. You ought to be able to determine which is the dependent variable and which is the independent variable in this SLR model.

**Please copy your R code and the result and paste them here.**

3- Write down the estimated regression function below and provide a practical interpretation of the coefficient of the independent variable.

4- Please find a 95% confidence interval for the coefficient of the independent variable and provide a practical interpretation of this interval.

5- At the 5% level of significance, is there a significant relationship between the two variables? Why or why not?

6- What is the value of the coefficient of determination for this simple linear regression model? Provide a brief interpretation of this value.

7- Use the information from the ANOVA table to compute the standard error of estimate, a.k.a. the residual standard error. This value must match the residual standard error in the regression summary.

8- What is the point estimate of the **expected** number of fatal accidents per 1000 licenses if 10% of the drivers in a city are under age 21?

9- Suppose we want to develop a 95% confidence interval for the average number of fatal accidents per 1000 licenses for all the cities with 10% of drivers under age 21. What is the estimate of the standard deviation for this confidence interval?

10- Suppose we want to develop a 95% confidence interval for the average number of fatal accidents per 1000 licenses for all the cities with 10% of drivers under age 21. Compute the t value and the margin of error needed for this confidence interval.

**Please copy your R code and the result and paste them here.**

11- Provide a 95% confidence interval for the average number of fatal accidents per 1000 licenses for all the cities with 10% of drivers under age 21 and a practical interpretation of this confidence interval.

12- Suppose we want to develop a 95% prediction interval for the number of fatal accidents per 1000 licenses for a city with 10% of drivers under age 21. What is the estimate of the standard deviation for this prediction interval?

13- Suppose we want to develop a 95% prediction interval for the number of fatal accidents per 1000 licenses for a city with 10% of drivers under age 21. Compute the margin of error needed for this prediction interval.

14- Provide a 95% prediction interval for the number of fatal accidents per 1000 licenses for a city with 10% of drivers under age 21 and a practical interpretation of this prediction interval.

**PS: I do appreciate your help, but please do not simply copy and paste an irrelevant answer.**

Safety.csv

| Percent Under 21 | Fatal Accidents per 1000 |
|---|---|
| 13 | 2.962 |
| 12 | 0.708 |
| 8 | 0.885 |
| 12 | 1.652 |
| 11 | 2.091 |
| 17 | 2.627 |
| 18 | 3.83 |
| 8 | 0.368 |
| 13 | 1.142 |
| 8 | 0.645 |
| 9 | 1.028 |
| 16 | 2.801 |
| 12 | 1.405 |
| 9 | 1.433 |
| 10 | 0.039 |
| 9 | 0.338 |
| 11 | 1.849 |
| 12 | 2.246 |
| 14 | 2.855 |
| 14 | 2.352 |
| 11 | 1.294 |
| 17 | 4.1 |
| 8 | 2.19 |
| 16 | 3.623 |
| 15 | 2.623 |
| 9 | 0.835 |
| 8 | 0.82 |
| 14 | 2.89 |
| 8 | 1.267 |
| 15 | 3.224 |
| 10 | 1.014 |
| 10 | 0.493 |
| 14 | 1.443 |
| 18 | 3.614 |
| 10 | 1.926 |
| 14 | 1.643 |
| 16 | 2.943 |
| 12 | 1.913 |
| 15 | 2.814 |
| 13 | 2.634 |
| 9 | 0.926 |
| 17 | 3.256 |
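For anyone who wants to reproduce the analysis without the CSV file, the table above can be typed directly into R. This is only a convenience sketch: the data frame name `safety` and the short column names `pct_under_21` and `fatal_per_1000` are illustrative choices, not names from the original file.

```r
# Inline copy of the Safety.csv table (42 cities).
# Column names are hypothetical short names chosen for this sketch.
safety <- data.frame(
  pct_under_21 = c(13, 12, 8, 12, 11, 17, 18, 8, 13, 8, 9, 16, 12, 9,
                   10, 9, 11, 12, 14, 14, 11, 17, 8, 16, 15, 9, 8, 14,
                   8, 15, 10, 10, 14, 18, 10, 14, 16, 12, 15, 13, 9, 17),
  fatal_per_1000 = c(2.962, 0.708, 0.885, 1.652, 2.091, 2.627, 3.830,
                     0.368, 1.142, 0.645, 1.028, 2.801, 1.405, 1.433,
                     0.039, 0.338, 1.849, 2.246, 2.855, 2.352, 1.294,
                     4.100, 2.190, 3.623, 2.623, 0.835, 0.820, 2.890,
                     1.267, 3.224, 1.014, 0.493, 1.443, 3.614, 1.926,
                     1.643, 2.943, 1.913, 2.814, 2.634, 0.926, 3.256)
)

# Quick sanity check that all 42 rows were entered.
stopifnot(nrow(safety) == 42)

# Means and standard deviations, rounded to the nearest thousandth.
round(sapply(safety, mean), 3)
round(sapply(safety, sd), 3)
```

With this data frame, the independent and dependent variables would be `safety$pct_under_21` and `safety$fatal_per_1000` respectively.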

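First import the given dataset "Safety.csv" into R, then run the following R code. The backtick-quoted column names below assume the CSV header is read as-is (for example with check.names = FALSE).

safety = read.csv("Safety.csv", check.names = FALSE); # assumes the file is in the working directory

x = safety$`Percent Under 21`; # independent variable

y = safety$`Fatal Accidents per 1000`; # dependent variable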

**Ans (1):**

R code for finding the sample mean and standard deviation of both variables:

sample_mean_x = mean(x); # sample mean of 'Percent Under 21'

round(sample_mean_x, 3);

sd_x = sd(x); # standard deviation of 'Percent Under 21'

round(sd_x, 3);

sample_mean_y = mean(y); # sample mean of 'Fatal Accidents per 1000'

round(sample_mean_y, 3);

sd_y = sd(y); # standard deviation of 'Fatal Accidents per 1000'

round(sd_y, 3);

Then the output:

| | Percent Under 21 (x) | Fatal Accidents per 1000 (y) |
|---|---|---|
| Sample Mean | 12.262 | 1.922 |
| Standard Deviation | 3.132 | 1.071 |

**Ans (2):**

The R-code for fitting simple linear regression model and ANOVA Table:

model = lm(y~x); #simple linear regression model

summary(model);

anova = aov(model); #corresponding ANOVA table

summary(anova);

Then the output:

**# Simple linear regression model:**

Call: lm(formula = y ~ x)

Residuals:

| Min | 1Q | Median | 3Q | Max |
|---|---|---|---|---|
| -1.2341 | -0.2644 | 0.0077 | 0.4436 | 1.4909 |

Coefficients:

| | Estimate | Std. Error | t value | p-value |
|---|---|---|---|---|
| Intercept | -1.5974 | 0.3717 | -4.298 | 0.000107 *** |
| x | 0.2871 | 0.0294 | 9.767 | 3.79e-12 *** |

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5894 on 40 degrees of freedom

Multiple R-squared: 0.7046, Adjusted R-squared: 0.6972

F-statistic: 95.4 on 1 and 40 DF, p-value: 3.794e-12

**# ANOVA Table:**

| | df | Sum Sq | Mean Sq | F value | p-value |
|---|---|---|---|---|---|
| x | 1 | 33.13 | 33.13 | 95.4 | 3.79e-12 *** |
| Residuals | 40 | 13.89 | 0.35 | | |

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
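The ANOVA table and the regression summary carry the same information, which is what questions 6 and 7 rely on. The sketch below recomputes the residual standard error, R-squared and F statistic from the printed sums of squares only; the variable names are illustrative, and because the inputs are rounded to two decimals the results should be expected to agree with the summary output only to about three decimal places.

```r
# Values copied from the ANOVA table above (rounded as printed).
ss_reg <- 33.13  # Sum Sq for x (regression)
ss_res <- 13.89  # Sum Sq for Residuals
df_res <- 40     # residual degrees of freedom (n - 2 with n = 42)

ms_res    <- ss_res / df_res             # mean squared error
rse       <- sqrt(ms_res)                # residual standard error
r_squared <- ss_reg / (ss_reg + ss_res)  # coefficient of determination
f_value   <- ss_reg / ms_res             # F statistic (regression df is 1)

round(c(rse = rse, r_squared = r_squared, f_value = f_value), 4)
```

In simple linear regression the F statistic is also the square of the slope's t value, which is why both tests report the same p-value.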

**Ans (3):**

From the above output, the estimated regression function is

ŷ = -1.5974 + 0.2871 x

where x is "Percent Under 21" and ŷ is the estimated mean of "Fatal Accidents per 1000".

**# Interpretation of the coefficient of the independent variable:**

Recall: In simple or multiple linear regression, the size of the coefficient for each independent variable gives the size of the effect that variable has on the dependent variable, and the sign of the coefficient (positive or negative) gives the direction of the effect. In regression with a single independent variable, the coefficient tells you how much the dependent variable is expected to increase (if the coefficient is positive) or decrease (if the coefficient is negative) when the independent variable increases by one unit.

In this problem X is "Percent Under 21" and Y is "Fatal Accidents per 1000", so β1^ = 0.2871 is our estimate of how much the expected number of fatal accidents per 1000 licenses increases for every one-percentage-point increase in the share of licensed drivers under 21.
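A small sketch can make the "per one-unit increase" reading concrete. It uses only the rounded coefficients printed in the summary output; the helper name `fitted_rate` is hypothetical and introduced here just for illustration.

```r
# Rounded coefficients copied from summary(model).
b0 <- -1.5974  # intercept
b1 <-  0.2871  # slope for Percent Under 21

# Hypothetical helper: fitted mean fatal accidents per 1000 licenses.
fitted_rate <- function(pct_under_21) b0 + b1 * pct_under_21

# Moving from 10% to 11% of drivers under 21 changes the fitted value
# by exactly the slope, whatever the starting percentage is.
slope_effect <- fitted_rate(11) - fitted_rate(10)
slope_effect

# Fitted value at 10% under 21, based on the rounded coefficients.
fitted_rate(10)
```

The same difference would be obtained between any two percentages one point apart, which is what the slope measures.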

**Ans (4):**

R code for finding a 95% confidence interval for the coefficient of the independent variable:

confint(model, 'x', level = 0.95); # 95% confidence interval for the coefficient of the independent variable

Then the Output:

**# 95% Confidence Interval for the coefficient of the independent variable:**

| Lower 95% | Upper 95% |
|---|---|
| 0.227654 | 0.346452 |

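**# Interpretation of the confidence interval for the coefficient of the independent variable:**

We are 95% confident that the true slope β1 lies between 0.2277 and 0.3465; that is, each additional percentage point of licensed drivers under 21 is associated with an average increase of between 0.2277 and 0.3465 fatal accidents per 1000 licenses. [By "95% confident" we mean that if we were to repeatedly collect new data generated from the same distribution and rebuild the interval each time, about 19 out of every 20 such intervals would contain the true β1.]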
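The interval reported by confint() can also be rebuilt by hand as estimate ± t × standard error. The sketch below does this with the rounded values from the summary output (so it should match the confint() result only to about four decimal places); the variable names are illustrative.

```r
# Rounded values copied from summary(model).
b1     <- 0.2871  # slope estimate
se_b1  <- 0.0294  # standard error of the slope
df_res <- 40      # residual degrees of freedom (n - 2)

# Two-sided 95% critical value from the t distribution.
t_crit <- qt(0.975, df_res)

# Margin of error and interval limits.
moe <- t_crit * se_b1
ci  <- c(lower = b1 - moe, upper = b1 + moe)
round(ci, 4)
```

Because the whole interval lies above zero, it is consistent with the very small p-value reported for the slope.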
