Q6). We are interested in exploring the relationship between the weight of a vehicle and its fuel efficiency (gasoline mileage). The data in the table show the weights, in pounds, and fuel efficiency, measured in miles per gallon, for a sample of 12 vehicles.
Weight (lb) | Fuel Efficiency (mpg) |
---|---|
2710 | 24 |
2550 | 24 |
2680 | 29 |
2720 | 38 |
3000 | 25 |
3410 | 22 |
3640 | 21 |
3700 | 27 |
3880 | 21 |
3900 | 19 |
4060 | 21 |
4710 | 16 |
Part (a) Graph a scatterplot of the data.
Part (b) Find the correlation coefficient.
Determine if the correlation coefficient is significant.
( )
Yes, it is significant.
No, it is not significant.
Part (c) Find the equation of the best fit line. (Round your answers to four decimal places.)
ŷ = ( )x + ( )
Part (d) Write the sentence that interprets the meaning of the slope of the line in the context of the data.
For every one car added to the data set, the average weight will change by the value of the slope.
For every one car added to the data set, the average fuel efficiency will change by the value of the slope.
For every one pound increase in weight, the fuel efficiency changes by the value of the slope.
For every one mile per gallon increase in fuel efficiency, the weight changes by the value of the slope.
Part (e) What percent of the variation in fuel efficiency is explained by the variation in the weight of the vehicles, using the regression line? (Round your answer to the nearest whole number.)
( )%
Part (f) Graph the best fit line on your scatter plot.
Part (g) For the vehicle that weighs 3000 pounds, find the residual (y - ŷ). (Round your answer to two decimal places.)
( )
Does the value predicted by the line underestimate or overestimate the observed data value?
underestimate
overestimate
Part (h) Identify any outliers, using either the graphical or numerical procedure demonstrated in the textbook. (Select all that apply.)
(4710, 16)
no outliers
(2720, 38)
(4060, 21)
(3700, 27)
(2710, 24)
Part (i) The outlier is a hybrid car that runs on gasoline and electric technology, but all other vehicles in the sample have engines that use gasoline only. Explain why it would be appropriate to remove the outlier from the data in this situation.
The outlier does not lie directly on the line, but it is close.
The outlier represents a different population of vehicles compared to the rest.
The outlier lies directly on the line, so the error residual (y - ŷ) is zero.
The outlier is creating a curved least squares regression line.
Remove the outlier from the sample data. Find the new correlation coefficient and coefficient of determination. (Round your answers to two decimal places.)
correlation coefficient ( )
coefficient of determination ( )
Find the new best fit line. (Round your answers to four decimal places.)
ŷ = ( )x + ( )
Part (j) Compare the correlation coefficients and coefficients of determination before and after removing the outlier, and explain what these numbers indicate about how the model has changed.
The first linear model is a better fit, because the first correlation coefficient is closer to zero.
The new linear model is a better fit, because the new correlation coefficient is closer to zero.
The first linear model is a better fit, because the first correlation coefficient is farther from zero.
The new linear model is a better fit, because the new correlation coefficient is farther from zero.
a)
Comment: The scatter plot shows that fuel efficiency tends to decrease as weight increases. Hence, we can conclude that weight and fuel efficiency have a negative correlation.
b)
Ans: Pearson correlation of Fuel Efficiency and Weight = -0.702
P-Value = 0.011
Comment: The p-value of 0.011 is less than the 0.05 level of significance. Hence, we can conclude that Fuel Efficiency and Weight have a significant negative correlation.
Yes, it is significant.
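The correlation and the significance check can be reproduced by hand. The sketch below is one way to do it with the Python standard library only; the helper name `pearson_r` and the constant `T_CRITICAL` are illustrative choices, not part of the question. Because the standard library has no t-distribution, the sketch compares the test statistic with the tabulated two-tailed critical value for 10 degrees of freedom at the 0.05 level (2.228) instead of computing a p-value.

```python
import math

# Data from the question: vehicle weight (lb) and fuel efficiency (mpg).
weights = [2710, 2550, 2680, 2720, 3000, 3410, 3640, 3700, 3880, 3900, 4060, 4710]
mpg = [24, 24, 29, 38, 25, 22, 21, 27, 21, 19, 21, 16]


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(weights, mpg)
n = len(weights)

# Test statistic for H0: rho = 0, with n - 2 = 10 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Two-tailed critical value of t for df = 10 at alpha = 0.05 (from a t table).
T_CRITICAL = 2.228

significant = abs(t_stat) > T_CRITICAL
print(f"r = {r:.3f}, t = {t_stat:.2f}, significant at 0.05: {significant}")
```

The magnitude of the test statistic exceeds the critical value, which agrees with the p-value comparison above.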
c) The equation of the best fit line (rounded to four decimal places) is
Fuel Efficiency = -0.0058 Weight + 43.7591
ŷ = -0.0058x + 43.7591
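For readers who want to check the coefficients, here is a minimal sketch of the least-squares formulas (slope = Sxy / Sxx, intercept = ȳ - slope · x̄) applied to the table. The function name `least_squares` is a hypothetical helper introduced for illustration.

```python
# Data from the question: vehicle weight (lb) and fuel efficiency (mpg).
weights = [2710, 2550, 2680, 2720, 3000, 3410, 3640, 3700, 3880, 3900, 4060, 4710]
mpg = [24, 24, 29, 38, 25, 22, 21, 27, 21, 19, 21, 16]


def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares(weights, mpg)
print(f"y-hat = {slope:.4f} x + {intercept:.4f}")
```

Keeping the unrounded `slope` and `intercept` for later steps avoids the rounding error that appears when the four-decimal slope is multiplied by a weight in the thousands.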
d)
Ans: The slope is -0.0058. Hence, each one-pound increase in weight is associated with a decrease of about 0.0058 miles per gallon in the predicted fuel efficiency.
For every one pound increase in weight, the fuel efficiency changes by the value of the slope.
e)
Ans: r² = (-0.702)² ≈ 0.493, so about 49% of the variation in fuel efficiency is explained by the variation in weight.
f) Graph the best fit line on your scatter plot.
g)
Using the unrounded slope (about -0.0058132) and an intercept of about 43.759, the fitted value for the vehicle that weighs 3000 pounds is
ŷ = -0.0058132 × 3000 + 43.759 ≈ 26.32.
The observed fuel efficiency for the 3000-pound vehicle in the data set is 25. Hence,
the residual (y - ŷ) = 25 - 26.32 = -1.32.
(Using the rounded slope -0.0058 instead gives ŷ = 26.359 and a residual of about -1.36; the difference is rounding error.)
The residual is negative, so the predicted value is larger than the observed value.
Ans: overestimate
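The residual calculation can be sketched as follows. This is an illustrative check that refits the line from the table so that unrounded coefficients are used; the variable names (`x0`, `observed`, `predicted`, `verdict`) are assumptions made for this example.

```python
# Data from the question: vehicle weight (lb) and fuel efficiency (mpg).
weights = [2710, 2550, 2680, 2720, 3000, 3410, 3640, 3700, 3880, 3900, 4060, 4710]
mpg = [24, 24, 29, 38, 25, 22, 21, 27, 21, 19, 21, 16]

# Least-squares fit with unrounded coefficients.
n = len(weights)
mean_x = sum(weights) / n
mean_y = sum(mpg) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weights, mpg))
sxx = sum((x - mean_x) ** 2 for x in weights)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual = observed - predicted for the 3000 lb vehicle.
x0 = 3000
observed = mpg[weights.index(x0)]
predicted = intercept + slope * x0
residual = observed - predicted

# A negative residual means the line predicts more than was observed.
verdict = "overestimates" if residual < 0 else "underestimates"
print(f"predicted = {predicted:.2f}, residual = {residual:.2f}, line {verdict}")
```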
h)
Ans: There is one outlier. Its weight and fuel efficiency are
(2720, 38)
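The outlier can also be found numerically. The sketch below assumes the textbook's numerical procedure is the common rule of flagging any point whose residual is more than two standard deviations from the line, with s = sqrt(SSE / (n - 2)); that rule is an assumption here, since the question does not state it. The names `residuals`, `s`, and `outliers` are illustrative.

```python
import math

# Data from the question: vehicle weight (lb) and fuel efficiency (mpg).
weights = [2710, 2550, 2680, 2720, 3000, 3410, 3640, 3700, 3880, 3900, 4060, 4710]
mpg = [24, 24, 29, 38, 25, 22, 21, 27, 21, 19, 21, 16]

# Least-squares fit.
n = len(weights)
mean_x = sum(weights) / n
mean_y = sum(mpg) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weights, mpg))
sxx = sum((x - mean_x) ** 2 for x in weights)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residuals and their standard deviation (n - 2 degrees of freedom).
residuals = [y - (intercept + slope * x) for x, y in zip(weights, mpg)]
sse = sum(e ** 2 for e in residuals)
s = math.sqrt(sse / (n - 2))

# Assumed rule: a point is a potential outlier if |residual| > 2s.
outliers = [(x, y) for x, y, e in zip(weights, mpg, residuals) if abs(e) > 2 * s]
print(f"2s = {2 * s:.2f}, outliers: {outliers}")
```

Under this rule only (2720, 38) is flagged, which matches the answer above.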
i)
Ans: The outlier represents a different population of vehicles compared to the rest. A hybrid does not belong to the gasoline-only population that the model is meant to describe. (A least squares regression line is always straight, so an outlier cannot make it curved.)
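Removing the outlier (2720, 38) from the sample data and rounding to two decimal places:
Ans: The new correlation coefficient = -0.77 and the new coefficient of determination = 0.59.
The correlation coefficient moves farther from zero (from -0.70 to -0.77) and the coefficient of determination rises from 0.49 to 0.59, so the new line fits the remaining data better.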
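To check the refit, the following sketch drops (2720, 38) and recomputes the correlation coefficient, the coefficient of determination, and the best fit line. The helper name `fit` and the variables with the `_new` suffix are hypothetical, chosen only for this illustration.

```python
import math

# Data from the question: vehicle weight (lb) and fuel efficiency (mpg).
weights = [2710, 2550, 2680, 2720, 3000, 3410, 3640, 3700, 3880, 3900, 4060, 4710]
mpg = [24, 24, 29, 38, 25, 22, 21, 27, 21, 19, 21, 16]


def fit(xs, ys):
    """Return (r, slope, intercept) for a simple linear regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    return sxy / math.sqrt(sxx * syy), slope, mean_y - slope * mean_x


# Drop the hybrid vehicle identified as the outlier.
kept = [(x, y) for x, y in zip(weights, mpg) if (x, y) != (2720, 38)]
xs_new = [x for x, _ in kept]
ys_new = [y for _, y in kept]

r_old, _, _ = fit(weights, mpg)
r_new, slope_new, intercept_new = fit(xs_new, ys_new)
r2_new = r_new ** 2

print(f"before: r = {r_old:.2f}, r^2 = {r_old ** 2:.2f}")
print(f"after:  r = {r_new:.2f}, r^2 = {r2_new:.2f}")
print(f"new line: y-hat = {slope_new:.4f} x + {intercept_new:.4f}")
```

With the 11 remaining vehicles, the slope comes out near -0.0042 and the intercept near 37.09.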