To relate the stopping distance of a car to its speed, ten cars were tested at five different speeds, two cars at each speed. The following data were obtained.
Speed x (mph)        20    20    30    30    40    40    50     50     60     60
Stop. Dist. y (ft)   16.3  26.7  39.2  63.5  65.7  98.4  104.1  155.6  217.2  160.8
a) Fit a least-squares linear equation to these data. Plot the residuals against the speed.
b) Comment on the goodness of the fit based on the overall F-statistic and the residual plot. Which two assumptions of the linear regression model seem to be violated?
c) Based on the residual plot, what transformation of stopping distance should be used to linearize the relationship with respect to speed? A clue to find this transformation is provided by the following engineering argument: In bringing a car to a stop, its kinetic energy is dissipated as its braking energy, and the two are roughly equal. The kinetic energy is proportional to the square of the car's speed, while the braking energy is proportional to the stopping distance, assuming a constant braking force.
d) Make this linearizing transformation and check the goodness of fit. What is the predicted stopping distance according to this model if the car is traveling at 40 mph?
a) Fit a least-squares linear equation to these data. Plot the residuals against the speed.
> #scatter plot with the regression line:
> speed=c(20,20,30,30,40,40,50,50,60,60)
> distance=c(16.3,26.7,39.2,63.5,65.7,98.4,104.1,155.6,217.2,160.8)
> m=lm(distance~speed)
> plot(speed,distance)
> abline(m)
> #To plot the residuals against speed:
> e=residuals(m)
> plot(speed,e)
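(b) The residual plot shows systematic deviations from two assumptions of the linear regression model: linearity and constant variance. The residuals ei follow a clear nonlinear curve, and their spread appears to increase with speed.
The F-statistic is 58.77 with P-value 5.9×10^−5. The very small P-value means that we should reject the hypothesis that β1 = 0, so there is a significant linear trend. The residual plot, however, shows that a straight line in speed does not describe the data adequately.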
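The F-statistic and P-value quoted in (b) can be read off the fitted model directly. The following is a minimal sketch, assuming the data vectors from part (a); the names fit, fstat, and pval are introduced here for illustration only.

```r
# Data from part (a)
speed <- c(20, 20, 30, 30, 40, 40, 50, 50, 60, 60)
distance <- c(16.3, 26.7, 39.2, 63.5, 65.7, 98.4, 104.1, 155.6, 217.2, 160.8)

# Straight-line fit of distance on speed
fit <- lm(distance ~ speed)

# summary()$fstatistic is a named vector: value, numdf, dendf
fstat <- summary(fit)$fstatistic

# The overall P-value is the upper tail of the F distribution
pval <- pf(fstat["value"], fstat["numdf"], fstat["dendf"], lower.tail = FALSE)

print(fstat)
print(unname(pval))
```

Calling summary(fit) prints the same two numbers on its last line, together with the coefficient table.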
(c) The physical argument is that the kinetic energy, which is proportional to the square of speed, must equal the energy dissipated through braking, which is proportional to distance. Setting the two equal makes distance proportional to the square of speed, which suggests the model
distance = β0 + β1×(speed)^2.
(d) We fit the regression of distance on the square of speed.
> speedsq=(speed)^2
> lm(distance~speedsq)
Call:
lm(formula = distance ~ speedsq)
Coefficients:
(Intercept) speedsq
1.62064 0.05174
So the linearized regression line is distance = 1.62 + 0.05174×(speed)^2.
The predicted stopping distance at speed 40 mph is
1.62 + 0.05174×(40)^2 ≈ 84.40 feet.
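Part (d) also asks to check the goodness of fit of the transformed model. One possible way to do this, sketched below under the assumption that comparing R-squared values and re-plotting the residuals is an acceptable check, is to fit both models side by side. The names fit_lin, fit_sq, r2_lin, and r2_sq are illustrative and not part of the original answer.

```r
# Data from part (a)
speed <- c(20, 20, 30, 30, 40, 40, 50, 50, 60, 60)
distance <- c(16.3, 26.7, 39.2, 63.5, 65.7, 98.4, 104.1, 155.6, 217.2, 160.8)
speedsq <- speed^2

# Original straight-line model and the model in speed squared
fit_lin <- lm(distance ~ speed)
fit_sq <- lm(distance ~ speedsq)

# Compare the proportion of variance explained by each model
r2_lin <- summary(fit_lin)$r.squared
r2_sq <- summary(fit_sq)$r.squared
print(c(linear = r2_lin, squared = r2_sq))

# Residuals of the transformed model against speed;
# a curved pattern here would indicate remaining nonlinearity
plot(speed, residuals(fit_sq), xlab = "speed (mph)", ylab = "residual (ft)")
abline(h = 0, lty = 2)
```

The residual plot only addresses the curvature. If the spread of the residuals still grows with speed, the constant-variance assumption remains questionable for this model as well.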
To obtain the 95% prediction interval at speed 40 mph (which the problem does not ask for), we do the following:
> m=lm(distance~speedsq)
> newdata=data.frame(speedsq=1600)
> predict(m,newdata,interval='prediction')
fit lwr upr
1 84.40229 31.35449 137.4501
This gave the fitted value 84.40 and (the fairly wide) prediction interval [31.35,137.45].
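For readers who want to see where the interval comes from, the sketch below recomputes it by hand from the standard simple-regression prediction-interval formula, fit ± t × s × sqrt(1 + 1/n + (x0 − mean(x))^2 / Sxx), with x taken to be the squared speed. The variable names (x0, Sxx, se_pred, pi_manual) are illustrative only.

```r
# Data from part (a) and the transformed predictor
speed <- c(20, 20, 30, 30, 40, 40, 50, 50, 60, 60)
distance <- c(16.3, 26.7, 39.2, 63.5, 65.7, 98.4, 104.1, 155.6, 217.2, 160.8)
speedsq <- speed^2
fit_sq <- lm(distance ~ speedsq)

n <- length(distance)
x0 <- 40^2                               # squared speed at 40 mph
s <- summary(fit_sq)$sigma               # residual standard error
Sxx <- sum((speedsq - mean(speedsq))^2)  # spread of the predictor

# Standard error for predicting a single new observation at x0
se_pred <- s * sqrt(1 + 1 / n + (x0 - mean(speedsq))^2 / Sxx)
tcrit <- qt(0.975, df = n - 2)           # two-sided 95% critical value

yhat <- sum(coef(fit_sq) * c(1, x0))     # fitted value at x0
pi_manual <- c(fit = yhat, lwr = yhat - tcrit * se_pred, upr = yhat + tcrit * se_pred)
print(pi_manual)
```

The leading 1 under the square root accounts for the variability of a single new car, which is why the prediction interval is much wider than a confidence interval for the mean stopping distance would be.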