The file hmw7 prob4.R contains data on the gasoline mileage performance of 32 different automobiles.
(a) Fit a simple linear regression model relating gasoline mileage (y) to engine displacement (x1) and carburetor (x2).
(b) Construct and interpret the partial regression plots for each predictor.
(c) Compute the studentized residuals and the R-student residuals for this model. What information is conveyed by these scaled residuals?
R Studio: Program
y=c(18.90, 17, 20, 18.25, 20.07, 11.2, 22.12, 21.47, 34.70, 30.40, 16.50, 36.50, 21.50, 19.70, 20.30, 17.80, 14.39, 14.89, 17.80, 16.41, 23.54, 21.47, 16.59, 31.90, 29.40, 13.27, 23.90, 19.73, 13.90, 13.27, 13.77, 16.50)
x1=c(350, 350, 250, 351, 225, 440, 231, 262, 89.7, 96.9, 350, 85.3, 171, 258, 140, 302, 500, 440, 350, 318, 231, 360, 400, 96.9, 140, 460, 133.6, 318, 351, 351, 360, 350)
x2=c(4, 4, 1, 2, 1, 4, 2, 2, 2, 2, 4, 2, 2, 1, 2, 2, 4, 4, 4, 2,
2, 2, 4, 2, 2, 4, 2, 2, 2, 2, 4, 4)
#Linear Regression
linearreg <- lm(y~x1+x2)
summary(linearreg)
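#Partial regression (added-variable) plots for each predictor
#Note: crPlots() draws partial residual plots; avPlots() draws partial regression plots
library(car)
avPlots(linearreg)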
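#Studentized (internally studentized) residuals
linearstand <- rstandard(linearreg)
#R-student (externally studentized) residuals
linearstudies <- rstudent(linearreg)
#Plot the R-student residuals against the fitted values
plot(fitted(linearreg), linearstudies, ylab = "R-student Residuals", xlab =
"Fitted Values", main = "Residuals")
abline(h = 0)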
a) Linear regression model:
Based on the p-values in the summary() output, the model as a whole is significant. Individually, x1 (engine displacement) is a significant predictor of y (gasoline mileage), but x2 (carburetor) is not.
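The p-values behind this conclusion can also be pulled out of the fitted model directly, rather than read off the printed summary. The following is a minimal sketch; the names fit, coef_table, p_values and overall_p, and the 0.05 cutoff, are assumptions for illustration and not part of the original program.

```r
# Data copied from the program above
y  <- c(18.90, 17, 20, 18.25, 20.07, 11.2, 22.12, 21.47, 34.70, 30.40, 16.50, 36.50, 21.50, 19.70, 20.30, 17.80, 14.39, 14.89, 17.80, 16.41, 23.54, 21.47, 16.59, 31.90, 29.40, 13.27, 23.90, 19.73, 13.90, 13.27, 13.77, 16.50)
x1 <- c(350, 350, 250, 351, 225, 440, 231, 262, 89.7, 96.9, 350, 85.3, 171, 258, 140, 302, 500, 440, 350, 318, 231, 360, 400, 96.9, 140, 460, 133.6, 318, 351, 351, 360, 350)
x2 <- c(4, 4, 1, 2, 1, 4, 2, 2, 2, 2, 4, 2, 2, 1, 2, 2, 4, 4, 4, 2, 2, 2, 4, 2, 2, 4, 2, 2, 2, 2, 4, 4)

fit <- lm(y ~ x1 + x2)

# Coefficient table: estimate, standard error, t value, two-sided p-value
coef_table <- coef(summary(fit))
p_values <- coef_table[, "Pr(>|t|)"]
print(round(p_values, 4))

# Overall F-test p-value, recomputed from the stored F statistic
f <- summary(fit)$fstatistic
overall_p <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)
print(overall_p)

# Which terms fall below an assumed 0.05 significance level
print(p_values < 0.05)
```

The per-coefficient t-tests assess each predictor given the other is already in the model, while the F-test assesses both predictors jointly.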
b) Partial regression plots:
The plot for x1 shows a clear trend, so x1 appears to be a significant predictor of y after adjusting for x2. The plot for x2 shows no comparable trend: its points scatter without a consistent pattern, so x2 adds little once x1 is in the model.
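To make clear what a partial regression (added-variable) plot shows, it can be built by hand in base R: regress y on the other predictor, regress the predictor of interest on the other predictor, and plot the two sets of residuals against each other. This is a sketch only; the helper names (fit, ry1, rx1, ry2, rx2, slope_av1, slope_av2) are assumptions introduced here.

```r
# Data copied from the program above
y  <- c(18.90, 17, 20, 18.25, 20.07, 11.2, 22.12, 21.47, 34.70, 30.40, 16.50, 36.50, 21.50, 19.70, 20.30, 17.80, 14.39, 14.89, 17.80, 16.41, 23.54, 21.47, 16.59, 31.90, 29.40, 13.27, 23.90, 19.73, 13.90, 13.27, 13.77, 16.50)
x1 <- c(350, 350, 250, 351, 225, 440, 231, 262, 89.7, 96.9, 350, 85.3, 171, 258, 140, 302, 500, 440, 350, 318, 231, 360, 400, 96.9, 140, 460, 133.6, 318, 351, 351, 360, 350)
x2 <- c(4, 4, 1, 2, 1, 4, 2, 2, 2, 2, 4, 2, 2, 1, 2, 2, 4, 4, 4, 2, 2, 2, 4, 2, 2, 4, 2, 2, 2, 2, 4, 4)

fit <- lm(y ~ x1 + x2)

# Added-variable plot for x1: remove the effect of x2 from both y and x1
ry1 <- resid(lm(y ~ x2))
rx1 <- resid(lm(x1 ~ x2))

# Added-variable plot for x2: remove the effect of x1 from both y and x2
ry2 <- resid(lm(y ~ x1))
rx2 <- resid(lm(x2 ~ x1))

par(mfrow = c(1, 2))
plot(rx1, ry1, xlab = "x1 | x2", ylab = "y | x2", main = "Added-variable plot: x1")
abline(lm(ry1 ~ rx1))
plot(rx2, ry2, xlab = "x2 | x1", ylab = "y | x1", main = "Added-variable plot: x2")
abline(lm(ry2 ~ rx2))

# The slope of each fitted line equals that predictor's coefficient in the full model
slope_av1 <- unname(coef(lm(ry1 ~ rx1))[2])
slope_av2 <- unname(coef(lm(ry2 ~ rx2))[2])
print(all.equal(slope_av1, unname(coef(fit)["x1"])))
print(all.equal(slope_av2, unname(coef(fit)["x2"])))
```

Because each line's slope matches the corresponding coefficient in the full model, a nearly flat line in one panel indicates that the predictor contributes little once the other is accounted for.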
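c)
Scaled residuals put every residual on a common scale, so unusually large values (roughly beyond ±2 or ±3) flag potential outliers. R-student residuals are computed with the observation in question left out of the variance estimate, which makes them more sensitive to a single outlying point. Residuals should also be independent of each other and should not follow any trend or pattern. In the plot above, the residuals show no trend or pattern, which is consistent with independent errors.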
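The two kinds of scaled residuals can be computed from their definitions and checked against R's built-in rstandard() and rstudent(). This is a minimal sketch; the variable names (fit, e, h, s, r_int, t_ext) and the cutoff of 2 used to flag observations are assumptions for illustration.

```r
# Data copied from the program above
y  <- c(18.90, 17, 20, 18.25, 20.07, 11.2, 22.12, 21.47, 34.70, 30.40, 16.50, 36.50, 21.50, 19.70, 20.30, 17.80, 14.39, 14.89, 17.80, 16.41, 23.54, 21.47, 16.59, 31.90, 29.40, 13.27, 23.90, 19.73, 13.90, 13.27, 13.77, 16.50)
x1 <- c(350, 350, 250, 351, 225, 440, 231, 262, 89.7, 96.9, 350, 85.3, 171, 258, 140, 302, 500, 440, 350, 318, 231, 360, 400, 96.9, 140, 460, 133.6, 318, 351, 351, 360, 350)
x2 <- c(4, 4, 1, 2, 1, 4, 2, 2, 2, 2, 4, 2, 2, 1, 2, 2, 4, 4, 4, 2, 2, 2, 4, 2, 2, 4, 2, 2, 2, 2, 4, 4)

fit <- lm(y ~ x1 + x2)

e <- resid(fit)            # raw residuals
h <- hatvalues(fit)        # leverages (diagonal of the hat matrix)
s <- summary(fit)$sigma    # residual standard error
n <- length(y)
p <- length(coef(fit))     # number of estimated coefficients, intercept included

# Internally studentized residuals: e_i / (s * sqrt(1 - h_ii))
r_int <- e / (s * sqrt(1 - h))

# R-student (externally studentized) residuals, derived from r_int
t_ext <- r_int * sqrt((n - p - 1) / (n - p - r_int^2))

# Both should agree with the built-in functions
print(all.equal(unname(r_int), unname(rstandard(fit))))
print(all.equal(unname(t_ext), unname(rstudent(fit))))

# Observations whose R-student residual exceeds an assumed cutoff of 2 in absolute value
print(which(abs(t_ext) > 2))
```

Any observation listed by the last line is worth inspecting, not removing automatically; the cutoff is a rule of thumb.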