In: Statistics and Probability
Here is the data (Stat7_prob4.R):
y=c(18.90, 17, 20, 18.25, 20.07, 11.2, 22.12, 21.47, 34.70, 30.40, 16.50, 36.50, 21.50, 19.70, 20.30, 17.80, 14.39, 14.89, 17.80, 16.41, 23.54, 21.47, 16.59, 31.90, 29.40, 13.27, 23.90, 19.73, 13.90, 13.27, 13.77, 16.50)
x1=c(350, 350, 250, 351, 225, 440, 231, 262, 89.7, 96.9, 350, 85.3, 171, 258, 140, 302, 500, 440, 350, 318, 231, 360, 400, 96.9, 140, 460, 133.6, 318, 351, 351, 360, 350)
x2=c(4, 4, 1, 2, 1, 4, 2, 2, 2, 2, 4, 2, 2, 1, 2, 2, 4, 4, 4, 2, 2, 2, 4, 2, 2, 4, 2, 2, 2, 2, 4, 4)
Here is the question:
Please use R software (R/RStudio) and provide all the R code and R output. Please answer all the questions (a, b & c). Pay attention to everything in bold, please. Show all work!
The file Stat7_prob4.R contains data on the gasoline mileage performance of 32 different automobiles.
(a) Fit a multiple linear regression model relating gasoline mileage (y) to engine displacement (x1) and carburetor (x2).
(b) Construct and interpret the partial regression plots for each predictor.
(c) Compute the studentized residuals and the R-student residuals for this model. What information is conveyed by these scaled residuals?
#######################
### R Codes Solution ####
#######################
# y- dependent variable
y=c(18.90, 17, 20, 18.25, 20.07, 11.2, 22.12, 21.47, 34.70, 30.40,
16.50, 36.50, 21.50, 19.70, 20.30, 17.80, 14.39, 14.89, 17.80,
16.41, 23.54, 21.47, 16.59, 31.90, 29.40, 13.27, 23.90, 19.73,
13.90, 13.27, 13.77, 16.50)
# x1 & x2 independent variables
x1=c(350, 350, 250, 351, 225, 440, 231, 262, 89.7, 96.9, 350, 85.3,
171, 258, 140, 302, 500, 440, 350, 318, 231, 360, 400, 96.9, 140,
460, 133.6, 318, 351, 351, 360, 350)
x2=c(4, 4, 1, 2, 1, 4, 2, 2, 2, 2, 4, 2, 2, 1, 2, 2, 4, 4, 4, 2, 2,
2, 4, 2, 2, 4, 2, 2, 2, 2, 4, 4)
# Creating dataframe
data = data.frame(x1,x2,y)
# part-(a) Linear Regression Model
model = lm(y~x1+x2)
# Summarizing model
summary(model)
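As a quick follow-up to the summary output, the fitted coefficients and their 95% confidence intervals can be pulled out directly with `coef()` and `confint()`. This is a minimal sketch that re-declares the same data vectors so it runs on its own:

```r
# Sketch: extract the fitted coefficients and their 95% confidence
# intervals. The data vectors repeat those defined above so the
# snippet runs standalone.
y  <- c(18.90, 17, 20, 18.25, 20.07, 11.2, 22.12, 21.47, 34.70, 30.40,
        16.50, 36.50, 21.50, 19.70, 20.30, 17.80, 14.39, 14.89, 17.80,
        16.41, 23.54, 21.47, 16.59, 31.90, 29.40, 13.27, 23.90, 19.73,
        13.90, 13.27, 13.77, 16.50)
x1 <- c(350, 350, 250, 351, 225, 440, 231, 262, 89.7, 96.9, 350, 85.3,
        171, 258, 140, 302, 500, 440, 350, 318, 231, 360, 400, 96.9, 140,
        460, 133.6, 318, 351, 351, 360, 350)
x2 <- c(4, 4, 1, 2, 1, 4, 2, 2, 2, 2, 4, 2, 2, 1, 2, 2, 4, 4, 4, 2, 2,
        2, 4, 2, 2, 4, 2, 2, 2, 2, 4, 4)
model <- lm(y ~ x1 + x2)
coef(model)     # intercept and the two slope estimates
confint(model)  # 95% confidence intervals for each coefficient
```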
# Part-(b) - Partial Plot with all the predictors
# y, given x2
mod1 <- lm(y ~ x2)
resid1 <- resid(mod1)
# x1, given x2
mod2 <- lm(x1 ~ x2)
resid2 <- resid(mod2)
# y, given x1
mod3 <- lm(y ~ x1 )
resid3 <- resid(mod3)
# x2, given x1
mod4 <- lm(x2 ~ x1)
resid4 <- resid(mod4)
layout(matrix(1:4, nc=2, byrow=TRUE) )
plot(x1, y, main='y ~ x1')
plot(x2, y, main='y ~ x2')
plot(resid2, resid1, main='y|x2 ~ x1|x2')
plot(resid4, resid3, main='y|x1 ~ x2|x1')
# The partial regression plots reveal a strong relationship between x1
# and y, while the plot for x2 does not show any clear trend.
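One way to sanity-check the partial regression (added-variable) plots above is the Frisch-Waugh result: the slope of the residual-on-residual regression must equal the corresponding coefficient from the full model. A minimal sketch, re-declaring the data so it runs standalone:

```r
# Sketch: verify that the slope in the added-variable plot for x1
# equals the x1 coefficient from the full model (Frisch-Waugh).
y  <- c(18.90, 17, 20, 18.25, 20.07, 11.2, 22.12, 21.47, 34.70, 30.40,
        16.50, 36.50, 21.50, 19.70, 20.30, 17.80, 14.39, 14.89, 17.80,
        16.41, 23.54, 21.47, 16.59, 31.90, 29.40, 13.27, 23.90, 19.73,
        13.90, 13.27, 13.77, 16.50)
x1 <- c(350, 350, 250, 351, 225, 440, 231, 262, 89.7, 96.9, 350, 85.3,
        171, 258, 140, 302, 500, 440, 350, 318, 231, 360, 400, 96.9, 140,
        460, 133.6, 318, 351, 351, 360, 350)
x2 <- c(4, 4, 1, 2, 1, 4, 2, 2, 2, 2, 4, 2, 2, 1, 2, 2, 4, 4, 4, 2, 2,
        2, 4, 2, 2, 4, 2, 2, 2, 2, 4, 4)
b_partial <- coef(lm(resid(lm(y ~ x2)) ~ resid(lm(x1 ~ x2))))[2]
b_full    <- coef(lm(y ~ x1 + x2))["x1"]
all.equal(unname(b_partial), unname(b_full))  # TRUE
```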
# Part-(c) Studentized residuals and R-student residuals for the
# linear model
# Studentized residuals
stud_res <- rstandard(model)
stud_res
# R-student residuals
rstud_resid <- rstudent(model)
rstud_resid
# We can use the R commands "rstandard" and "rstudent" to compute the
# studentized residuals and the R-student residuals, respectively.
# Both rescale the ordinary residuals so that they have approximately
# constant variance regardless of the location of xi, which makes them
# useful for detecting outliers. R-student additionally uses a
# leave-one-out estimate of sigma, so it is more sensitive to a single
# unusual observation.
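To make the "what information is conveyed" part concrete, a common rough rule flags observations with |R-student| > 2 as potential outliers. The sketch below (re-declaring the data so it runs standalone) does that and also checks the textbook identity linking the two residual types:

```r
# Sketch: flag potential outliers with |R-student| > 2, and confirm
# the identity t_i = r_i * sqrt((n - p - 1) / (n - p - r_i^2)),
# where r_i is the studentized residual, n is the sample size, and
# p is the number of estimated parameters (here 3).
y  <- c(18.90, 17, 20, 18.25, 20.07, 11.2, 22.12, 21.47, 34.70, 30.40,
        16.50, 36.50, 21.50, 19.70, 20.30, 17.80, 14.39, 14.89, 17.80,
        16.41, 23.54, 21.47, 16.59, 31.90, 29.40, 13.27, 23.90, 19.73,
        13.90, 13.27, 13.77, 16.50)
x1 <- c(350, 350, 250, 351, 225, 440, 231, 262, 89.7, 96.9, 350, 85.3,
        171, 258, 140, 302, 500, 440, 350, 318, 231, 360, 400, 96.9, 140,
        460, 133.6, 318, 351, 351, 360, 350)
x2 <- c(4, 4, 1, 2, 1, 4, 2, 2, 2, 2, 4, 2, 2, 1, 2, 2, 4, 4, 4, 2, 2,
        2, 4, 2, 2, 4, 2, 2, 2, 2, 4, 4)
model <- lm(y ~ x1 + x2)
ri <- rstandard(model)  # internally studentized residuals
ti <- rstudent(model)   # R-student (externally studentized) residuals
n  <- length(y)
p  <- length(coef(model))
all.equal(unname(ti), unname(ri * sqrt((n - p - 1) / (n - p - ri^2))))  # TRUE
which(abs(ti) > 2)      # indices of potential outliers
```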