Question


(a) Fit a multiple linear regression model relating gasoline mileage (y) to engine displacement (x1) and carburetor (x2).

Here is the data (Stat7_prob4.R):

y=c(18.90, 17, 20, 18.25, 20.07, 11.2, 22.12, 21.47, 34.70, 30.40, 16.50, 36.50, 21.50, 19.70, 20.30, 17.80, 14.39, 14.89, 17.80, 16.41, 23.54, 21.47, 16.59, 31.90, 29.40, 13.27, 23.90, 19.73, 13.90, 13.27, 13.77, 16.50)

x1=c(350, 350, 250, 351, 225, 440, 231, 262, 89.7, 96.9, 350, 85.3, 171, 258, 140, 302, 500, 440, 350, 318, 231, 360, 400, 96.9, 140, 460, 133.6, 318, 351, 351, 360, 350)

x2=c(4, 4, 1, 2, 1, 4, 2, 2, 2, 2, 4, 2, 2, 1, 2, 2, 4, 4, 4, 2, 2, 2, 4, 2, 2, 4, 2, 2, 2, 2, 4, 4)

Here is the question:

Please use R/RStudio and provide all the R code and R output. Please answer all the questions (a, b, and c). Pay attention to everything in bold. Show all work!

The file Stat7_prob4.R contains data on the gasoline mileage performance of 32 different automobiles.

(a) Fit a multiple linear regression model relating gasoline mileage (y) to engine displacement (x1) and carburetor (x2).

(b) Construct and interpret the partial regression plots for each predictor.

(c) Compute the studentized residuals and the R-student residuals for this model. What information is conveyed by these scaled residuals?

Solutions

Expert Solution

#######################
### R Code Solution ###
#######################

# y: dependent variable (gasoline mileage)
y=c(18.90, 17, 20, 18.25, 20.07, 11.2, 22.12, 21.47, 34.70, 30.40, 16.50, 36.50, 21.50, 19.70, 20.30, 17.80, 14.39, 14.89, 17.80, 16.41, 23.54, 21.47, 16.59, 31.90, 29.40, 13.27, 23.90, 19.73, 13.90, 13.27, 13.77, 16.50)

# x1 and x2: independent variables (engine displacement and carburetor)
x1=c(350, 350, 250, 351, 225, 440, 231, 262, 89.7, 96.9, 350, 85.3, 171, 258, 140, 302, 500, 440, 350, 318, 231, 360, 400, 96.9, 140, 460, 133.6, 318, 351, 351, 360, 350)
x2=c(4, 4, 1, 2, 1, 4, 2, 2, 2, 2, 4, 2, 2, 1, 2, 2, 4, 4, 4, 2, 2, 2, 4, 2, 2, 4, 2, 2, 2, 2, 4, 4)

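# Create a data frame
data = data.frame(x1, x2, y)

# Part (a): multiple linear regression model with two predictors
model = lm(y ~ x1 + x2, data = data)
# Summarize the fitted model (coefficients, standard errors, R-squared)
summary(model)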


# Part (b): partial regression plots for each predictor
# y, given x2
mod1 <- lm(y ~ x2)
resid1 <- resid(mod1)

# x1, given x2
mod2 <- lm(x1 ~ x2)
resid2 <- resid(mod2)

# y, given x1
mod3 <- lm(y ~ x1 )
resid3 <- resid(mod3)

# x2, given x1
mod4 <- lm(x2 ~ x1)
resid4 <- resid(mod4)


# Top row: raw scatterplots; bottom row: partial regression plots
layout(matrix(1:4, ncol=2, byrow=TRUE))
plot(x1, y, main='y ~ x1')
plot(x2, y, main='y ~ x2')
plot(resid2, resid1, main='y|x2 ~ x1|x2')
plot(resid4, resid3, main='y|x1 ~ x2|x1')

# The partial regression plot for x1 shows a strong linear relationship between y and x1 after adjusting for x2, whereas the plot for x2 shows no clear trend in y once x1 is accounted for.
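
# As a quick check on the partial regression plots, the least-squares slope in each plot should equal the corresponding coefficient of the full model from part (a).
# The following is a minimal sketch of that check; the object names (full, e_y_x2, slope_x1, and so on) are illustrative and not part of the original solution.

```r
# Data from the question
y  <- c(18.90, 17, 20, 18.25, 20.07, 11.2, 22.12, 21.47, 34.70, 30.40, 16.50, 36.50, 21.50, 19.70, 20.30, 17.80, 14.39, 14.89, 17.80, 16.41, 23.54, 21.47, 16.59, 31.90, 29.40, 13.27, 23.90, 19.73, 13.90, 13.27, 13.77, 16.50)
x1 <- c(350, 350, 250, 351, 225, 440, 231, 262, 89.7, 96.9, 350, 85.3, 171, 258, 140, 302, 500, 440, 350, 318, 231, 360, 400, 96.9, 140, 460, 133.6, 318, 351, 351, 360, 350)
x2 <- c(4, 4, 1, 2, 1, 4, 2, 2, 2, 2, 4, 2, 2, 1, 2, 2, 4, 4, 4, 2, 2, 2, 4, 2, 2, 4, 2, 2, 2, 2, 4, 4)

# Full model from part (a)
full <- lm(y ~ x1 + x2)

# Partial regression for x1: residuals of y and of x1, each adjusted for x2
e_y_x2   <- resid(lm(y ~ x2))
e_x1_x2  <- resid(lm(x1 ~ x2))
slope_x1 <- coef(lm(e_y_x2 ~ e_x1_x2))[["e_x1_x2"]]

# Partial regression for x2: residuals of y and of x2, each adjusted for x1
e_y_x1   <- resid(lm(y ~ x1))
e_x2_x1  <- resid(lm(x2 ~ x1))
slope_x2 <- coef(lm(e_y_x1 ~ e_x2_x1))[["e_x2_x1"]]

# Side-by-side comparison with the full-model coefficients
check <- rbind(partial_plot_slope = c(x1 = slope_x1, x2 = slope_x2),
               full_model_coef    = coef(full)[c("x1", "x2")])
print(check)
```

# If the two rows of the printed table agree, each partial regression plot is displaying exactly the effect that the full model attributes to that predictor.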


# Part (c): studentized residuals and R-student residuals for the linear model

# R-student (externally studentized) residuals
rstud_resid = rstudent(model)
rstud_resid

# Studentized (internally studentized) residuals
stud_res = rstandard(model)
stud_res

# The R functions "rstandard" and "rstudent" compute the studentized
# residuals and the R-student residuals, respectively. Both have constant variance regardless of the location of xi, so an observation with a large absolute value stands out as a potential outlier.
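
# To make the two scalings concrete, the sketch below recomputes both kinds of residuals from the ordinary residuals and the leverages, then compares them with rstandard() and rstudent().
# The object names (e, h, mse, r_int, t_ext) and the cutoff of 2 are assumptions for illustration; the cutoff is only a common rule of thumb, not part of the original solution.

```r
# Data from the question
y  <- c(18.90, 17, 20, 18.25, 20.07, 11.2, 22.12, 21.47, 34.70, 30.40, 16.50, 36.50, 21.50, 19.70, 20.30, 17.80, 14.39, 14.89, 17.80, 16.41, 23.54, 21.47, 16.59, 31.90, 29.40, 13.27, 23.90, 19.73, 13.90, 13.27, 13.77, 16.50)
x1 <- c(350, 350, 250, 351, 225, 440, 231, 262, 89.7, 96.9, 350, 85.3, 171, 258, 140, 302, 500, 440, 350, 318, 231, 360, 400, 96.9, 140, 460, 133.6, 318, 351, 351, 360, 350)
x2 <- c(4, 4, 1, 2, 1, 4, 2, 2, 2, 2, 4, 2, 2, 1, 2, 2, 4, 4, 4, 2, 2, 2, 4, 2, 2, 4, 2, 2, 2, 2, 4, 4)

model <- lm(y ~ x1 + x2)

e   <- resid(model)          # ordinary residuals e_i
h   <- hatvalues(model)      # leverages h_ii
n   <- length(y)             # number of observations
p   <- length(coef(model))   # number of estimated coefficients (intercept included)
mse <- sum(e^2) / (n - p)    # residual mean square

# Studentized residuals: r_i = e_i / sqrt(MSE * (1 - h_ii))
r_int <- e / sqrt(mse * (1 - h))

# R-student residuals: same form, but the variance estimate omits observation i
s2_i  <- ((n - p) * mse - e^2 / (1 - h)) / (n - p - 1)
t_ext <- e / sqrt(s2_i * (1 - h))

# Both comparisons should report TRUE
print(all.equal(r_int, rstandard(model)))
print(all.equal(t_ext, rstudent(model)))

# Observations whose R-student residual exceeds the illustrative cutoff
print(which(abs(t_ext) > 2))
```

# Because the R-student version leaves observation i out of its own variance estimate, an unusual point inflates its R-student residual more than its studentized residual, which is why the two are usually reported together.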

