Question

In: Statistics and Probability

The electric power consumed each month by a chemical plant is thought to be related to...

The electric power consumed each month by a chemical plant is thought to be related to the average ambient temperature (x1), the number of days in the month (x2), the average product purity (x3), and the tons of product produced (x4). The past year’s historical data are available and are presented in the following table:

y x1 x2 x3 x4

240 25 24 91 100

236 31 21 90 95

270 45 24 88 110

274 60 25 87 88

301 65 25 91 94

316 72 26 94 99

300 80 25 87 97

296 84 25 86 96

267 75 24 88 110

276 60 25 91 105

288 50 25 90 100

261 38 23 89 98

Fit a multiple linear regression model to this data. Determine whether or not the model you have assumed can be simplified by dropping any of the terms. Calculate the ANOVA table for the model, which you have determined. What is the estimate of the variance in the electric power measurement? Plot residual plots to determine if the assumptions underlying your analysis are met. Is this normal assumption a good one here? Calculate parameter covariance and parameter correlation matrix for your model. Do you have any concerns on the basis of this matrix?

Solutions

Expert Solution

I have used R studio software to answer this question. I will explain step by step with code.

First copy the data into excel and import that in R studio by clicking import data set tab on the right side

1. Fit a multiple linear regression model to this data

Code

fit=lm(y~., data=regression)

summary(fit)

R Studio Output

Call:
lm(formula = y ~ ., data = regression)

Residuals:
Min 1Q Median 3Q Max
-14.098 -9.778 1.767 6.798 13.016

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -123.1312 157.2561 -0.783 0.459
x1 0.7573 0.2791 2.713 0.030 *
x2 7.5188 4.0101 1.875 0.103
x3 2.4831 1.8094 1.372 0.212
x4 -0.4811 0.5552 -0.867 0.415
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 11.79 on 7 degrees of freedom
Multiple R-squared: 0.852,   Adjusted R-squared: 0.7675
F-statistic: 10.08 on 4 and 7 DF, p-value: 0.00496

2. Determine whether or not the model you have assumed can be simplified by dropping any of the terms

We can remove variables if VIF is greater than 5

library(car)
vif(fit)

x1 x2 x3 x4
2.322836 2.160759 1.335410 1.008733

All variables have VIF less than 5

We can remove variables which have p value greater than 0.05

X4 has p value 0.415 so we remove that variable and run the model

fit=lm(y~.-x4, data=regression)
summary(fit)

Call:
lm(formula = y ~ . - x4, data = regression)

Residuals:
Min 1Q Median 3Q Max
-17.810 -6.770 3.257 7.978 9.531

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -162.1350 148.3154 -1.093 0.3061
x1 0.7487 0.2745 2.727 0.0260 *
x2 7.6906 3.9424 1.951 0.0869 .
x3 2.3434 1.7739 1.321 0.2230
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 11.6 on 8 degrees of freedom
Multiple R-squared: 0.8362,   Adjusted R-squared: 0.7747
F-statistic: 13.61 on 3 and 8 DF, p-value: 0.001652

Now X3 has p value 0.2230 whch is greater than 0.05, so we remove that variable and run the model

fit=lm(y~.-x4-x3, data=regression)
summary(fit)

Call:
lm(formula = y ~ . - x4, data = regression)

Residuals:
Min 1Q Median 3Q Max
-17.810 -6.770 3.257 7.978 9.531

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -162.1350 148.3154 -1.093 0.3061
x1 0.7487 0.2745 2.727 0.0260 *
x2 7.6906 3.9424 1.951 0.0869 .
x3 2.3434 1.7739 1.321 0.2230
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 11.6 on 8 degrees of freedom
Multiple R-squared: 0.8362,   Adjusted R-squared: 0.7747
F-statistic: 13.61 on 3 and 8 DF, p-value: 0.001652

> fit=lm(y~.-x4-x3, data=regression)
> summary(fit)

Call:
lm(formula = y ~ . - x4 - x3, data = regression)

Residuals:
Min 1Q Median 3Q Max
-17.051 -9.842 3.167 8.114 13.902

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.9164 81.9126 0.048 0.9629
x1 0.5727 0.2498 2.293 0.0475 *
x2 9.8824 3.7213 2.656 0.0262 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.07 on 9 degrees of freedom
Multiple R-squared: 0.8004,   Adjusted R-squared: 0.7561
F-statistic: 18.05 on 2 and 9 DF, p-value: 0.0007085

The model is built only on x1 and x2 by removing x4 and x3

the model is y= 0.5727*x1+ 9.8824*x2+3.9164

3. Calculate the ANOVA table for the model, which you have determined.

anova(fit)
Analysis of Variance Table

Response: y
Df Sum Sq Mean Sq F value Pr(>F)
x1 1 4233.4 4233.4 29.0470 0.000439 ***
x2 1 1027.8 1027.8 7.0523 0.026234 *
Residuals 9 1311.7 145.7   
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

4. What is the estimate of the variance in the electric power measurement

12.07 on 9 degrees of freedom

5. Plot residual plots to determine if the assumptions underlying your analysis are met.Is this normal assumption a good one here?

plot(fit,which=1)

Ideally the red line the plot should be horizontal or close to horizontal . If it shows a significant pattern , it means we might have missed non-linear components in the model. If that is the case you might want to look at pair wise relation ship between response and each predictor and figure out if any of the predictors need to be transformed to make the relationship linear between response and that predictor

plot(fit,which=2)

This tells you whether you residuals are normal or not. All points in the straight line indicate residuals are normal

plot(fit,which=3)

This plot tells us whether our residuals are homoscedastic or not. If this plot has a pattern then that mean residuals are heteroscedastic. If that is the case which is not in our case , you can again log transform response and rebuild your model. Why log transform of response in these deviations of residuals? Becuase it brings down scale of response and in turn for residuals. Effect of scale coming down is differences being less prominent.

5.Calculate parameter covariance and parameter correlation matrix for your model. Do you have any concerns on the basis of this matrix?

data1=data[,c(1,2,3)]
cor(data1)

y x1 x2
y 1.0000000 0.8025385 0.826963
x1 0.8025385 1.0000000 0.660456
x2 0.8269630 0.6604560 1.000000

cov(data1)

y x1 x2
y 597.53788 380.71970 26.33333
x1 380.71970 376.62879 16.69697
x2 26.33333 16.69697 1.69697

no concerns on the basis of this mtrix


Related Solutions

the electric power consumed each month by a chemical plant is thought to be related to...
the electric power consumed each month by a chemical plant is thought to be related to the average ambient temperature (x1) and the number of days in the month (x2). the past year’s historical data are available and are presented in the following table: y (electric power consumption) x1 (temperature) x2 (no of days) 240 25 24 236 31 21 270 45 24 274 60 25 301 65 25 316 72 26 300 80 25 296 84 25 267 75...
An electric utility is considering a new power plant in northern Arizona. Power from the plant...
An electric utility is considering a new power plant in northern Arizona. Power from the plant would be sold in the Phoenix area, where it is badly needed. Because the firm has received a permit, the plant would be legal; but it would cause some air pollution. The company could spend an additional $40 million at Year 0 to mitigate the environmental problem, but it would not be required to do so. The plant without mitigation would cost $269.63 million,...
An electric utility is considering a new power plant in northern Arizona. Power from the plant...
An electric utility is considering a new power plant in northern Arizona. Power from the plant would be sold in the Phoenix area, where it is badly needed. Because the firm has received a permit, the plant would be legal; but it would cause some air pollution. The company could spend an additional $40 million at Year 0 to mitigate the environmental problem, but it would not be required to do so. The plant without mitigation would require an initial...
An electric utility is considering a new power plant in northern Arizona. Power from the plant...
An electric utility is considering a new power plant in northern Arizona. Power from the plant would be sold in the Phoenix area, where it is badly needed. Because the firm has received a permit, the plant would be legal; but it would cause some air pollution. The company could spend an additional $40 million at Year 0 to mitigate the environmental problem, but it would not be required to do so. The plant without mitigation would require an initial...
An electric utility is considering a new power plant in northern Arizona. Power from the plant...
An electric utility is considering a new power plant in northern Arizona. Power from the plant would be sold in the Phoenix area, where it is badly needed. Because the firm has received a permit, the plant would be legal; but it would cause some air pollution. The company could spend an additional $40 million at Year 0 to mitigate the environmental problem, but it would not be required to do so. The plant without mitigation would require an initial...
An electric utility is considering a new power plant in northern Arizona. Power from the plant...
An electric utility is considering a new power plant in northern Arizona. Power from the plant would be sold in the Phoenix area, where it is badly needed. Because the firm has received a permit, the plant would be legal; but it would cause some air pollution. The company could spend an additional $40 million at Year 0 to mitigate the environmental problem, but it would not be required to do so. The plant without mitigation would require an initial...
An electric utility is considering a new power plant in northern Arizona. Power from the plant...
An electric utility is considering a new power plant in northern Arizona. Power from the plant would be sold in the Phoenix area, where it is badly needed. Because the firm has received a permit, the plant would be legal; but it would cause some air pollution. The company could spend an additional $40 million at Year 0 to mitigate the environmental problem, but it would not be required to do so. The plant without mitigation would require an initial...
An electric utility is considering a new power plant in northern Arizona. Power from the plant...
An electric utility is considering a new power plant in northern Arizona. Power from the plant would be sold in the Phoenix area, where it is badly needed. Because the firm has received a permit, the plant would be legal; but it would cause some air pollution. The company could spend an additional $40 million at Year 0 to mitigate the environmental problem, but it would not be required to do so. The plant without mitigation would cost $270.11 million,...
An electric utility is considering a new power plant in northern Arizona. Power from the plant...
An electric utility is considering a new power plant in northern Arizona. Power from the plant would be sold in the Phoenix area, where it is badly needed. Because the firm has received a permit, the plant would be legal; but it would cause some air pollution. The company could spend an additional $40 million at Year 0 to mitigate the environmental problem, but it would not be required to do so. The plant without mitigation would cost $239.88 million,...
An electric utility is considering a new power plant in northern Arizona. Power from the plant...
An electric utility is considering a new power plant in northern Arizona. Power from the plant would be sold in the Phoenix area, where it is badly needed. Because the firm has received a permit, the plant would be legal; but it would cause some air pollution. The company could spend an additional $40 million at Year 0 to mitigate the environmental problem, but it would not be required to do so. The plant without mitigation would require an initial...
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT