Delmarva Power is a utility company that would like to predict the monthly heating bill for a household in Kent County during the month of January. A random sample of 18 households in the county was selected and their January heating bills were recorded. These data are shown in the table below along with the square footage of the house (SF), the age of the heating system in years (Age), and the type of heating system (heat pump = 1 or natural gas = 0).
| Household | Bill | SF | Age | Type |
|---|---|---|---|---|
| 1 | $255 | 2,100 | 7 | Natural Gas |
| 2 | $286 | 1,900 | 17 | Natural Gas |
| 3 | $296 | 2,000 | 8 | Natural Gas |
| 4 | $300 | 2,300 | 22 | Natural Gas |
| 5 | $305 | 3,000 | 5 | Natural Gas |
| 6 | $317 | 2,700 | 14 | Natural Gas |
| 7 | $321 | 1,500 | 8 | Natural Gas |
| 8 | $321 | 2,800 | 3 | Natural Gas |
| 9 | $339 | 2,550 | 20 | Natural Gas |
| 10 | $349 | 2,500 | 11 | Natural Gas |
| 11 | $369 | 2,100 | 12 | Heat Pump |
| 12 | $374 | 2,500 | 18 | Heat Pump |
| 13 | $381 | 2,300 | 19 | Heat Pump |
| 14 | $413 | 2,500 | 17 | Heat Pump |
| 15 | $419 | 3,200 | 11 | Heat Pump |
| 16 | $441 | 3,100 | 8 | Heat Pump |
| 17 | $522 | 2,500 | 20 | Heat Pump |
| 18 | $560 | 3,550 | 18 | Heat Pump |
Questions:
a) Develop a regression equation that will predict the monthly heating bill for a household in Kent County during the month of January based on the square footage of the house, the age of the heating system, and the type of heating system.
b) Interpret the meaning of the regression coefficients for the heating bill model.
c) Predict the monthly heating bill for a house that has 2,700 square feet and has a heat pump that is ten years old.
d) Construct a 95% confidence interval to estimate the average monthly heating bill for a house that has 2,700 square feet and has a heat pump that is ten years old.
e) Construct a 95% prediction interval to estimate the monthly heating bill for a specific house that has 2,700 square feet and has a heat pump that is ten years old.
f) Show the calculations for the multiple coefficient of determination for the heating bill model and interpret its meaning.
g) Conduct the hypothesis test, showing the calculations, to test the significance of the overall regression model for predicting a heating bill using α = 0.05.
h) Show the calculations for the adjusted multiple coefficient of determination for predicting a heating bill for a house in Kent County during the month of January.
i) Show the calculations for the test statistic for each regression coefficient for the heating bill model using α = 0.05 and interpret the results.
j) Show the calculations for the 95% confidence intervals to estimate the population regression coefficients for the heating model and interpret their meaning.
Result:
a) Develop a regression equation that will predict the monthly heating bill for a household in Kent County during the month of January based on the square footage of the house, the age of the heating system, and the type of heating system.
The regression equation is
Bill = 145.8173 + 0.0582*SF + 2.3687*Age + 94.4710*Type
where Type = 1 for a heat pump and Type = 0 for natural gas.
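As a cross-check, the coefficients can be re-estimated from the table by solving the least-squares normal equations (X'X)b = X'y. The sketch below uses only the Python standard library; the helper name `solve` and the variable names are illustrative choices, not part of the original solution, which appears to come from a spreadsheet regression add-in.

```python
# Data transcribed from the table above: (bill, square feet, age, type),
# with type coded 1 = heat pump and 0 = natural gas.
rows = [
    (255, 2100, 7, 0), (286, 1900, 17, 0), (296, 2000, 8, 0),
    (300, 2300, 22, 0), (305, 3000, 5, 0), (317, 2700, 14, 0),
    (321, 1500, 8, 0), (321, 2800, 3, 0), (339, 2550, 20, 0),
    (349, 2500, 11, 0), (369, 2100, 12, 1), (374, 2500, 18, 1),
    (381, 2300, 19, 1), (413, 2500, 17, 1), (419, 3200, 11, 1),
    (441, 3100, 8, 1), (522, 2500, 20, 1), (560, 3550, 18, 1),
]

y = [float(r[0]) for r in rows]
# Design matrix with a leading column of ones for the intercept.
X = [[1.0, float(r[1]), float(r[2]), float(r[3])] for r in rows]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


p = len(X[0])
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]

# beta = [intercept, SF, Age, Type]
beta = solve(XtX, Xty)
for name, b in zip(["Intercept", "SF", "Age", "Type"], beta):
    print(f"{name:9s} {b:.4f}")
```

In practice a statistics package would normally be used instead of hand-rolled elimination; the point here is only that the reported equation follows directly from the 18 observations.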
b) Interpret the meaning of the regression coefficients for the heating bill model.
Holding age and heating type constant, each additional square foot increases the predicted January bill by $0.0582.
Holding square footage and heating type constant, each additional year of heating-system age increases the predicted bill by $2.3687.
Holding square footage and age constant, a house with a heat pump has a predicted bill $94.4710 higher than a house with natural gas.
c) Predict the monthly heating bill for a house that has 2,700 square feet and has a heat pump that is ten years old.
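Predicted Bill = 145.8173 + 0.0582*2,700 + 2.3687*10 + 94.4710*1
= $421.053
(The value $421.053 is taken from the software output, which presumably uses unrounded coefficients; the four-decimal coefficients shown here give about $421.12.)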
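The substitution can be sketched in a few lines of Python. The dictionary names are illustrative, and the coefficients are the rounded values reported in part a), so the result differs slightly from the software's $421.053.

```python
# Rounded coefficients from part a); Type is 1 for heat pump, 0 for natural gas.
coef = {"Intercept": 145.8173, "SF": 0.0582, "Age": 2.3687, "Type": 94.4710}
house = {"SF": 2700, "Age": 10, "Type": 1}

# Intercept plus the sum of coefficient * value for each predictor.
predicted = coef["Intercept"] + sum(coef[k] * v for k, v in house.items())
print(f"{predicted:.2f}")  # 421.12 with the rounded coefficients
```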
d) Construct a 95% confidence interval to estimate the average monthly heating bill for a house that has 2,700 square feet and has a heat pump that is ten years old.
95% CI = ($379.816, $462.289)
e) Construct a 95% prediction interval to estimate the monthly heating bill for a specific house that has 2,700 square feet and has a heat pump that is ten years old.
95% PI = ($316.377, $525.729)
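The two intervals are linked: the prediction interval adds the residual variance to the variance of the estimated mean. The sketch below backs the standard error of the mean response out of the reported confidence interval and rebuilds the prediction interval from it. The critical value 2.1448 is an assumed t-table value for 14 degrees of freedom (two-sided 95%), and 44.858 is the standard error of the estimate from the regression output; neither calculation is shown in the original answer.

```python
import math

predicted = 421.053
ci_low, ci_high = 379.816, 462.289
s = 44.858       # standard error of the estimate (from the regression output)
t_crit = 2.1448  # assumed t-table value: upper 2.5% point, 14 df

# The CI half-width equals t_crit * SE(mean response).
ci_half = (ci_high - ci_low) / 2
se_mean = ci_half / t_crit

# The prediction SE combines residual variance and mean-response variance.
se_pred = math.sqrt(s ** 2 + se_mean ** 2)
pi_half = t_crit * se_pred
pi_low, pi_high = predicted - pi_half, predicted + pi_half
print(f"95% PI = ({pi_low:.3f}, {pi_high:.3f})")
```

The rebuilt limits agree with the reported prediction interval to within rounding.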
f) Show the calculations for the multiple coefficient of determination for the heating bill model and interpret its meaning.
R² = SSR/SST = 83,886.2096/112,057.7778 = 0.7486
About 74.86% of the variation in January heating bills is explained by square footage, heating-system age, and heating-system type.
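A short Python check of this ratio, using the sums of squares from the ANOVA table, also shows the equivalent form 1 - SSE/SST. Variable names are illustrative.

```python
# Sums of squares from the ANOVA table.
ssr = 83886.2096   # regression
sse = 28171.5681   # residual
sst = 112057.7778  # total

r2 = ssr / sst          # explained share of total variation
r2_alt = 1 - sse / sst  # same quantity from the unexplained share
print(f"{r2:.4f} {r2_alt:.4f}")
```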
g) Conduct the hypothesis test, showing the calculations, to test the significance of the overall regression model for predicting a heating bill using α = 0.05.
ANOVA table
| Source | SS | df | MS | F | p-value |
|---|---|---|---|---|---|
| Regression | 83,886.2096 | 3 | 27,962.0699 | 13.90 | .0002 |
| Residual | 28,171.5681 | 14 | 2,012.2549 | | |
| Total | 112,057.7778 | 17 | | | |
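H0: β1 = β2 = β3 = 0 versus H1: at least one βi ≠ 0.
F = MSR/MSE = 27,962.0699/2,012.2549 = 13.90.
The calculated F = 13.90 exceeds the critical value F(3, 14) = 3.34 at the 0.05 level (p-value = .0002).
H0 is rejected. The overall model is significant.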
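The F statistic can be recomputed from the sums of squares and degrees of freedom in the ANOVA table, as in this minimal sketch. The critical value 3.34 is the one quoted above; variable names are illustrative.

```python
ssr, df_reg = 83886.2096, 3   # regression sum of squares and df (k)
sse, df_res = 28171.5681, 14  # residual sum of squares and df (n - k - 1)

msr = ssr / df_reg  # mean square regression
mse = sse / df_res  # mean square error
f_stat = msr / mse

f_crit = 3.34  # critical F(3, 14) at the 0.05 level, as quoted in the answer
reject_h0 = f_stat > f_crit
print(f"F = {f_stat:.2f}, reject H0: {reject_h0}")
```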
h) Show the calculations for the adjusted multiple coefficient of determination for predicting a heating bill for a house in Kent County during the month of January.
Adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1) = 1 - (1 - 0.7486)(17)/(18 - 3 - 1) = 0.6947
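An equivalent way to reach the same number, sketched below, compares the residual mean square with the total mean square, which makes the degrees-of-freedom penalty explicit. Variable names are illustrative.

```python
n, k = 18, 3
sse = 28171.5681   # residual sum of squares
sst = 112057.7778  # total sum of squares

# Adjusted R^2 = 1 - [SSE / (n - k - 1)] / [SST / (n - 1)]
adj_r2 = 1 - (sse / (n - k - 1)) / (sst / (n - 1))

# Same value from the formula in terms of R^2.
r2 = 1 - sse / sst
adj_r2_alt = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(f"{adj_r2:.4f}")
```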
i) Show the calculations for the test statistic for each regression coefficient for the heating bill model using α = 0.05 and interpret the results.
Test for the SF coefficient: t = 0.0582/0.0237 = 2.456, p = 0.0277 < 0.05. H0 is rejected; SF is significant.
Test for the Age coefficient: t = 2.3687/2.0065 = 1.180, p = 0.2575 > 0.05. H0 is not rejected; Age is not significant.
Test for the Type coefficient: t = 94.4710/24.8928 = 3.795, p = 0.0020 < 0.05. H0 is rejected; Type is significant.
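Each test statistic is the coefficient divided by its standard error, compared against a two-sided critical value. The sketch below uses the rounded estimates from the output, so the ratios can differ from the reported t values in the third decimal. The critical value 2.1448 is an assumed t-table value for 14 degrees of freedom at α = 0.05, not a figure from the original answer.

```python
t_crit = 2.1448  # assumed t-table value: two-sided alpha = 0.05, df = 14

# (coefficient, standard error) pairs from the regression output.
estimates = {
    "SF": (0.0582, 0.0237),
    "Age": (2.3687, 2.0065),
    "Type": (94.4710, 24.8928),
}

t_stats = {name: b / se for name, (b, se) in estimates.items()}
significant = {name: abs(t) > t_crit for name, t in t_stats.items()}

for name in estimates:
    print(f"{name:5s} t = {t_stats[name]:.3f} significant: {significant[name]}")
```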
j) Show the calculations for the 95% confidence intervals to estimate the population regression coefficients for the heating model and interpret their meaning.
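Each interval is coefficient ± t × std. error, with t based on 14 degrees of freedom.
| variables | coefficients | std. error | 95% lower | 95% upper |
|---|---|---|---|---|
| Intercept | 145.8173 | 64.9862 | 6.4358 | 285.1988 |
| SF | 0.0582 | 0.0237 | 0.0074 | 0.1090 |
| Age | 2.3687 | 2.0065 | -1.9348 | 6.6722 |
| Type | 94.4710 | 24.8928 | 41.0814 | 147.8607 |
The intervals for SF and Type exclude zero, which agrees with the significant t tests in part i). The interval for Age includes zero, so the data do not show that Age has an effect on the bill once SF and Type are in the model.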
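The interval limits can be reproduced from the coefficients and standard errors, as sketched below. The critical value 2.1448 is an assumed t-table value for 14 degrees of freedom (two-sided 95%); because the inputs are rounded, the limits match the table only to about two or three decimals.

```python
t_crit = 2.1448  # assumed t-table value: two-sided 95%, df = 14

# (coefficient, standard error) pairs from the table.
estimates = {
    "Intercept": (145.8173, 64.9862),
    "SF": (0.0582, 0.0237),
    "Age": (2.3687, 2.0065),
    "Type": (94.4710, 24.8928),
}

# 95% interval: coefficient -/+ t_crit * standard error.
intervals = {
    name: (b - t_crit * se, b + t_crit * se)
    for name, (b, se) in estimates.items()
}

for name, (low, high) in intervals.items():
    contains_zero = low <= 0 <= high
    print(f"{name:9s} ({low:.4f}, {high:.4f}) contains 0: {contains_zero}")
```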
Regression Analysis
| Statistic | Value |
|---|---|
| R² | 0.749 |
| Adjusted R² | 0.695 |
| R | 0.865 |
| Std. Error | 44.858 |
| n | 18 |
| k | 3 |
| Dep. Var. | Bill |
ANOVA table
| Source | SS | df | MS | F | p-value |
|---|---|---|---|---|---|
| Regression | 83,886.2096 | 3 | 27,962.0699 | 13.90 | .0002 |
| Residual | 28,171.5681 | 14 | 2,012.2549 | | |
| Total | 112,057.7778 | 17 | | | |
Regression output
| variables | coefficients | std. error | t (df=14) | p-value | 95% lower | 95% upper |
|---|---|---|---|---|---|---|
| Intercept | 145.8173 | 64.9862 | 2.244 | .0415 | 6.4358 | 285.1988 |
| SF | 0.0582 | 0.0237 | 2.456 | .0277 | 0.0074 | 0.1090 |
| Age | 2.3687 | 2.0065 | 1.180 | .2575 | -1.9348 | 6.6722 |
| Type | 94.4710 | 24.8928 | 3.795 | .0020 | 41.0814 | 147.8607 |
Predicted values for: Bill
| SF | Age | Type | Predicted | 95% CI lower | 95% CI upper | 95% PI lower | 95% PI upper |
|---|---|---|---|---|---|---|---|
| 2,700 | 10 | 1 | 421.053 | 379.816 | 462.289 | 316.377 | 525.729 |