Bell Brands USA would like to predict the sales of their Babybel cheese product. They have gathered data on monthly sales of the cheese, along with the average price of all cheeses in the market, their monthly advertising expenditures for the product, and the disposable income per household in the areas surrounding the stores that sell their cheese. Below you will find output from the stepwise regression analysis. The p-value method was used with a cutoff of 0.05.
| Summary measures | |
|---|---|
| Multiple R | 0.9513 |
| R-Square | 0.9049 |
| Adj R-Square | 0.8990 |
| StErr of Estimate | 3924.53 |
| Regression coefficients | Coefficient | Std Err | t-value | p-value |
|---|---|---|---|---|
| Constant | -45233.64 | 8914.72 | -5.0740 | 0.0001 |
| Monthly Adv. Expenditures | 1.972 | 0.160 | 12.3405 | 0.0000 |
(A) Summarize the findings of the stepwise regression method
using this cutoff value.
(B) When the cutoff value was increased to 0.10, the output below
was the result. The table at top left represents the change when
the disposable income variable is added to the model and the table
at top right represents the average price variable being added. The
regression model with both added variables is shown in the bottom
table. Summarize the results for this model.
Disposable income variable being added

| Summary measures | Value | Change |
|---|---|---|
| Multiple R | 0.9608 | 1.0% |
| R-Square | 0.9232 | 2.0% |
| Adj R-Square | 0.9130 | 1.6% |
| StErr of Estimate | 3643.11 | -7.2% |
Average price variable being added

| Summary measures | Value | Change |
|---|---|---|
| Multiple R | 0.9723 | 1.2% |
| R-Square | 0.9454 | 2.4% |
| Adj R-Square | 0.9337 | 2.3% |
| StErr of Estimate | 3179.03 | -12.7% |
| Regression coefficients | Coefficient | Std Err | t-value | p-value |
|---|---|---|---|---|
| Constant | -73971.53 | 23803.23 | -3.1076 | 0.0077 |
| Monthly Adv. Expenditures | 0.952 | 0.375 | 2.5387 | 0.0236 |
| Disposable Income | 2.606 | 0.977 | 2.6659 | 0.0184 |
| Average Price | -2056.27 | 861.342 | -2.3873 | 0.0316 |
(C) Which model would you recommend using? Why?
(A) Model 1.
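The summary measures are all high. R-Square is 0.9049, so monthly advertising expenditures alone explain about 90% of the variation in monthly sales, which indicates a good fit.
In the regression coefficients table, the p-values for both the Constant (0.0001) and Monthly Adv. Expenditures (0.0000) are below the 0.05 cutoff, so both are significant. Advertising is the only explanatory variable that entered the model at this cutoff; Average Price and Disposable Income were not entered.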
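The figures in the first output can be cross-checked against each other. The following is a minimal sketch in Python, using only the values printed in the tables; the helper name `t_ratio` is made up for illustration. It relies on two standard facts: R-Square is the square of Multiple R, and a coefficient's t-value is the coefficient divided by its standard error. Small differences are expected because the printed values are rounded.

```python
# Values copied from the first output tables (cutoff 0.05).
multiple_r = 0.9513
r_square = 0.9049

# (coefficient, standard error, reported t-value) for each term
terms = {
    "Constant": (-45233.64, 8914.72, -5.0740),
    "Monthly Adv. Expenditures": (1.972, 0.160, 12.3405),
}


def t_ratio(coefficient, std_err):
    """Hypothetical helper: t-value = coefficient / standard error."""
    return coefficient / std_err


# R-Square should match Multiple R squared, up to rounding in the output.
print(f"Multiple R squared = {multiple_r ** 2:.4f} (reported R-Square {r_square})")

# Recompute each t-value from the printed coefficient and standard error.
for name, (coef, se, reported_t) in terms.items():
    print(f"{name}: t = {t_ratio(coef, se):.4f} (reported {reported_t})")
```

The recomputed t-value for advertising differs slightly from the reported one because the coefficient and standard error are shown to only three decimals.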
(B) Model 2.
Here the cutoff value is increased to 0.10 and a new variable, "Disposable Income", is added to the model.
The summary measures show that this model fits better than the previous one: "Multiple R", "R-Square" and "Adj R-Square" all increase (by 1.0%, 2.0% and 1.6%), and the standard error of estimate decreases by 7.2%. The rise in adjusted R-square is the more telling sign, because plain R-square never decreases when a variable is added.
So the model improves after the new variable is introduced.
Model 3.
Next, another variable, "Average Price", is added to the model.
The summary measures again show an increase in "Multiple R", "R-Square" and "Adj R-Square" (by 1.2%, 2.4% and 2.3%) and a decrease in the standard error of estimate of 12.7%.
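The percentage changes in the summary tables are relative changes from one model to the next. As a rough check, the sketch below recomputes them from the printed summary measures; the function name `pct_change` and the dictionary layout are assumptions made for this illustration, not part of the original output.

```python
# Summary measures copied from the three output tables.
models = {
    "Adv only": {"Multiple R": 0.9513, "R-Square": 0.9049,
                 "Adj R-Square": 0.8990, "StErr of Estimate": 3924.53},
    "+ Disposable Income": {"Multiple R": 0.9608, "R-Square": 0.9232,
                            "Adj R-Square": 0.9130, "StErr of Estimate": 3643.11},
    "+ Average Price": {"Multiple R": 0.9723, "R-Square": 0.9454,
                        "Adj R-Square": 0.9337, "StErr of Estimate": 3179.03},
}


def pct_change(old, new):
    """Hypothetical helper: relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Compare each model with the one before it, measure by measure.
names = list(models)
for prev, curr in zip(names, names[1:]):
    print(f"{prev} -> {curr}")
    for measure, new_value in models[curr].items():
        change = pct_change(models[prev][measure], new_value)
        print(f"  {measure}: {change:+.1f}%")
```

Each change is measured against the immediately preceding model, so the 12.7% drop in the standard error for the third model is relative to the second model, not the first.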
In the regression coefficients table, the standard errors of the Constant and Average Price are large in absolute terms, but they have to be read relative to the size of the coefficients, which is what the t-values do.
The p-values are 0.0077 for the Constant, 0.0236 for Monthly Adv. Expenditures, 0.0184 for Disposable Income and 0.0316 for Average Price. All of them are below the 0.10 cutoff, and also below 0.05, so each of the three variables makes a significant contribution to the model. The coefficient estimates imply that sales rise with advertising and disposable income and fall as the average price rises.
(C) From the above analysis, the third model, with all three variables, is the one to recommend.
It has the highest adjusted R-square (0.9337) and the lowest standard error of estimate (3179.03), and every coefficient in it is significant.
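As a check on the final coefficient table, the following minimal sketch compares each reported p-value with the two cutoffs used in the question and recomputes each t-value as coefficient divided by standard error. The values are copied from the table; the function name `significant_terms` is hypothetical.

```python
# Coefficient table of the three-variable model (cutoff 0.10).
final_model = {
    "Constant": {"coef": -73971.53, "se": 23803.23, "p": 0.0077},
    "Monthly Adv. Expenditures": {"coef": 0.952, "se": 0.375, "p": 0.0236},
    "Disposable Income": {"coef": 2.606, "se": 0.977, "p": 0.0184},
    "Average Price": {"coef": -2056.27, "se": 861.342, "p": 0.0316},
}


def significant_terms(table, cutoff):
    """Hypothetical helper: names of terms whose p-value is below the cutoff."""
    return [name for name, row in table.items() if row["p"] < cutoff]


# List the terms that pass each cutoff.
for cutoff in (0.05, 0.10):
    print(f"cutoff {cutoff}: {significant_terms(final_model, cutoff)}")

# Recompute t-values from the printed coefficients and standard errors.
for name, row in final_model.items():
    print(f"{name}: t = {row['coef'] / row['se']:.4f}, p = {row['p']}")
```

A p-value below the cutoff means the term is treated as significant at that level, so the lists printed for 0.05 and 0.10 show which terms pass each threshold.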