In: Statistics and Probability
You are an owner of Tesco supermarket. You have made feature advertisings for last 3 years. You want to know the effectiveness of this feature advertising on store traffic(numbers of shoppers) in different week. In data set, you have: average numbers of shoppers, average numbers of feature advertising, and average price each week.
With the table bellow has to be done a REGRESSION model in Excel and interpret the results obtained from the equation based on the questions.
Hi Price | Lo Price | |
Hi Adv | 832 | 1102 |
625 | 888 | |
821 | 1056 | |
605 | 1407 | |
545 | 878 | |
701 | 977 | |
454 | 999 | |
605 | 1212 | |
787 | 905 | |
568 | 655 | |
Low Adv | 353 | 698 |
686 | 758 | |
455 | 987 | |
501 | 754 | |
801 | 625 | |
563 | 741 | |
423 | 532 | |
877 | 976 | |
235 | 668 | |
540 | 802 | |
No Adv | 555 | 689 |
350 | 444 | |
623 | 356 | |
421 | 690 | |
356 | 587 | |
489 | 568 | |
454 | 568 | |
423 | 452 | |
323 | 513 | |
428 |
425 |
Q1. Consider a regression model (Model I) that has feature advertising as a single independent variable with intercept. Estimate your model and interpret your estimation results.
Q2. Update above regression model (Model II) by adding an additional independent variable. average price in order to capture the effect of price promotion activities such as coupon during week. Estimate your model and interpret your estimation results. Do you think which model makes more sense between Model I and Model II? Why?
PS: PLEASE DON'T COMMENT THAT GIVEN DATA IS NOT CLEAR!!!!! GIVEN DATA IS CLEAR AND has to be done a REGRESSION model in Excel WITH THAT DATA and interpret the results obtained from the equation based on the questions. IF YOU DON'T KNOW HOW TO DO ANOVA , PLEASE DON'T ANSWER.
IF YOU ALREADY ANSWERED THIS PROBLEM, PLEASE DON'T GIVE SAME ANSWER AGAIN.
THANK YOU.
Solution :-
Though the question says that the average number of advertisement and price are given, in the dataset, only ordinal data for price and advertisement is given. So, we define dummy binary variables to take care of this situation.
Q1.
Let X = (X1, X2) be a set of binary variable such that when there is a high advertisement, X1=X2=1, when low advertisement, X1=1, X2=0, and for no advertisement X1=X2=0
SUMMARY OUTPUT | ||||||
Regression Statistics | ||||||
Multiple R | 0.599 | |||||
R Square | 0.359 | |||||
Adjusted R Square | 0.336 | |||||
Standard Error | 193.592 | |||||
Observations | 60.000 | |||||
ANOVA | ||||||
df | SS | MS | F | Significance F | ||
Regression | 2 | 1194253.233 | 597126.6167 | 15.93277026 | 3.1881E-06 | |
Residual | 57 | 2136239.75 | 37477.89035 | |||
Total | 59 | 3330492.983 | ||||
Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | |
Intercept | 485.70 | 43.29 | 11.22 | 0.00 | 399.02 | 572.38 |
X1 | 163.05 | 61.22 | 2.66 | 0.01 | 40.46 | 285.64 |
X2 | 182.35 | 61.22 | 2.98 | 0.00 | 59.76 | 304.94 |
So, the regression equation is:
No. of Shoppers (Y) = 485.70 + 163.05 X1 + 182.35 X2
Where, (X1, X2) = (0,0) for no advertisement, = (1,0) for low advetisement, and (1,1) for high advertisement.
Q2.
Let X3 be another binary variable such that X3=1 when the price is high and X3=0 otherwise.
SUMMARY OUTPUT | |||||
Regression Statistics | |||||
Multiple R | 0.756 | ||||
R Square | 0.571 | ||||
Adjusted R Square | 0.548 | ||||
Standard Error | 159.757 | ||||
Observations | 60 | ||||
ANOVA | |||||
df | SS | MS | F | Significance F | |
Regression | 3 | 1901239.383 | 633746.4611 | 24.83100397 | 2.38998E-10 |
Residual | 56 | 1429253.6 | 25522.38571 | ||
Total | 59 | 3330492.983 | |||
Coefficients | Standard Error | t Stat | P-value | Lower 95% | |
Intercept | 594.25 | 41.25 | 14.41 | 0.00 | 511.62 |
X1 | 163.05 | 50.52 | 3.23 | 0.00 | 61.85 |
X2 | 182.35 | 50.52 | 3.61 | 0.00 | 81.15 |
X3 | -217.10 | 41.25 | -5.26 | 0.00 | -299.73 |
No. of Shoppers (Y) = 594.25 + 163.05 X1 + 182.35 X2 - 217.10 X3
The second model is better predicting becasue we find the multiple R-squared values having higher value for the second value.