A store manager believes items with discounts (10% off, 20% off, etc.) sell better than items offered at full price. She also believes items with well-known brand names sell better than items not associated with a well-known brand. Furthermore, she believes that when the prices of brand-name items are discounted, sales are much higher than normal.
You think this sounds like she is describing an interaction between brand name and discount, and you want to use regression to test the effects of discounts, brands, and their interaction.
Assume that the manager provides the following IVs:
Question: What specific evidence would you need to conclude that the interaction does affect sales?
In this problem we use a regression model with an interaction term between two categorical (indicator) variables.
Let Y : Sales of an item (number of units sold).
X1 : Brand of an item represented by an indicator variable.
brand=1 for items associated with a well-known brand and 0 otherwise.
X2 : Discount of an item represented by an indicator variable.
discount = 1 for items with discounted prices and 0 otherwise.
To study the interaction between brand name and discount, a third variable is introduced into the regression model:
Let X3 = Brand name × Discount
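Then the regression model becomes,
Y = a + b1 X1 + b2 X2 + b12 X3
where a is the estimated mean sales when X1 = 0 and X2 = 0 (items without a well-known brand sold at full price),
and b1, b2, b12 are the regression coefficients of X1, X2, and X3.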
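To see what b12 measures, it may help to write out the mean sales that the model Y = a + b1 X1 + b2 X2 + b12 X3 (with X3 = X1 × X2) implies for each of the four item types. This is a worked sketch using only the symbols defined above:

```latex
\begin{aligned}
E[Y \mid X_1 = 0,\ X_2 = 0] &= a \\
E[Y \mid X_1 = 1,\ X_2 = 0] &= a + b_1 \\
E[Y \mid X_1 = 0,\ X_2 = 1] &= a + b_2 \\
E[Y \mid X_1 = 1,\ X_2 = 1] &= a + b_1 + b_2 + b_{12} \\[4pt]
\underbrace{\bigl[(a + b_1 + b_2 + b_{12}) - (a + b_1)\bigr]}_{\text{discount effect, brand items}}
 - \underbrace{\bigl[(a + b_2) - a\bigr]}_{\text{discount effect, non-brand items}} &= b_{12}
\end{aligned}
```

So b12 is the extra change in sales from a discount for brand-name items compared with non-brand items. If b12 = 0, the discount effect is the same for both kinds of item.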
We then fit the multiple regression model, including brand, discount, and their interaction.
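As a minimal sketch of the fitting step, the Python below estimates a, b1, b2 and b12 from made-up data. The sales figures and the function name `fit_interaction_model` are hypothetical. It relies on a shortcut that holds only in this special case: with two 0/1 indicators and their product, the least-squares fit reproduces the four cell means, so the coefficients are differences of those means. A general regression would need a statistics package or the matrix formula instead.

```python
from statistics import mean

# Hypothetical unit sales per item, grouped by (brand, discount).
# These numbers are invented purely for illustration.
sales = {
    (0, 0): [20, 22, 19, 21],   # no well-known brand, full price
    (1, 0): [30, 29, 31, 32],   # well-known brand, full price
    (0, 1): [26, 25, 27, 24],   # no well-known brand, discounted
    (1, 1): [50, 48, 52, 51],   # well-known brand, discounted
}

def fit_interaction_model(groups):
    """Least-squares fit of Y = a + b1*X1 + b2*X2 + b12*X1*X2.

    The model has one parameter per (brand, discount) cell, so the
    fitted values equal the cell means and each coefficient is a
    difference of cell means.
    """
    m = {cell: mean(values) for cell, values in groups.items()}
    a = m[(0, 0)]                                         # baseline mean
    b1 = m[(1, 0)] - m[(0, 0)]                            # brand effect at full price
    b2 = m[(0, 1)] - m[(0, 0)]                            # discount effect, non-brand
    b12 = m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]   # interaction
    return a, b1, b2, b12

a, b1, b2, b12 = fit_interaction_model(sales)
print("a =", a, " b1 =", b1, " b2 =", b2, " b12 =", b12)
```

A positive b12 here would mean the discount raises sales more for brand-name items than for other items. Whether that difference is statistically significant is what the hypothesis test addresses.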
Then we conduct a hypothesis test to determine whether there is a significant linear relationship between each independent variable (X1, X2, X3) and the dependent variable Y. For this question, the test of interest is the one on the interaction term X3.
The test procedure consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.
Step 1) State the hypotheses: H0: b12 = 0 vs H1: b12 ≠ 0
If the null hypothesis is rejected, we conclude that the interaction term X3 has an effect on Y (sales).
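Step 2) Formulate an analysis plan: It consists of 2 parts: (i) choose a significance level (0.05 is a common choice), and (ii) choose the test method, here a t-test on the regression coefficient b12.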
Step 3) Analyze sample data:
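i) Find the standard errors of the regression coefficients a, b1, b2, b12. For a multiple regression, the standard error of the j-th coefficient is
SE(bj) = s × sqrt(cjj)
where s² = Σ(yi − ŷi)² / DF is the residual variance and cjj is the j-th diagonal element of (X'X)^(-1), with X the design matrix whose columns are 1, X1, X2, X3.
The same formula is used for a, b1, b2 and b12.
ii) Find DF: The degrees of freedom equal the number of observations n minus the number of estimated coefficients (a, b1, b2, b12), so
DF = n - 4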
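iii) Test statistic. The test statistic for each coefficient is a t statistic (t), defined as the coefficient divided by its standard error, for example
t1 = b1 / SE(b1)
where b1 is the estimated coefficient of X1 and SE(b1) is its standard error. For the interaction, t3 = b12 / SE(b12).
iv) P-value. Find the P-value of each t statistic from the t distribution with DF degrees of freedom. For the two-sided test of H0: b12 = 0, this is the probability of a t value at least as extreme as t3 in either direction.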
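Step 4) Interpret the results: This involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level. The specific evidence needed to conclude that the interaction affects sales is therefore a P-value for b12 (P3) below the chosen significance level. A positive estimate of b12 would additionally agree with the manager's belief that discounts raise sales more for brand-name items.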
All the above steps can be summarised in the table given below,
Predictors | Coeff | SE coeff | Test statistic | P value |
Constant | a | SE(a) | t0 | P0 |
Brands (X1) | b1 | SE(b1) | t1 | P1 |
Discounts (X2) | b2 | SE(b2) | t2 | P2 |
Interaction (X3) | b12 | SE(b12) | t3 | P3 |
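The table above could be filled in along the following lines. This is a sketch under stated assumptions, not a definitive implementation: the sales data are invented, the helper names `t_two_sided_p` and `coefficient_table` are hypothetical, and the standard errors use a closed form that is valid only because every coefficient in this 0/1 design is a signed sum of independent cell means. The P-value is obtained by numerically integrating the t density, so that only the Python standard library is needed.

```python
import math
from statistics import mean

# Hypothetical unit sales by (brand, discount) cell; invented for illustration.
sales = {
    (0, 0): [20, 22, 19, 21],
    (1, 0): [30, 29, 31, 32],
    (0, 1): [26, 25, 27, 24],
    (1, 1): [50, 48, 52, 51],
}

def t_two_sided_p(t, df, steps=2000):
    """Two-sided P-value of a t statistic via Simpson's rule on the t density."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    upper = abs(t)
    h = upper / steps                       # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3                 # P(0 < T < |t|)
    return max(0.0, 1 - 2 * central)

def coefficient_table(groups):
    """Return ({name: (coeff, SE, t, P)}, DF) for Y = a + b1*X1 + b2*X2 + b12*X1*X2."""
    m = {cell: mean(v) for cell, v in groups.items()}
    n = {cell: len(v) for cell, v in groups.items()}
    df = sum(n.values()) - 4                # n minus four estimated coefficients
    sse = sum((y - m[cell]) ** 2 for cell, v in groups.items() for y in v)
    s2 = sse / df                           # residual variance
    # Each coefficient as a signed sum of cell means (weights +1 / -1).
    contrasts = {
        "Constant (a)":      {(0, 0): 1},
        "Brands (b1)":       {(1, 0): 1, (0, 0): -1},
        "Discounts (b2)":    {(0, 1): 1, (0, 0): -1},
        "Interaction (b12)": {(1, 1): 1, (1, 0): -1, (0, 1): -1, (0, 0): 1},
    }
    rows = {}
    for name, w in contrasts.items():
        est = sum(wt * m[c] for c, wt in w.items())
        se = math.sqrt(s2 * sum(wt ** 2 / n[c] for c, wt in w.items()))
        t = est / se
        rows[name] = (est, se, t, t_two_sided_p(t, df))
    return rows, df

rows, df = coefficient_table(sales)
print(f"DF = {df}")
for name, (est, se, t, p) in rows.items():
    print(f"{name:<18}{est:>9.3f}{se:>9.3f}{t:>9.3f}{p:>11.2g}")

alpha = 0.05                                # assumed significance level
p3 = rows["Interaction (b12)"][3]
reject_h0 = p3 < alpha                      # True means: interaction affects sales
print("Reject H0: b12 = 0 ?", reject_h0)
```

With real data, a statistics package would normally produce the same table directly. The point of the sketch is only to show where each column comes from.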