In: Statistics and Probability
In its continuing study of the 3-For-All subscription solicitation process, a marketing department team wants to test the effects of two types of structured sales presentations (personal formal and personal informal) and the number of hours spent on telemarketing on the number of new subscriptions. The staff has recorded these data for the past 24 weeks in AMS14. Analyze these data and develop a multiple regression model to predict the number of new subscriptions for a week, based on the number of hours spent on telemarketing and the sales presentation type. Write a report, giving detailed findings concerning the regression model used. (Using Excel PHSTAT) - I am stuck, unable to figure out how to imput the columns into Excel using PHStat
Week | New Subscription | Hours | Presentation |
1 |
1256 | 282 | Formal |
2 | 1405 | 336 | Formal |
3 | 1104 | 232 | Formal |
4 | 1333 | 321 | Informal |
5 | 975 | 261 | Informal |
6 | 769 | 212 | Formal |
7 | 585 | 162 | Formal |
8 | 923 | 266 | Informal |
9 | 1118 | 269 | Formal |
10 | 567 | 185 | Informal |
11 | 808 | 191 | Formal |
12 | 1005 | 226 | Formal |
13 | 1366 | 331 | Informal |
14 | 1180 | 311 | Informal |
15 | 851 | 222 | Formal |
16 | 848 | 227 | Informal |
17 | 1033 | 257 | Formal |
18 | 666 | 193 | Formal |
19 | 1289 | 325 | Formal |
20 | 840 | 216 | Informal |
21 | 995 | 251 | Formal |
22 | 1437 | 345 | Informal |
23 | 1133 | 286 | Informal |
24 | 1228 | 263 | Formal |
Solution :
Case 1 - Done
Running regression excel.
Step1 : Put the data in excel and create a dummy variable for
presentation.
Formal = 1 and informal = 0
Put the data in the format shown.
Step 2 : Go to data -> Data Analysis -> Regression
Step 3 : Input the values as shown
Step 4 : Output will be generated as given below.
From the coefficient of the regression out we get the regression equation (highlighted in yellow)
y = -277.08 + 4.86(Hours)+96.31(Presentation)
Interpretation of the coefficient.
1 unit increase in the hours increases the New Subscriptions will increase by 4.86.
If the presentation is formal than the New subscriptions increased by 96.31 compared to an informal presentation.
Significance of the variables.
For each beta coefficient, we test the following hypothesis.
Next we check the pvalue for the variable in the regression output and check if the pvalue is less than 0.05, if it is less than 0.05, then we reject the null hypothesis and conclude that the variable is significant
We see that pvalue for both the variables, Hours and Presentation,
is less than 0.05, hence we conclude that both the variables are
significant variables of y.
Checking the validity of the overall model
To check if the model is significant we test the following
hypothesis.
Ho : All the beta coefficient are equal to zero.
H1 : At least one of the beta coefficient is not equal to zero.
Now, we check the pvalue of ANOVA table.
We find that the pvalue = 0, which is less than 0.05, hence we reject the null hypothesis and conclude that the regression equation is significant.
Coefficient of determination(rsqaure) = 0.9214
It is the measure of the amount of variability in y explained by x.
Its value lies between 0 and 1. Greater the value, better is the
model. In this case, it 92.1%, hence the model is good
Let me know in the comment section if anything is not
clear. I will reply ASAP!
If you liked the answer, please give an upvote. This will be quite
encouraging for me, thank-you!