A supermarket chain analyzed data on sales of a particular brand of snack cracker at 104 stores for a certain one-week period. The analyst decided to build a regression model to predict the unit sales of the snack cracker based on the total unit sales of all brands in the snack cracker category (excluding the cracker itself).
a. Develop a linear regression model that helps predict the cracker sales. Show the prediction equation.
CategorySales | Sales |
1033 | 336 |
1043 | 290 |
1044 | 336 |
1053 | 295 |
1054 | 296 |
1055 | 354 |
1063 | 346 |
1067 | 328 |
1071 | 346 |
1485 | 381 |
1091 | 345 |
1095 | 338 |
1096 | 357 |
1096 | 321 |
1097 | 326 |
1099 | 340 |
1107 | 318 |
1108 | 370 |
1109 | 338 |
1110 | 388 |
1116 | 315 |
1118 | 341 |
1124 | 312 |
1124 | 355 |
1127 | 362 |
1127 | 328 |
1127 | 350 |
1130 | 346 |
1132 | 341 |
1134 | 351 |
1141 | 327 |
1143 | 371 |
1147 | 361 |
1149 | 320 |
1150 | 378 |
1150 | 352 |
1151 | 340 |
1157 | 346 |
1158 | 391 |
1164 | 364 |
1173 | 353 |
1173 | 347 |
1178 | 371 |
1184 | 390 |
1187 | 365 |
1306 | 395 |
1190 | 313 |
1193 | 364 |
1196 | 349 |
1197 | 330 |
1198 | 343 |
1204 | 367 |
1206 | 350 |
1208 | 385 |
1208 | 364 |
1211 | 329 |
1213 | 381 |
1213 | 380 |
1214 | 391 |
1214 | 367 |
1215 | 357 |
1218 | 386 |
1226 | 365 |
1228 | 361 |
1230 | 335 |
1230 | 341 |
1238 | 378 |
1238 | 376 |
1238 | 375 |
1241 | 372 |
1241 | 386 |
1248 | 353 |
1251 | 344 |
1253 | 375 |
1261 | 352 |
1263 | 368 |
1264 | 377 |
1275 | 359 |
1277 | 358 |
1278 | 368 |
1280 | 371 |
1281 | 374 |
1282 | 361 |
1285 | 402 |
1286 | 370 |
1291 | 371 |
1294 | 356 |
1297 | 375 |
1301 | 411 |
1305 | 370 |
1317 | 365 |
1320 | 375 |
1328 | 360 |
1332 | 359 |
1339 | 406 |
1348 | 394 |
1353 | 369 |
1357 | 371 |
1381 | 408 |
1401 | 372 |
1409 | 370 |
1436 | 358 |
1500 | 352 |
1459 | 396 |
b. Is there sufficient evidence at the 2.5% significance level to claim that a linear relationship exists between category sales and cracker sales? Show the hypotheses, the test, and make the conclusion.
c. The coefficient of determination of the linear regression model is = ?? What does this mean?
d. The difference in category sales between store A and store B is 150. What is the predicted difference in cracker sales between these two stores?
e. Make a prediction for sales in a week where sales in the entire snack cracker category are 1005.
f. Produce a 90% prediction interval for the cracker sales in a store where the category sales are 1005. Also produce a 90% prediction interval for sales in a store where category sales are only 900. Write down the two intervals obtained. Now answer: Can you determine with 90% confidence which store has higher cracker sales?
g. This is not a statistical question but an economic one: Management is considering giving a discount of 5% on the selling prices of all the 'Category' items. It is assumed that unit sales of all the category items will then increase by 15%. Show that revenue will increase if the discount is applied.
Note: Only four subparts of a question can be answered in one post, so parts a to d are answered here.
a. Develop a linear regression model that helps predict cracker sales. Show the prediction equation.
Steps to run the regression in Excel:
Step 1: Enter the data in Excel, with CategorySales in one column and Sales in the next.
Step 2: Go to Data -> Data Analysis -> Regression.
Step 3: Set the Sales column as the Input Y Range and the CategorySales column as the Input X Range.
Step 4: Excel generates the regression output.
From the coefficients in the regression output we get the regression equation given below.
y = 189.06 + 0.1396 Category Sales
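The same fit can be reproduced outside Excel. The following is a minimal sketch of ordinary least squares in plain Python; the function name `fit_line` is an illustrative choice, and only the first five rows of the table are used here to keep the example short. With all 104 rows it should give the same coefficients as the Excel output, provided the data match what was entered there.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1

# Illustration only: the first five rows of the table, not the full data set.
category_sales = [1033, 1043, 1044, 1053, 1054]
sales = [336, 290, 336, 295, 296]

b0, b1 = fit_line(category_sales, sales)
print(f"Sales = {b0:.2f} + {b1:.4f} * CategorySales")
```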
b. Is there sufficient evidence at the 2.5% significance level to claim that a linear relationship exists between category sales and cracker sales? Show the hypotheses, the test, and make the conclusion.
For the beta coefficient (the slope on Category Sales), we test the following hypotheses.
H0: beta1 = 0 (no linear relationship between category sales and cracker sales)
H1: beta1 != 0 (a linear relationship exists)
Next we check the p-value for the variable in the regression output. If the p-value is less than 0.025, we reject the null hypothesis and conclude that the variable is significant.
In this case the p-value is less than 0.025, so we reject the null hypothesis and conclude that there is a significant linear relationship between category sales and cracker sales.
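The test can be cross-checked without the Excel output. The sketch below relies on the identity t^2 = R^2 (n - 2) / (1 - R^2), which holds in simple linear regression, together with the R-squared of 0.3418 reported in part c and n = 104 stores. It assumes a normal approximation to the t distribution (reasonable with 102 degrees of freedom), so the p-value is approximate and not the exact figure Excel reports.

```python
import math
from statistics import NormalDist

n = 104             # number of stores, from the problem statement
r_squared = 0.3418  # coefficient of determination reported in part c
alpha = 0.025       # significance level

# In simple linear regression, t^2 = R^2 * (n - 2) / (1 - R^2).
# The fitted slope is positive, so the positive root is taken.
t_stat = math.sqrt(r_squared * (n - 2) / (1 - r_squared))

# Assumption: with 102 degrees of freedom the t distribution is close to
# the standard normal, so a normal approximation gives the two-sided p-value.
p_value = 2 * (1 - NormalDist().cdf(t_stat))

reject_null = p_value < alpha
print(f"t = {t_stat:.2f}, reject H0: {reject_null}")
```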
c. The coefficient of determination of the linear regression model is = ?? What does this mean?
Coefficient of determination (R-squared) = 0.3418
It measures the proportion of the variability in y (cracker sales) that is explained by x (category sales). Its value lies between 0 and 1; the greater the value, the better the fit. In this case it is 34.18%, so category sales explain only about a third of the variation in cracker sales and the model is not very good.
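To make the definition concrete, here is a minimal sketch that computes R-squared for a straight-line fit as the squared correlation between x and y. The function name `r_squared` is an illustrative choice, and only the first five rows of the table are used, so the printed value is not expected to match the 0.3418 obtained from all 104 stores.

```python
def r_squared(x, y):
    """Share of the variation in y explained by a straight-line fit on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    # For one predictor, R-squared equals the squared correlation coefficient.
    return sxy ** 2 / (sxx * syy)

# Illustration only: the first five rows of the table, not the full data set.
category_sales = [1033, 1043, 1044, 1053, 1054]
sales = [336, 290, 336, 295, 296]

print(f"R-squared on the sample rows: {r_squared(category_sales, sales):.4f}")
```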
d. The difference in category sales between store A and store B is 150. What is the predicted difference in cracker sales between these two stores?
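y = 189.06 + 0.1396 Category Sales
The intercept is the same for both stores, so it cancels when the two predictions are subtracted. Only the slope matters:
Predicted difference = 0.1396*(150) = 20.94
The store with the higher category sales is predicted to sell about 21 more units of the cracker.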
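As a check on part d, the sketch below evaluates the fitted equation at two stores whose category sales differ by 150 and subtracts the predictions. The intercept is common to both predictions and cancels, so the difference equals the slope times 150 whatever the starting level. The baseline of 1100 is a hypothetical value chosen only for illustration.

```python
INTERCEPT = 189.06  # from the fitted regression equation
SLOPE = 0.1396      # from the fitted regression equation

def predict_sales(category_sales):
    """Predicted cracker sales for a given level of category sales."""
    return INTERCEPT + SLOPE * category_sales

store_b = 1100           # hypothetical baseline, for illustration only
store_a = store_b + 150  # store A has 150 more units of category sales

# The intercept appears in both predictions and drops out of the difference.
difference = predict_sales(store_a) - predict_sales(store_b)
print(round(difference, 2))
```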