Eli Orchid has designed a new pharmaceutical product, Orchid Relief, which improves nighttime sleep. Before initiating mass production of the product, Eli Orchid has been market-testing Orchid Relief in Orange County over the past 8 weeks. The daily demand values are recorded in the Excel file provided. Eli Orchid plans to use the sales data to predict sales for the upcoming week. An accurate forecast would help in making arrangements for the company’s production processes and in designing promotions.
Before a forecasting model is built and a forecast for the next week is generated, the COO of the company has asked the data analyst for an exploratory analysis of the demand.
Specifically, the COO has asked the analyst[1]:
a. To provide a bar chart (with data labels rounded to two decimal places) showing the average demand for each weekday (Sun., Mon., etc.)
[add chart here]
b. To fit a simple linear regression model to the data and to provide its equation (d = a*t + b), along with R²
d =    R² =
c. To fit a multiple regression model with a dummy variable representing the weekend, and to provide the regression equation (d = a*t + b*w + c), along with adjusted R²
d =    Adjusted R² =
d. To provide a run-series plot of the actual demand with the simple regression and multiple regression overlaid
[add chart here]
e. To write a short paragraph explaining the observations and providing general recommendations for the next seven days' demand forecast. Note: this paragraph can be on page 2. The answers to the previous questions must all fit on the first page.
[write your paragraph here]
[1] Round numbers to four decimal places (e.g. 0.1234), unless explicitly requested otherwise.
Day | Date | Weekday | Daily Demand | Weekend |
1 | 4/25/2016 | Mon | 297 | 0 |
2 | 4/26/2016 | Tue | 293 | 0 |
3 | 4/27/2016 | Wed | 327 | 0 |
4 | 4/28/2016 | Thu | 315 | 0 |
5 | 4/29/2016 | Fri | 348 | 0 |
6 | 4/30/2016 | Sat | 447 | 1 |
7 | 5/1/2016 | Sun | 431 | 1 |
8 | 5/2/2016 | Mon | 283 | 0 |
9 | 5/3/2016 | Tue | 326 | 0 |
10 | 5/4/2016 | Wed | 317 | 0 |
11 | 5/5/2016 | Thu | 345 | 0 |
12 | 5/6/2016 | Fri | 355 | 0 |
13 | 5/7/2016 | Sat | 428 | 1 |
14 | 5/8/2016 | Sun | 454 | 1 |
15 | 5/9/2016 | Mon | 305 | 0 |
16 | 5/10/2016 | Tue | 310 | 0 |
17 | 5/11/2016 | Wed | 350 | 0 |
18 | 5/12/2016 | Thu | 308 | 0 |
19 | 5/13/2016 | Fri | 366 | 0 |
20 | 5/14/2016 | Sat | 460 | 1 |
21 | 5/15/2016 | Sun | 427 | 1 |
22 | 5/16/2016 | Mon | 291 | 0 |
23 | 5/17/2016 | Tue | 325 | 0 |
24 | 5/18/2016 | Wed | 354 | 0 |
25 | 5/19/2016 | Thu | 322 | 0 |
26 | 5/20/2016 | Fri | 405 | 0 |
27 | 5/21/2016 | Sat | 442 | 1 |
28 | 5/22/2016 | Sun | 454 | 1 |
29 | 5/23/2016 | Mon | 318 | 0 |
30 | 5/24/2016 | Tue | 298 | 0 |
31 | 5/25/2016 | Wed | 355 | 0 |
32 | 5/26/2016 | Thu | 355 | 0 |
33 | 5/27/2016 | Fri | 374 | 0 |
34 | 5/28/2016 | Sat | 447 | 1 |
35 | 5/29/2016 | Sun | 463 | 1 |
36 | 5/30/2016 | Mon | 291 | 0 |
37 | 5/31/2016 | Tue | 319 | 0 |
38 | 6/1/2016 | Wed | 333 | 0 |
39 | 6/2/2016 | Thu | 339 | 0 |
40 | 6/3/2016 | Fri | 416 | 0 |
41 | 6/4/2016 | Sat | 475 | 1 |
42 | 6/5/2016 | Sun | 459 | 1 |
43 | 6/6/2016 | Mon | 319 | 0 |
44 | 6/7/2016 | Tue | 326 | 0 |
45 | 6/8/2016 | Wed | 356 | 0 |
46 | 6/9/2016 | Thu | 340 | 0 |
47 | 6/10/2016 | Fri | 395 | 0 |
48 | 6/11/2016 | Sat | 465 | 1 |
49 | 6/12/2016 | Sun | 453 | 1 |
50 | 6/13/2016 | Mon | 307 | 0 |
51 | 6/14/2016 | Tue | 324 | 0 |
52 | 6/15/2016 | Wed | 350 | 0 |
53 | 6/16/2016 | Thu | 348 | 0 |
54 | 6/17/2016 | Fri | 384 | 0 |
55 | 6/18/2016 | Sat | 474 | 1 |
56 | 6/19/2016 | Sun | 485 | 1 |
a. The bar chart can be plotted by grouping the daily demand values by weekday and then averaging each group. The plot is shown below:
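The averages behind the bar chart can be cross-checked outside Excel. The following is a minimal sketch using only the Python standard library; the names `DEMAND`, `WEEKDAYS` and `weekday_averages` are illustrative and not part of the original workbook.

```python
from statistics import mean

# Daily demand from the table, one row per week, ordered Mon..Sun
# (day 1, 4/25/2016, is a Monday).
DEMAND = [
    [297, 293, 327, 315, 348, 447, 431],
    [283, 326, 317, 345, 355, 428, 454],
    [305, 310, 350, 308, 366, 460, 427],
    [291, 325, 354, 322, 405, 442, 454],
    [318, 298, 355, 355, 374, 447, 463],
    [291, 319, 333, 339, 416, 475, 459],
    [319, 326, 356, 340, 395, 465, 453],
    [307, 324, 350, 348, 384, 474, 485],
]
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_averages(weeks):
    """Average each column (one weekday) across all weeks."""
    return [mean(column) for column in zip(*weeks)]


averages = dict(zip(WEEKDAYS, weekday_averages(DEMAND)))
for day, avg in averages.items():
    # Two decimals, matching the data-label format requested for the chart.
    print(f"{day}: {avg:.2f}")
```

The printed values are the heights of the seven bars; reordering `WEEKDAYS` output to start on Sunday only changes the display order.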
b. The regression equation can be determined by assigning a number to each weekday (1 for Monday, 2 for Tuesday, and so on up to 7 for Sunday), so that t is the weekday number, and fitting a line to the seven weekday averages.
The scatter plot is shown below. Adding a trendline in Excel and choosing to display its equation on the chart gives the regression equation: d = 27.589 t + 258.45
R² = 0.8891
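The trendline can also be reproduced by hand with the usual least-squares formulas. This is a sketch under the assumption that the fit uses the seven weekday averages against the weekday number 1 to 7 (which is what matches the quoted coefficients); `fit_line` is a hypothetical helper, not an Excel function.

```python
from statistics import mean

# Weekday averages (Mon..Sun) computed from the demand table.
AVERAGES = [301.375, 315.125, 342.75, 334.0, 380.375, 454.75, 453.25]


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b, r_squared)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    syy = sum((y - y_bar) ** 2 for y in ys)
    a = sxy / sxx                  # slope
    b = y_bar - a * x_bar          # intercept
    return a, b, sxy ** 2 / (sxx * syy)


t = list(range(1, 8))  # 1 = Monday ... 7 = Sunday
a, b, r2 = fit_line(t, AVERAGES)
print(f"d = {a:.4f} t + {b:.4f}, R2 = {r2:.4f}")
```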
c. The multiple regression equation can be found by running Regression in MS Excel. The summary output is shown below. The regression equation can be written as d = 16.7738 t + 60.5667 w + 284.4036, where w = 1 on weekend days (Saturday and Sunday) and w = 0 otherwise.
Adjusted R² = 0.9567
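For readers without Excel's Regression tool, the same fit can be sketched by solving the normal equations directly. The code below assumes, as in part b, that the model is fitted to the seven weekday averages; `solve` and `fit_ols` are illustrative helpers written for this example, and a numerical library would normally be used instead.

```python
# Weekday averages (Mon..Sun) computed from the demand table.
AVERAGES = [301.375, 315.125, 342.75, 334.0, 380.375, 454.75, 453.25]
T = [1, 2, 3, 4, 5, 6, 7]   # weekday number, 1 = Monday
W = [0, 0, 0, 0, 0, 1, 1]   # weekend dummy


def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_ols(columns, y):
    """Least squares via (X'X) beta = X'y; the intercept is the last coefficient."""
    X = [list(row) + [1.0] for row in zip(*columns)]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, r)) for r in X]
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - sse / sst
    n, p = len(y), k - 1          # observations, predictors (excluding intercept)
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return beta, r2, adj_r2


(a, b, c), r2, adj_r2 = fit_ols([T, W], AVERAGES)
print(f"d = {a:.4f} t + {b:.4f} w + {c:.4f}, adjusted R2 = {adj_r2:.4f}")
```

Adjusted R² penalizes the extra predictor through the factor (n - 1)/(n - p - 1), which is why it is the figure requested for the two-variable model.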
d. The overlaid plot is shown below, along with the data table:
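The series behind such an overlay can be generated from the two equations quoted in parts b and c. The sketch below builds one row per day (actual demand, simple fit, multiple fit) and compares the sums of squared errors; it assumes t is the weekday number and w the weekend flag, and the names `rows`, `sse_simple` and `sse_multiple` are illustrative.

```python
# Daily demand from the table, one row per week, ordered Mon..Sun.
DEMAND = [
    [297, 293, 327, 315, 348, 447, 431],
    [283, 326, 317, 345, 355, 428, 454],
    [305, 310, 350, 308, 366, 460, 427],
    [291, 325, 354, 322, 405, 442, 454],
    [318, 298, 355, 355, 374, 447, 463],
    [291, 319, 333, 339, 416, 475, 459],
    [319, 326, 356, 340, 395, 465, 453],
    [307, 324, 350, 348, 384, 474, 485],
]


def simple_fit(t):
    """Simple regression from part b."""
    return 27.589 * t + 258.45


def multiple_fit(t, w):
    """Multiple regression with weekend dummy from part c."""
    return 16.7738 * t + 60.5667 * w + 284.4036


flat = [d for week in DEMAND for d in week]
rows = []
for day, actual in enumerate(flat, start=1):
    t = (day - 1) % 7 + 1          # day 1 is a Monday, so t cycles 1..7
    w = 1 if t >= 6 else 0         # Saturday and Sunday
    rows.append((day, actual, simple_fit(t), multiple_fit(t, w)))

sse_simple = sum((actual - s) ** 2 for _, actual, s, _ in rows)
sse_multiple = sum((actual - m) ** 2 for _, actual, _, m in rows)
print(f"SSE simple: {sse_simple:.1f}, SSE multiple: {sse_multiple:.1f}")
```

Each tuple in `rows` can be pasted into a spreadsheet or passed to a plotting library to draw the three lines on one chart.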
e. We can see that the multiple regression model tracks the actual demand closely, including the jump in demand on weekends, so it is appropriate to use the multiple regression equation to forecast daily demand for the next seven days. The high adjusted R² (0.9567) also indicates close agreement between the fitted and actual demand.
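As an illustration of how the recommended model would be applied, the sketch below evaluates the multiple regression equation from part c for the seven days after the last observation (Sunday 6/19/2016). It assumes the same weekday numbering (1 = Monday) and weekend dummy; `forecast` and `forecasts` are hypothetical names.

```python
from datetime import date, timedelta


def forecast(t, w):
    """Multiple regression equation from part c."""
    return 16.7738 * t + 60.5667 * w + 284.4036


start = date(2016, 6, 20)  # the day after the last recorded observation
forecasts = []
for offset in range(7):
    day = start + timedelta(days=offset)
    t = day.weekday() + 1          # date.weekday(): Monday = 0, so shift to 1..7
    w = 1 if t >= 6 else 0         # Saturday and Sunday
    forecasts.append((day.isoformat(), t, round(forecast(t, w), 4)))

for iso_day, t, value in forecasts:
    print(f"{iso_day} (t={t}): {value:.4f}")
```

Because the model depends only on the weekday number and the weekend flag, it produces the same seven values for any future week; it does not capture any week-over-week growth in demand.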