WalMart’s fiscal year starts in the first week of February, so the week numbers in the data are offset from calendar weeks by the four weeks of January 2002. Week 26 in the data is therefore calendar week 30 of 2002 (26 + 4), or the end of July 2002. Likewise, week 52 is calendar week 4 of 2003 (52 + 4 - 52), or the end of January 2003. As an example, the spike in sales (revenue) at week 75 falls in calendar week 27 of 2003 (75 + 4 - 52), or the first week of July 2003. This corresponds to sales for the July 4th holiday, when people are buying barbecue-related items.
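The week arithmetic above can be written as a small helper. This is only a minimal sketch: it assumes, as the text does, a fixed 4-week offset for January 2002 and exactly 52 weeks per year. The function name `fiscal_to_calendar` and its parameters are illustrative and not part of the assignment.

```python
def fiscal_to_calendar(fiscal_week, base_year=2002, offset_weeks=4, weeks_per_year=52):
    """Map a fiscal week number from the data to (calendar_year, calendar_week).

    Assumption: the fiscal year starts 4 weeks into base_year and every
    calendar year has exactly 52 weeks (the simplification used in the text).
    """
    shifted = fiscal_week + offset_weeks  # weeks counted from the start of base_year
    year = base_year + (shifted - 1) // weeks_per_year
    week = (shifted - 1) % weeks_per_year + 1
    return year, week


# The three examples worked out in the text
for w in (26, 52, 75):
    year, week = fiscal_to_calendar(w)
    print(f"data week {w} -> calendar week {week} of {year}")
```

The same helper can be applied to any other week in the table, for example when matching a spike to a holiday.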
Fiscal week | Sales ($) |
26 | 15200 |
27 | 15600 |
28 | 16400 |
29 | 15600 |
30 | 14200 |
31 | 14400 |
32 | 16400 |
33 | 15200 |
34 | 14400 |
35 | 13800 |
36 | 15000 |
37 | 14100 |
38 | 14400 |
39 | 14000 |
40 | 15600 |
41 | 15000 |
42 | 14400 |
43 | 17800 |
44 | 15000 |
45 | 15200 |
46 | 15800 |
47 | 18600 |
48 | 15400 |
49 | 15500 |
50 | 16800 |
51 | 18700 |
52 | 21400 |
53 | 20900 |
54 | 18800 |
55 | 22400 |
56 | 19400 |
57 | 20000 |
58 | 18100 |
59 | 18000 |
60 | 19600 |
61 | 19000 |
62 | 19200 |
63 | 18000 |
64 | 17600 |
65 | 17200 |
66 | 19800 |
67 | 19600 |
68 | 19600 |
69 | 20000 |
70 | 20800 |
71 | 22800 |
72 | 23000 |
73 | 20800 |
74 | 25000 |
75 | 30600 |
76 | 24000 |
77 | 21200 |
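Spikes such as the one at week 75 can also be flagged programmatically. The sketch below is one possible approach, not a prescribed method: it fits a straight-line trend to the table and marks weeks whose residual exceeds two standard deviations. The 2-sigma threshold and the variable names are assumptions chosen for illustration; a different threshold will flag a different set of weeks.

```python
import numpy as np

# Weekly sales from the table above (fiscal weeks 26 through 77)
weeks = np.arange(26, 78)
sales = np.array([
    15200, 15600, 16400, 15600, 14200, 14400, 16400, 15200, 14400, 13800,  # 26-35
    15000, 14100, 14400, 14000, 15600, 15000, 14400, 17800, 15000, 15200,  # 36-45
    15800, 18600, 15400, 15500, 16800, 18700, 21400, 20900, 18800, 22400,  # 46-55
    19400, 20000, 18100, 18000, 19600, 19000, 19200, 18000, 17600, 17200,  # 56-65
    19800, 19600, 19600, 20000, 20800, 22800, 23000, 20800, 25000, 30600,  # 66-75
    24000, 21200,                                                          # 76-77
], dtype=float)

# Remove the overall upward trend so that late weeks are not flagged
# simply because sales grow over time.
slope, intercept = np.polyfit(weeks, sales, 1)
residuals = sales - (slope * weeks + intercept)

# Assumed rule of thumb: a spike is a residual beyond 2 standard deviations.
threshold = 2.0 * residuals.std()
spike_mask = np.abs(residuals) > threshold
flagged_weeks = weeks[spike_mask].tolist()

for w, s, r in zip(weeks[spike_mask], sales[spike_mask], residuals[spike_mask]):
    print(f"week {w}: sales {s:.0f}, residual {r:+.0f}")
```

Each flagged week still has to be converted to a calendar date and checked against holidays by hand.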
1. Identify spikes (outliers) in the data where extreme sales values occur and correlate these spikes with actual calendar dates in 2002 or 2003 and with holidays or special events that may occur during these periods.
2. Modeling the data linearly
a. Generate a linear model for this data by choosing two points.
b. Generate a least squares linear regression model for this data.
c. How good is this regression model? Output and discuss the R² value.
d. What are the marginal sales (the derivative, i.e. the rate of change) for this department under the two-point linear model and under the regression model?
e. Compare the two models. Which do you feel is better?
f. Remove outliers as you deem appropriate and rerun the linear regression model. What are the marginal sales now? Discuss any improvement.
3. Modeling the data quadratically
a. Generate a quadratic model for this data. Also output and discuss the R² value.
b. What are the marginal sales for this department using this model?
c. Calculate the relative max/min value generated by the model. Show the supporting analytical work.
d. Compare the actual and model-generated relative max/min values.
e. Remove outliers and rerun the quadratic least squares model. What are the marginal sales now? Discuss any improvement.
4. Comparing models
a. Based on all the models run, which model do you feel best predicts future trends? Explain your rationale.
b. Based on the model selected, what type of seasonal adjustments, if any, would be required to meet customer needs?
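The modeling steps in the questions above can be sketched with NumPy. This is a hedged illustration rather than a model answer. Three choices here are assumptions: the two-point model uses the first and last weeks, the outlier refit drops only week 75, and the helper `r_squared` is defined locally. Other choices are equally defensible and will change the numbers.

```python
import numpy as np

weeks = np.arange(26, 78)
sales = np.array([
    15200, 15600, 16400, 15600, 14200, 14400, 16400, 15200, 14400, 13800,  # 26-35
    15000, 14100, 14400, 14000, 15600, 15000, 14400, 17800, 15000, 15200,  # 36-45
    15800, 18600, 15400, 15500, 16800, 18700, 21400, 20900, 18800, 22400,  # 46-55
    19400, 20000, 18100, 18000, 19600, 19000, 19200, 18000, 17600, 17200,  # 56-65
    19800, 19600, 19600, 20000, 20800, 22800, 23000, 20800, 25000, 30600,  # 66-75
    24000, 21200,                                                          # 76-77
], dtype=float)


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Two-point linear model (assumed choice: first and last weeks)
x1, y1, x2, y2 = weeks[0], sales[0], weeks[-1], sales[-1]
slope_2pt = (y2 - y1) / (x2 - x1)        # marginal sales, $ per week
intercept_2pt = y1 - slope_2pt * x1

# Least squares line; np.polyfit returns the highest-degree coefficient first
slope_ls, intercept_ls = np.polyfit(weeks, sales, 1)
r2_lin = r_squared(sales, slope_ls * weeks + intercept_ls)

# Refit without week 75 (assumed outlier, the July 4th spike)
keep = weeks != 75
slope_no75, intercept_no75 = np.polyfit(weeks[keep], sales[keep], 1)

# Least squares quadratic: sales ~ a*week**2 + b*week + c
a, b, c = np.polyfit(weeks, sales, 2)
r2_quad = r_squared(sales, np.polyval([a, b, c], weeks))


def marginal_quadratic(week):
    """Derivative of the quadratic model: d(sales)/d(week) = 2*a*week + b."""
    return 2.0 * a * week + b


# Setting the derivative to zero gives the relative extremum of the model;
# it is a minimum if a > 0 and a maximum if a < 0.
vertex_week = -b / (2.0 * a)
vertex_kind = "minimum" if a > 0 else "maximum"

print(f"two-point slope:      {slope_2pt:.1f} $/week")
print(f"least squares slope:  {slope_ls:.1f} $/week, R^2 = {r2_lin:.3f}")
print(f"slope without wk 75:  {slope_no75:.1f} $/week")
print(f"quadratic R^2 = {r2_quad:.3f}, relative {vertex_kind} at week {vertex_week:.1f}")
```

For a linear model the marginal sales are the slope itself, so the two-point and regression answers can be compared directly. The quadratic R² can never be lower than the linear R² on the same data, since the line is a special case of the quadratic, so a higher value alone does not show that the quadratic predicts future weeks better.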