In: Math
Problem Details:
A major beverage company needs forecasts of sales for next year. Quarterly sales data for the previous 13 years can be found in the “Beverage_Sales” worksheet of the Excel file titled Group Case #2 Data.
Requirements (You will NOT follow the PLAN-DO-REPORT algorithm. Address/answer all requirements in the order given.):
1. (2 points) Create a Time Series plot of the sales data. Briefly describe what you see in the plot.
2. (5 points) Develop a model to forecast sales for the four quarters of year 14.
3. (2 points) Forecast sales for the four quarters of year 14.
4. (4 points) Determine the MSE, MAD, MAPE for the model you developed in requirement #2. Using this information and any relevant information from the results of requirement #2, briefly discuss how accurate you think the forecasts for year 14 are.
Data:
Year | Quarter | Sales |
1 | Q1 | 1807.37 |
1 | Q2 | 2355.32 |
1 | Q3 | 2591.83 |
1 | Q4 | 2236.39 |
2 | Q1 | 1549.14 |
2 | Q2 | 2105.79 |
2 | Q3 | 2041.32 |
2 | Q4 | 2021.01 |
3 | Q1 | 1870.46 |
3 | Q2 | 2390.56 |
3 | Q3 | 2198.03 |
3 | Q4 | 2046.83 |
4 | Q1 | 1934.19 |
4 | Q2 | 2406.41 |
4 | Q3 | 2249.06 |
4 | Q4 | 2211.56 |
5 | Q1 | 2237.05 |
5 | Q2 | 2856.43 |
5 | Q3 | 2799.57 |
5 | Q4 | 2645.33 |
6 | Q1 | 2563.59 |
6 | Q2 | 3146.52 |
6 | Q3 | 3196.68 |
6 | Q4 | 2930.48 |
7 | Q1 | 2878.96 |
7 | Q2 | 3687.85 |
7 | Q3 | 3608.33 |
7 | Q4 | 3288.26 |
8 | Q1 | 3178.23 |
8 | Q2 | 3939.69 |
8 | Q3 | 3680.11 |
8 | Q4 | 3516.65 |
9 | Q1 | 3354.76 |
9 | Q2 | 4490.02 |
9 | Q3 | 4678.97 |
9 | Q4 | 4148.56 |
10 | Q1 | 3995.07 |
10 | Q2 | 5178.43 |
10 | Q3 | 5010.64 |
10 | Q4 | 4453.38 |
11 | Q1 | 4306.70 |
11 | Q2 | 5321.93 |
11 | Q3 | 4888.10 |
11 | Q4 | 4554.65 |
12 | Q1 | 4176.79 |
12 | Q2 | 5125.40 |
12 | Q3 | 4962.65 |
12 | Q4 | 4917.63 |
13 | Q1 | 4542.60 |
13 | Q2 | 5284.71 |
13 | Q3 | 4817.43 |
13 | Q4 | 4634.50 |
ans) 1) First I have exported the excel file to R software. Then I have plotted the the data by scatterplot. The R code is given by:
d=read.csv("ts1.csv")
p1=plot(d$Period,d$Sales,type="l")
The plot is is as follows:
From the graph we can observe that the time series contains increasing linear trend component and for each 4 quarters it repeats the same pattern so it also contains the seasonal component.
2) Now the data contains both trend and seasonality so we can develop a regression model to this data. For this I have made three dummy variables namely Q1,Q2 and Q3. Q1 takes value 1 if the sales value corresponds to quarter 1 and takes value 0 otherwise. Similarly Q2 takes value 1 if the sales value corresponds to quarter 2 and takes value 0 otherwise.Similarly Q3 takes value 1 if the sales value corresponds to quarter 3 and takes value 0 otherwise. As we have seasonal period 4 so I introduced 3 dummy variable. I have also created a variable called 'Period' which takes value 1 to 52 for the 13 years and for each period ( Year 1 quarter 1 : 1, Year 1 quarter 2 : 2, Year 1 quarter 3 : 3,...Year 13 quarter 4 : 52).I have made these four variables in excel. For dummy variable I used IF() function of excel. If variable name Q1 matches value of quarter Q1 then it will take value 1 and 0 otherwise.
Now I have exported the data and developed the regression model in R software. The R code is given by:
f1=lm(Sales~Q1+Q2+Q3+Period,data=d)
summary(f1)
The result is given as:
Call:
lm(formula = Sales ~ Q1 + Q2 + Q3 + Period, data = d)
Coefficients:
(Intercept) Q1 Q2 Q3 Period
1377.59 -189.01 501.48 310.40 70.59
Call:
lm(formula = Sales ~ Q1 + Q2 + Q3 + Period, data = d)
Residuals:
Min 1Q Median 3Q Max
-497.86 -180.57 -70.24 172.01 692.05
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1377.593 115.922 11.884 9.17e-16 ***
Q1 -189.009 120.024 -1.575 0.122020
Q2 501.484 119.858 4.184 0.000124 ***
Q3 310.402 119.757 2.592 0.012677 *
Period 70.595 2.828 24.961 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 305.2 on 47 degrees of freedom
Multiple R-squared: 0.9345, Adjusted R-squared: 0.9289
F-statistic: 167.6 on 4 and 47 DF, p-value: < 2.2e-16
Now from the above result we can see that the model is pretty well fitted as Adjusted R-squared is coming out to be 0.9289
3) For forecasting the R code is given by:
Q11=c(1,0,0,0)
Q21=c(0,1,0,0)
Q31=c(0,0,1,0)
Period1=c(53,54,55,56)
x_mat=as.matrix(cbind(c(1,1,1,1),Q11,Q21,Q31,Period1))
beta=c(1377.59,-189.01,501.48,310.40,70.59)
pred1=x_mat%*%beta
The forecasting values are : Year Quarter Predicted
14 1 4929.85
14 2 5690.93
14 3 5570.44
14 4 5330.63
4)The R code for MSE, MAD and MAPE is given by:
MSE=mean(f1$residuals ^ 2)
MAD=mean(abs(f1$residuals))
MAPE=(mean((abs(f1$residuals))/(abs(d$Sales))))*100
The values are given by:
> MSE
[1] 84211.27
> MAD
[1] 230.013
> MAPE
[1] 7.572122
From the MAPE value we can see the error percentage is only 7.57% So the model is pretty good. So the forecasting values for year 14 is also pretty good.