Analysis of Unobserved Component Models Using PROC UCM
Analyze the following data with PROC UCM, using the statistical software SAS or R.
The data contain the sugar price index from January 2010 to December 2015.
Show the code used and the results.
Date | valor |
---|---|
Jan-10 | 375.5 |
Feb-10 | 360.8 |
Mar-10 | 264.8 |
Apr-10 | 233.4 |
May-10 | 215.7 |
Jun-10 | 224.9 |
Jul-10 | 247.4 |
Aug-10 | 262.7 |
Sep-10 | 318.1 |
Oct-10 | 349.3 |
Nov-10 | 373.4 |
Dec-10 | 398.4 |
Jan-11 | 420.2 |
Feb-11 | 418.2 |
Mar-11 | 372.3 |
Apr-11 | 345.7 |
May-11 | 312.2 |
Jun-11 | 357.7 |
Jul-11 | 400.4 |
Aug-11 | 393.7 |
Sep-11 | 379 |
Oct-11 | 361.2 |
Nov-11 | 339.9 |
Dec-11 | 326.9 |
Jan-12 | 334.3 |
Feb-12 | 342.3 |
Mar-12 | 341.9 |
Apr-12 | 324 |
May-12 | 294.6 |
Jun-12 | 290.4 |
Jul-12 | 324.3 |
Aug-12 | 296.2 |
Sep-12 | 283.7 |
Oct-12 | 288.2 |
Nov-12 | 274.5 |
Dec-12 | 274 |
Jan-13 | 267.8 |
Feb-13 | 259.2 |
Mar-13 | 262 |
Apr-13 | 252.6 |
May-13 | 250.1 |
Jun-13 | 242.6 |
Jul-13 | 239 |
Aug-13 | 241.7 |
Sep-13 | 246.5 |
Oct-13 | 264.8 |
Nov-13 | 250.6 |
Dec-13 | 234.9 |
Jan-14 | 221.7 |
Feb-14 | 235.4 |
Mar-14 | 254 |
Apr-14 | 249.9 |
May-14 | 259.3 |
Jun-14 | 258 |
Jul-14 | 259.1 |
Aug-14 | 244.3 |
Sep-14 | 228.1 |
Oct-14 | 237.6 |
Nov-14 | 229.7 |
Dec-14 | 217.5 |
Jan-15 | 217.7 |
Feb-15 | 207.1 |
Mar-15 | 187.9 |
Apr-15 | 185.5 |
May-15 | 189.3 |
Jun-15 | 176.8 |
Jul-15 | 181.2 |
Aug-15 | 163.2 |
Sep-15 | 168.4 |
Oct-15 | 197.4 |
Nov-15 | 206.5 |
Dec-15 | 207.8 |
Solution:
A DATA step creates the sugar data set.
PROC UCM is then specified in the following order. First, name the input data set in the PROC statement. Second, use the ID statement with its INTERVAL= option to specify the time interval between observations; the values of the ID variable are extrapolated for the forecast observations according to the INTERVAL= value. Next, use the MODEL statement to specify the dependent variable; any predictors are listed on the right-hand side of the equation in the same statement. Finally, the IRREGULAR statement specifies the irregular component, the LEVEL and SLOPE statements specify the trend component, and the CYCLE statement specifies the cycle component.
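For orientation, the model that the IRREGULAR, LEVEL, SLOPE, and CYCLE statements assemble can be written out as below. This is the standard unobserved components form; the symbols are notation introduced here for illustration and do not appear in the original question.

```latex
\begin{align*}
y_t   &= \mu_t + \psi_t + \varepsilon_t,
        & \varepsilon_t &\sim N(0, \sigma^2_\varepsilon)
        && \text{(series = level + cycle + irregular)} \\
\mu_t &= \mu_{t-1} + \beta_{t-1} + \eta_t,
        & \eta_t &\sim N(0, \sigma^2_\eta)
        && \text{(LEVEL)} \\
\beta_t &= \beta_{t-1} + \xi_t,
        & \xi_t &\sim N(0, \sigma^2_\xi)
        && \text{(SLOPE)} \\
\begin{pmatrix} \psi_t \\ \psi^*_t \end{pmatrix}
      &= \rho
         \begin{pmatrix} \cos\lambda & \sin\lambda \\ -\sin\lambda & \cos\lambda \end{pmatrix}
         \begin{pmatrix} \psi_{t-1} \\ \psi^*_{t-1} \end{pmatrix}
       + \begin{pmatrix} \nu_t \\ \nu^*_t \end{pmatrix},
        & \nu_t, \nu^*_t &\sim N(0, \sigma^2_\nu)
        && \text{(CYCLE)}
\end{align*}
```

Here $\rho$ is the damping factor and $\lambda = 2\pi/\text{period}$ is the cycle frequency. Setting $\sigma^2_\eta = \sigma^2_\xi = 0$ reduces the trend to a deterministic straight line.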
SAS Code:
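data sugar ;
/* read one value per iteration; @@ holds the input line across iterations */
input valor @@ ;
/* build a monthly date that starts in January 2010 */
date = intnx('month','1jan2010'd,_n_-1) ;
format date monyy5. ;
datalines ;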
375.5
360.8
264.8
233.4
215.7
224.9
247.4
262.7
318.1
349.3
373.4
398.4
420.2
418.2
372.3
345.7
312.2
357.7
400.4
393.7
379
361.2
339.9
326.9
334.3
342.3
341.9
324
294.6
290.4
324.3
296.2
283.7
288.2
274.5
274
267.8
259.2
262
252.6
250.1
242.6
239
241.7
246.5
264.8
250.6
234.9
221.7
235.4
254
249.9
259.3
258
259.1
244.3
228.1
237.6
229.7
217.5
217.7
207.1
187.9
185.5
189.3
176.8
181.2
163.2
168.4
197.4
206.5
207.8
;
run ;
proc ucm data = sugar;
id date interval = month;
model valor ;
irregular ;
level ;
slope ;
cycle ;
run ;
SAS code to forecast:
ods html ;
ods graphics on ;
proc ucm data = sugar;
id date interval = month;
model valor ;
irregular ;
level variance=0 noest ;
slope variance=0 noest ;
cycle plot=smooth ;
estimate back=5 plot=(normal acf);
forecast lead=10 back=5 plot=decomp;
run ;
ods graphics off ;
ods html close ;
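The ID, MODEL, and IRREGULAR statements appear as they did in the first model. In this model, however, the remaining statements carry specific options:
LEVEL and SLOPE: VARIANCE=0 with NOEST fixes both disturbance variances at zero, so the trend is a deterministic straight line.
CYCLE: PLOT=SMOOTH requests a plot of the smoothed cycle component.
ESTIMATE: BACK=5 holds out the last five observations from estimation, and PLOT=(NORMAL ACF) requests a residual histogram and a residual autocorrelation plot.
FORECAST: LEAD=10 with BACK=5 produces ten forecasts that start five observations before the end of the sample, and PLOT=DECOMP requests plots of the series decomposition.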
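If the forecasts are needed as data and not only as printed output, one option is to add OUTFOR= to the FORECAST statement. The sketch below repeats the second model without the plot requests; the data set name sugar_fc is a hypothetical name chosen here, and the variable list assumes the usual OUTFOR= columns (FORECAST, STD, LCL, UCL).

```sas
/* Sketch: same model as the forecasting run, with forecasts saved to a data set. */
/* sugar_fc is a hypothetical output data set name.                              */
proc ucm data = sugar;
   id date interval = month;
   model valor;
   irregular;
   level variance=0 noest;
   slope variance=0 noest;
   cycle;
   estimate back=5;
   forecast lead=10 back=5 outfor=sugar_fc;
run;

/* Print the forecast rows: 5 held-out months plus 5 months beyond the sample. */
proc print data = sugar_fc(firstobs=68) noobs;
   var date valor forecast std lcl ucl;
run;
```

Because BACK=5 moves the forecast origin to July 2015, rows 68 onward of the saved data set should correspond to the forecast table shown in the output.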
Output:
The UCM Procedure
Input Data Set | |
---|---|
Name | WORK.SUGAR |
Time ID Variable | date |
Estimation Span Summary | |||||||||
---|---|---|---|---|---|---|---|---|---|
Variable | Type | First Obs | Last Obs | NObs | NMiss | Min | Max | Mean | Standard Deviation |
valor | Dependent | JAN2010 | JUL2015 | 67 | 0 | 176.80000 | 420.20000 | 283.16567 | 63.43541 |
Forecast Span Summary | |||||||||
---|---|---|---|---|---|---|---|---|---|
Variable | Type | First Obs | Last Obs | NObs | NMiss | Min | Max | Mean | Standard Deviation |
valor | Dependent | JAN2010 | JUL2015 | 67 | 0 | 176.80000 | 420.20000 | 283.16567 | 63.43541 |
Fixed Parameters in the Model | ||
---|---|---|
Component | Parameter | Value |
Level | Error Variance | 0 |
Slope | Error Variance | 0 |
Preliminary Estimates of the Free Parameters | ||
---|---|---|
Component | Parameter | Estimate |
Irregular | Error Variance | 12623 |
Cycle | Damping Factor | 0.90000 |
Cycle | Period | 22.00000 |
Cycle | Error Variance | 7889.63785 |
Likelihood Based Fit Statistics | |
---|---|
Statistic | Value |
Diffuse Log Likelihood | -290.6 |
Diffuse Part of Log Likelihood | -6E-15 |
Non-Missing Observations Used | 67 |
Estimated Parameters | 4 |
Initialized Diffuse State Elements | 2 |
Normalized Residual Sum of Squares | 65 |
AIC (smaller is better) | 589.29 |
BIC (smaller is better) | 597.99 |
AICC (smaller is better) | 589.96 |
HQIC (smaller is better) | 592.73 |
CAIC (smaller is better) | 601.99 |
Likelihood Optimization Algorithm Converged in 15 Iterations. |
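The information criteria in this table can be cross-checked by hand. The sketch below backs out -2 log L from the reported AIC (the log likelihood is printed with only one decimal) and assumes an effective sample size of 67 - 2 = 65, that is, the observations used minus the diffuse state elements; that choice is an assumption made here because it reproduces the reported values.

```sas
data _null_;
   k    = 4;                 /* Estimated Parameters                          */
   n    = 67 - 2;            /* assumed effective sample size (see lead-in)   */
   aic  = 589.29;            /* reported AIC                                  */
   m2ll = aic - 2*k;         /* implied -2 * log likelihood                   */
   bic  = m2ll + k*log(n);
   aicc = aic + 2*k*(k+1)/(n-k-1);
   hqic = m2ll + 2*k*log(log(n));
   caic = m2ll + k*(log(n) + 1);
   put m2ll= 8.2 bic= 8.2 aicc= 8.2 hqic= 8.2 caic= 8.2;
run;
```

The values written to the log agree with the table to within about 0.01, the small gap coming from the rounded AIC used as input.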
Final Estimates of the Free Parameters | |||||
---|---|---|---|---|---|
Component | Parameter | Estimate | Approx Std Error | t Value | Approx Pr > |t| |
Irregular | Error Variance | 0.00005700 | 0.03952 | 0.00 | 0.9988 |
Cycle | Damping Factor | 0.93660 | 0.03462 | 27.05 | <.0001 |
Cycle | Period | 20.48846 | 3.08579 | 6.64 | <.0001 |
Cycle | Error Variance | 302.89658 | 156.75276 | 1.93 | 0.0533 |
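The t Value column is the estimate divided by its approximate standard error, and the p-values are consistent with a two-sided normal approximation. The sketch below recomputes both from the printed figures; the parameter labels in the data lines are shorthand invented here, and the normal approximation is an assumption that matches the table and is not stated in the output.

```sas
data _null_;
   input parameter :$20. estimate stderr;
   t = estimate / stderr;               /* t value = estimate / std error     */
   p = 2 * (1 - probnorm(abs(t)));      /* two-sided normal p-value (assumed) */
   put parameter= t= 8.2 p= 8.4;
   datalines;
Irregular_Variance 0.00005700 0.03952
Cycle_Damping 0.93660 0.03462
Cycle_Period 20.48846 3.08579
Cycle_Variance 302.89658 156.75276
;
run;
```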
Fit Statistics Based on Residuals | |
---|---|
Number of non-missing residuals used for computing the fit statistics = 65 | |
Mean Squared Error | 514.73102 |
Root Mean Squared Error | 22.68768 |
Mean Absolute Percentage Error | 5.72002 |
Maximum Percent Error | 21.24555 |
R-Square | 0.86649 |
Adjusted R-Square | 0.85992 |
Random Walk R-Square | -0.02630 |
Amemiya's Adjusted R-Square | 0.84898 |
Significance Analysis of Components (Based on the Final State) | |||
---|---|---|---|
Component | DF | Chi-Square | Pr > ChiSq |
Irregular | 1 | 0.00 | 0.9999 |
Level | 1 | 171.54 | <.0001 |
Slope | 1 | 44.91 | <.0001 |
Cycle | 2 | 1.41 | 0.4938 |
Trend Information (Based on the Final State) | ||
---|---|---|
Name | Estimate | Standard Error |
Level | 199.214661 | 15.210541 |
Slope | -2.71105724 | 0.4045468 |
Summary of Cycles | ||||||||
---|---|---|---|---|---|---|---|---|
Name | Type | Period | Frequency | Damping Factor | Final Amplitude | Percent Relative to Level | Cycle Variance | Error Variance |
Cycle | Stationary | 20.48846 | 0.30667 | 0.93660 | 20.43976 | 10.26017 | 2466.82898 | 302.89658 |
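Several entries in the cycle summary follow from the others by simple arithmetic: the frequency is 2π divided by the period, the percentage is the final amplitude relative to the final level, and for a stationary cycle the cycle variance is the error variance divided by one minus the squared damping factor. A minimal sketch using the printed (rounded) values:

```sas
data _null_;
   period    = 20.48846;     /* Cycle Period                       */
   rho       = 0.93660;      /* Damping Factor                     */
   err_var   = 302.89658;    /* Cycle Error Variance               */
   amplitude = 20.43976;     /* Final Amplitude                    */
   level     = 199.214661;   /* final Level from Trend Information */
   frequency = 2 * constant('pi') / period;
   pct_level = 100 * amplitude / level;
   cycle_var = err_var / (1 - rho**2);
   put frequency= 8.5 pct_level= 8.5 cycle_var= 10.2;
run;
```

The frequency and percentage reproduce the table; the recomputed cycle variance lands slightly above the reported 2466.82898 because the damping factor is only available to five decimals.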
Outlier Summary | |||||||
---|---|---|---|---|---|---|---|
Obs | date | Break Type | Estimate | Standard Error | Chi-Square | DF | Pr > ChiSq |
2 | FEB2010 | Additive Outlier | 41.88016 | 13.244587 | 10.00 | 1 | 0.0016 |
Forecasts for Variable valor | |||||
---|---|---|---|---|---|
Obs | date | Forecast | Standard Error | Lower 95% Confidence Limit | Upper 95% Confidence Limit |
68 | AUG2015 | 183.148775 | 19.90826 | 144.129300 | 222.168249 |
69 | SEP2015 | 185.746213 | 29.90993 | 127.123831 | 244.368595 |
70 | OCT2015 | 188.427390 | 37.57737 | 114.777093 | 262.077687 |
71 | NOV2015 | 190.689092 | 43.23737 | 105.945405 | 275.432779 |
72 | DEC2015 | 192.128238 | 47.05760 | 99.897029 | 284.359447 |
73 | JAN2016 | 192.466437 | 49.30378 | 95.832804 | 289.100070 |
74 | FEB2016 | 191.560125 | 50.35004 | 92.875866 | 290.244384 |
75 | MAR2016 | 189.397129 | 50.62478 | 90.174377 | 288.619881 |
76 | APR2016 | 186.081656 | 50.53206 | 87.040643 | 285.122668 |
77 | MAY2016 | 181.810467 | 50.37530 | 83.076693 | 280.544240 |
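The confidence limits in this table are the forecast plus or minus the 97.5% standard normal quantile times the standard error. The sketch below recomputes them for the first three forecast rows, reading the month as plain text for simplicity.

```sas
data _null_;
   z = probit(0.975);                 /* about 1.96 */
   input month :$7. forecast stderr;
   lcl = forecast - z * stderr;       /* lower 95% limit */
   ucl = forecast + z * stderr;       /* upper 95% limit */
   put month= lcl= 12.6 ucl= 12.6;
   datalines;
AUG2015 183.148775 19.90826
SEP2015 185.746213 29.90993
OCT2015 188.427390 37.57737
;
run;
```

The results match the printed limits up to rounding in the last decimals, since the standard errors are shown to five places only.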
Post Sample Predictions for valor | ||||||
---|---|---|---|---|---|---|
Obs | date | Actual | Forecast | Prediction Error | Sum of Squared Errors | Sum of Absolute Errors |
68 | AUG2015 | 163.2 | 183.1487746 | -19.9487746 | 397.9536076 | 19.94877459 |
69 | SEP2015 | 168.4 | 185.7462133 | -17.3462133 | 698.8447248 | 37.29498793 |
70 | OCT2015 | 197.4 | 188.4273904 | 8.972609574 | 779.3524473 | 46.2675975 |
71 | NOV2015 | 206.5 | 190.6890917 | 15.81090831 | 1029.337269 | 62.07850581 |
72 | DEC2015 | 207.8 | 192.128238 | 15.671762 | 1274.941393 | 77.75026782 |
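The last two columns of the post-sample table are running totals of the squared and absolute prediction errors, where the prediction error is actual minus forecast. A short sketch that rebuilds them from the Actual and Forecast columns:

```sas
data _null_;
   input month :$7. actual forecast;
   error = actual - forecast;     /* prediction error                        */
   sse + error**2;                /* sum statement: running sum of squares   */
   sae + abs(error);              /* running sum of absolute errors          */
   put month= error= 10.4 sse= 12.4 sae= 10.4;
   datalines;
AUG2015 163.2 183.1487746
SEP2015 168.4 185.7462133
OCT2015 197.4 188.4273904
NOV2015 206.5 190.6890917
DEC2015 207.8 192.128238
;
run;
```

The sum statement retains its value across iterations, so each line of the log shows the cumulative totals up to that month.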