Question

Analysis of Unobserved Component Models Using PROC UCM

Analyze the data below with PROC UCM, using the statistical software SAS or R.

The following data contain the sugar price index from January 2010 through December 2015.

Show the code used and the results.

DATE valor
Jan-10 375.5
Feb-10 360.8
Mar-10 264.8
Apr-10 233.4
May-10 215.7
Jun-10 224.9
Jul-10 247.4
Aug-10 262.7
Sep-10 318.1
Oct-10 349.3
Nov-10 373.4
Dec-10 398.4
Jan-11 420.2
Feb-11 418.2
Mar-11 372.3
Apr-11 345.7
May-11 312.2
Jun-11 357.7
Jul-11 400.4
Aug-11 393.7
Sep-11 379
Oct-11 361.2
Nov-11 339.9
Dec-11 326.9
Jan-12 334.3
Feb-12 342.3
Mar-12 341.9
Apr-12 324
May-12 294.6
Jun-12 290.4
Jul-12 324.3
Aug-12 296.2
Sep-12 283.7
Oct-12 288.2
Nov-12 274.5
Dec-12 274
Jan-13 267.8
Feb-13 259.2
Mar-13 262
Apr-13 252.6
May-13 250.1
Jun-13 242.6
Jul-13 239
Aug-13 241.7
Sep-13 246.5
Oct-13 264.8
Nov-13 250.6
Dec-13 234.9
Jan-14 221.7
Feb-14 235.4
Mar-14 254
Apr-14 249.9
May-14 259.3
Jun-14 258
Jul-14 259.1
Aug-14 244.3
Sep-14 228.1
Oct-14 237.6
Nov-14 229.7
Dec-14 217.5
Jan-15 217.7
Feb-15 207.1
Mar-15 187.9
Apr-15 185.5
May-15 189.3
Jun-15 176.8
Jul-15 181.2
Aug-15 163.2
Sep-15 168.4
Oct-15 197.4
Nov-15 206.5
Dec-15 207.8

Solutions

Expert Solution

The DATA step in the code below creates the sugar data set; PROC UCM is then specified as follows.

Begin by specifying the input data set in the PROC statement. Second, use the ID statement together with its INTERVAL= option to specify the time interval between observations. Note that the values of the ID variable are extrapolated for the forecast observations based on the value of the INTERVAL= option. Next, the MODEL statement specifies the dependent variable. If there are any predictors in the model, they are listed in the MODEL statement after an equal sign, to the right of the dependent variable. Finally, the IRREGULAR statement specifies the irregular component, the LEVEL and SLOPE statements specify the trend component, and the CYCLE statement specifies the cycle component.
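For reference, the components named above can be written out as equations. This is a sketch in the usual textbook notation for unobserved components models; the symbols are assumptions introduced here and do not appear in the original answer.

```latex
\begin{align*}
y_t     &= \mu_t + \psi_t + \varepsilon_t,
          & \varepsilon_t &\sim N(0, \sigma_\varepsilon^2)
          && \text{(series = trend + cycle + irregular)} \\
\mu_t   &= \mu_{t-1} + \beta_{t-1} + \eta_t,
          & \eta_t &\sim N(0, \sigma_\eta^2)
          && \text{(LEVEL)} \\
\beta_t &= \beta_{t-1} + \xi_t,
          & \xi_t &\sim N(0, \sigma_\xi^2)
          && \text{(SLOPE)} \\
\begin{pmatrix} \psi_t \\ \psi_t^{*} \end{pmatrix}
        &= \rho
           \begin{pmatrix} \cos\lambda & \sin\lambda \\ -\sin\lambda & \cos\lambda \end{pmatrix}
           \begin{pmatrix} \psi_{t-1} \\ \psi_{t-1}^{*} \end{pmatrix}
         + \begin{pmatrix} \nu_t \\ \nu_t^{*} \end{pmatrix},
          & \nu_t, \nu_t^{*} &\sim N(0, \sigma_\nu^2)
          && \text{(CYCLE)}
\end{align*}
```

Here \rho is the damping factor and \lambda the cycle frequency, so the period is 2\pi/\lambda. Setting \sigma_\eta^2 = \sigma_\xi^2 = 0 removes the random shocks from the trend and leaves a fixed straight line.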

SAS Code:

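/* Read the 72 monthly index values and build a matching date variable */
data sugar ;
input valor @@ ;
/* _n_ counts observations, so the dates run monthly from JAN2010 */
date = intnx('month','1jan2010'd,_n_-1) ;
format date monyy5. ;
datalines ;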
375.5
360.8
264.8
233.4
215.7
224.9
247.4
262.7
318.1
349.3
373.4
398.4
420.2
418.2
372.3
345.7
312.2
357.7
400.4
393.7
379
361.2
339.9
326.9
334.3
342.3
341.9
324
294.6
290.4
324.3
296.2
283.7
288.2
274.5
274
267.8
259.2
262
252.6
250.1
242.6
239
241.7
246.5
264.8
250.6
234.9
221.7
235.4
254
249.9
259.3
258
259.1
244.3
228.1
237.6
229.7
217.5
217.7
207.1
187.9
185.5
189.3
176.8
181.2
163.2
168.4
197.4
206.5
207.8
;
run ;

/* Basic model: irregular + trend (level and slope) + cycle */
proc ucm data = sugar;
id date interval = month;
model valor ;
irregular ;
level ;
slope ;
cycle ;
run ;

SAS code to produce forecasts:

ods html ;
ods graphics on ;
proc ucm data = sugar;
id date interval = month;
model valor ;
irregular ;
level variance=0 noest ;
slope variance=0 noest ;
cycle plot=smooth ;
estimate back=5 plot=(normal acf);
forecast lead=10 back=5 plot=decomp;
run ;
ods graphics off ;
ods html close ;

The ID, MODEL, and IRREGULAR statements appear as they did in the first model. In this model, however, specific options are set in the remaining component statements:

  • In the LEVEL and SLOPE statements, the variances are set to zero to create a model with a fixed linear trend. The NOEST option is also included in these statements so that the variances stay fixed at the specified values instead of being estimated.
  • In the CYCLE statement, the PLOT= option plots the smoothed estimate of the cycle component.
  • In the ESTIMATE statement, the BACK= option controls the span of observations used in parameter estimation. Here BACK=5 specifies a hold-out sample of five observations, which are omitted from the estimation. The PLOT= option requests residual diagnostic plots (here, the normality and ACF plots).
  • In the FORECAST statement, the LEAD= option specifies the number of periods to forecast; here it requests 10 multi-step forecasts. The BACK= option tells PROC UCM to begin the multi-step forecasts five observations back from the end of the historical data, which corresponds to the beginning of the hold-out sample specified by the BACK= option in the ESTIMATE statement. Thus a total of 10 multi-step forecasts are produced: five covering the hold-out sample and five extending into the future. Finally, the PLOT= option generates the series decomposition plots.
  • The ods graphics on; statement enables ODS Graphics. The PLOT= options in the CYCLE, ESTIMATE, and FORECAST statements cause ODS to produce high-resolution plots of the specified components. The ods graphics off; statement turns the graphics system off again. Note that ODS Graphics is experimental in SAS 9 and 9.1; it became a production feature in later releases.
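If the forecasts are needed as data, not only as printed output, the FORECAST statement's OUTFOR= option can write them to a data set. The following is a minimal sketch of that variation, not part of the original answer: the data set name ucm_fc is an arbitrary choice made here, and the plot options are dropped for brevity.

```sas
/* Sketch: same fixed-trend model, with forecasts saved to a data set. */
/* The name ucm_fc is a hypothetical choice, not from the original.    */
proc ucm data = sugar;
   id date interval = month;
   model valor ;
   irregular ;
   level variance=0 noest ;
   slope variance=0 noest ;
   cycle ;
   estimate back=5 ;
   forecast lead=10 back=5 outfor=ucm_fc ;
run ;

/* List the rows from the start of the hold-out period (AUG2015) onward */
proc print data = ucm_fc ;
   where date >= '01aug2015'd ;
run ;
```

Filtering on the ID variable date avoids relying on observation numbers in the output data set.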

Output:

The UCM Procedure

Input Data Set
Name              WORK.SUGAR
Time ID Variable  date

Estimation Span Summary
Variable  Type       First Obs  Last Obs  NObs  NMiss  Min        Max        Mean       Standard Deviation
valor     Dependent  JAN2010    JUL2015   67    0      176.80000  420.20000  283.16567  63.43541

Forecast Span Summary
Variable  Type       First Obs  Last Obs  NObs  NMiss  Min        Max        Mean       Standard Deviation
valor     Dependent  JAN2010    JUL2015   67    0      176.80000  420.20000  283.16567  63.43541

Fixed Parameters in the Model
Component  Parameter       Value
Level      Error Variance  0
Slope      Error Variance  0

Preliminary Estimates of the Free Parameters
Component  Parameter       Estimate
Irregular  Error Variance  12623
Cycle      Damping Factor  0.90000
Cycle      Period          22.00000
Cycle      Error Variance  7889.63785

Likelihood Based Fit Statistics
Statistic                           Value
Diffuse Log Likelihood              -290.6
Diffuse Part of Log Likelihood      -6E-15
Non-Missing Observations Used       67
Estimated Parameters                4
Initialized Diffuse State Elements  2
Normalized Residual Sum of Squares  65
AIC (smaller is better)             589.29
BIC (smaller is better)             597.99
AICC (smaller is better)            589.96
HQIC (smaller is better)            592.73
CAIC (smaller is better)            601.99

Likelihood Optimization Algorithm Converged in 15 Iterations.

Final Estimates of the Free Parameters
Component  Parameter       Estimate    Approx Std Error  t Value  Approx Pr > |t|
Irregular  Error Variance  0.00005700  0.03952           0.00     0.9988
Cycle      Damping Factor  0.93660     0.03462           27.05    <.0001
Cycle      Period          20.48846    3.08579           6.64     <.0001
Cycle      Error Variance  302.89658   156.75276         1.93     0.0533

Fit Statistics Based on Residuals
Number of non-missing residuals used for computing the fit statistics = 65
Mean Squared Error              514.73102
Root Mean Squared Error         22.68768
Mean Absolute Percentage Error  5.72002
Maximum Percent Error           21.24555
R-Square                        0.86649
Adjusted R-Square               0.85992
Random Walk R-Square            -0.02630
Amemiya's Adjusted R-Square     0.84898

Significance Analysis of Components (Based on the Final State)
Component  DF  Chi-Square  Pr > ChiSq
Irregular  1   0.00        0.9999
Level      1   171.54      <.0001
Slope      1   44.91       <.0001
Cycle      2   1.41        0.4938

Trend Information (Based on the Final State)
Name   Estimate     Standard Error
Level  199.214661   15.210541
Slope  -2.71105724  0.4045468

Summary of Cycles
Name   Type        Period    Frequency  Damping Factor  Final Amplitude  Percent Relative to Level  Cycle Variance  Error Variance
Cycle  Stationary  20.48846  0.30667    0.93660         20.43976         10.26017                   2466.82898      302.89658

Outlier Summary
Obs  date     Break Type        Estimate  Standard Error  Chi-Square  DF  Pr > ChiSq
2    FEB2010  Additive Outlier  41.88016  13.244587       10.00       1   0.0016

Forecasts for Variable valor
Obs  date     Forecast    Standard Error  95% Confidence Limits
68   AUG2015  183.148775  19.90826        144.129300  222.168249
69   SEP2015  185.746213  29.90993        127.123831  244.368595
70   OCT2015  188.427390  37.57737        114.777093  262.077687
71   NOV2015  190.689092  43.23737        105.945405  275.432779
72   DEC2015  192.128238  47.05760        99.897029   284.359447
73   JAN2016  192.466437  49.30378        95.832804   289.100070
74   FEB2016  191.560125  50.35004        92.875866   290.244384
75   MAR2016  189.397129  50.62478        90.174377   288.619881
76   APR2016  186.081656  50.53206        87.040643   285.122668
77   MAY2016  181.810467  50.37530        83.076693   280.544240

Post Sample Predictions for valor
Obs  date     Actual  Forecast     Prediction Error  Sum of Squared Errors  Sum of Absolute Errors
68   AUG2015  163.2   183.1487746  -19.9487746       397.9536076            19.94877459
69   SEP2015  168.4   185.7462133  -17.3462133       698.8447248            37.29498793
70   OCT2015  197.4   188.4273904  8.972609574       779.3524473            46.2675975
71   NOV2015  206.5   190.6890917  15.81090831       1029.337269            62.07850581
72   DEC2015  207.8   192.128238   15.671762         1274.941393            77.75026782
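Two quick arithmetic checks help in reading this output. They are derived here from the printed values and are not part of the PROC UCM output itself; the symbols follow common notation for unobserved components models and are assumptions of this sketch.

```latex
\begin{align*}
% Cycle summary: frequency from the period, and stationary cycle variance
\lambda &= \frac{2\pi}{\text{period}} = \frac{2\pi}{20.48846} \approx 0.30667, \\
\sigma_\psi^2 &= \frac{\sigma_\nu^2}{1-\rho^2}
              = \frac{302.89658}{1-0.93660^2} \approx 2467, \\[6pt]
% Hold-out accuracy from the cumulative sums in the last post-sample row
\text{RMSE}_{\text{hold-out}} &= \sqrt{\frac{1274.941393}{5}} \approx 15.97, \\
\text{MAE}_{\text{hold-out}}  &= \frac{77.75026782}{5} \approx 15.55.
\end{align*}
```

The first two values agree with the Frequency and Cycle Variance columns of the cycle summary, up to rounding of the damping factor. The in-sample root mean squared error of 22.69 is based on one-step-ahead residuals, so it is not directly comparable with the multi-step hold-out figures. All five hold-out actuals fall inside the corresponding 95% confidence limits.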

