Question

Analysis of Unobserved Component Models Using PROC UCM

Analyze the data below with PROC UCM, using the statistical software SAS or R.

The data contain the sugar price index from January 2010 to December 2015.

Show the code used and the results.

DATE	valor
jan-10	375.5
feb-10	360.8
mar-10	264.8
apr-10	233.4
may-10	215.7
jun-10	224.9
jul-10	247.4
aug-10	262.7
sep-10	318.1
oct-10	349.3
nov-10	373.4
dec-10	398.4
jan-11	420.2
feb-11	418.2
mar-11	372.3
apr-11	345.7
may-11	312.2
jun-11	357.7
jul-11	400.4
aug-11	393.7
sep-11	379
oct-11	361.2
nov-11	339.9
dec-11	326.9
jan-12	334.3
feb-12	342.3
mar-12	341.9
apr-12	324
may-12	294.6
jun-12	290.4
jul-12	324.3
aug-12	296.2
sep-12	283.7
oct-12	288.2
nov-12	274.5
dec-12	274
jan-13	267.8
feb-13	259.2
mar-13	262
apr-13	252.6
may-13	250.1
jun-13	242.6
jul-13	239
aug-13	241.7
sep-13	246.5
oct-13	264.8
nov-13	250.6
dec-13	234.9
jan-14	221.7
feb-14	235.4
mar-14	254
apr-14	249.9
may-14	259.3
jun-14	258
jul-14	259.1
aug-14	244.3
sep-14	228.1
oct-14	237.6
nov-14	229.7
dec-14	217.5
jan-15	217.7
feb-15	207.1
mar-15	187.9
apr-15	185.5
may-15	189.3
jun-15	176.8
jul-15	181.2
aug-15	163.2
sep-15	168.4
oct-15	197.4
nov-15	206.5
dec-15	207.8

Expert Solution

Sol:

A DATA step creates the sugar data set, and PROC UCM then fits the model.

In PROC UCM, begin by naming the input data set in the PROC statement. Second, use the ID statement with its INTERVAL= option to specify the time interval between observations. The values of the ID variable are extrapolated for the forecast observations according to the INTERVAL= value. Next, use the MODEL statement to specify the dependent variable. Any predictors go on the right-hand side of the equation in the MODEL statement; this model has none. Finally, the IRREGULAR statement specifies the irregular component, the LEVEL and SLOPE statements together specify the trend component, and the CYCLE statement specifies the cycle component.

SAS Code:

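/* Read the 72 monthly index values; the dates are generated, not read */
data sugar ;
input valor @@ ;
date = intnx('month','1jan2010'd,_n_-1) ; /* Jan 2010 plus (_n_-1) months */
format date monyy5. ;
datalines ;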
375.5
360.8
264.8
233.4
215.7
224.9
247.4
262.7
318.1
349.3
373.4
398.4
420.2
418.2
372.3
345.7
312.2
357.7
400.4
393.7
379
361.2
339.9
326.9
334.3
342.3
341.9
324
294.6
290.4
324.3
296.2
283.7
288.2
274.5
274
267.8
259.2
262
252.6
250.1
242.6
239
241.7
246.5
264.8
250.6
234.9
221.7
235.4
254
249.9
259.3
258
259.1
244.3
228.1
237.6
229.7
217.5
217.7
207.1
187.9
185.5
189.3
176.8
181.2
163.2
168.4
197.4
206.5
207.8
;
run ;

/* Basic UCM: irregular + trend (level and slope) + cycle, all variances estimated */
proc ucm data = sugar;
id date interval = month;
model valor ;
irregular ;
level ;
slope ;
cycle ;
run ;
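For reference, the model requested by the statements above can be written out. This is a sketch based on the standard UCM formulation (a local linear trend plus a damped stochastic cycle); the symbols are notation introduced here and do not appear in the SAS output:

```latex
\begin{aligned}
y_t &= \mu_t + \psi_t + \varepsilon_t, & \varepsilon_t &\sim N(0, \sigma_\varepsilon^2) && \text{(MODEL, IRREGULAR)} \\
\mu_t &= \mu_{t-1} + \beta_{t-1} + \eta_t, & \eta_t &\sim N(0, \sigma_\eta^2) && \text{(LEVEL)} \\
\beta_t &= \beta_{t-1} + \xi_t, & \xi_t &\sim N(0, \sigma_\xi^2) && \text{(SLOPE)} \\
\begin{pmatrix} \psi_t \\ \psi_t^{*} \end{pmatrix}
  &= \rho \begin{pmatrix} \cos\lambda & \sin\lambda \\ -\sin\lambda & \cos\lambda \end{pmatrix}
     \begin{pmatrix} \psi_{t-1} \\ \psi_{t-1}^{*} \end{pmatrix}
   + \begin{pmatrix} \nu_t \\ \nu_t^{*} \end{pmatrix},
  & \nu_t, \nu_t^{*} &\sim N(0, \sigma_\nu^2) && \text{(CYCLE)}
\end{aligned}
```

Here y_t is valor, rho is the cycle damping factor, and the cycle period is 2*pi/lambda. If both the level and slope variances are zero, the trend reduces to a straight line, mu_t = mu_0 + beta_0 * t.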

SAS code to produce forecasts:

ods html ;
ods graphics on ;
proc ucm data = sugar;
id date interval = month;
model valor ;
irregular ;
level variance=0 noest ;
slope variance=0 noest ;
cycle plot=smooth ;
estimate back=5 plot=(normal acf);
forecast lead=10 back=5 plot=decomp;
run ;
ods graphics off ;
ods html close ;

The ID, MODEL, and IRREGULAR statements are the same as in the first model. The remaining statements add options:

In the LEVEL and SLOPE statements, VARIANCE=0 sets the disturbance variances to zero, which yields a fixed (deterministic) linear trend. The NOEST option holds each variance at that value instead of estimating it.
In the CYCLE statement, PLOT=SMOOTH plots the smoothed estimate of the cycle component.
In the ESTIMATE statement, the BACK= option controls the span of observations used in parameter estimation. Here BACK=5 holds out the last five observations (August to December 2015), so the parameters are estimated from the first 67. The PLOT= option requests residual diagnostic plots, here a normality check and the residual autocorrelations.
In the FORECAST statement, the LEAD= option sets the number of periods to forecast; here LEAD=10 produces 10 multi-step forecasts. The BACK= option tells PROC UCM to begin the multi-step forecasts five observations before the end of the historical data, which is the start of the hold-out sample defined by BACK= in the ESTIMATE statement. The 10 forecasts therefore cover the five hold-out months and five months beyond the data (January to May 2016). Finally, PLOT=DECOMP generates the series decomposition plots.
The ODS GRAPHICS ON statement enables ODS Graphics, which the PLOT= options on the CYCLE, ESTIMATE, and FORECAST statements need in order to produce high-resolution plots. The ODS GRAPHICS OFF statement disables it again. Note that ODS Graphics is experimental in SAS 9 and 9.1; it is production from SAS 9.2 onward.
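The hold-out comparison described above can also be done in a data set rather than read off the printed output. The following is a minimal sketch, assuming the OUTFOR= option of the FORECAST statement and that the resulting data set carries the ID variable, the dependent variable, and a FORECAST column. The data set names fc and holdout are hypothetical, and the plots are omitted:

```sas
/* Same fixed-trend model; forecasts are saved to work.fc (hypothetical name) */
proc ucm data = sugar;
   id date interval = month;
   model valor ;
   irregular ;
   level variance=0 noest ;
   slope variance=0 noest ;
   cycle ;
   estimate back=5 ;
   forecast lead=10 back=5 outfor=work.fc ;
run ;

/* Keep the five hold-out months (Aug-Dec 2015), where actual values exist */
data holdout ;
   set work.fc ;
   where date >= '01aug2015'd and valor is not missing ;
   error     = valor - forecast ;   /* actual minus multi-step forecast */
   abs_error = abs(error) ;
   sq_error  = error**2 ;
run ;

/* Mean of abs_error is the hold-out MAE; mean of sq_error is the hold-out MSE */
proc means data = holdout n mean ;
   var abs_error sq_error ;
run ;
```

If the saved forecasts match the Post Sample Predictions table printed by PROC UCM, the two means should agree with the final cumulative sums in that table divided by five.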

Output:

The UCM Procedure

Input Data Set
Name	WORK.SUGAR
Time ID Variable	date

Estimation Span Summary
Variable	Type	First Obs	Last Obs	NObs	NMiss	Min	Max	Mean	Standard Deviation
valor	Dependent	JAN2010	JUL2015	67	0	176.80000	420.20000	283.16567	63.43541

Forecast Span Summary
Variable	Type	First Obs	Last Obs	NObs	NMiss	Min	Max	Mean	Standard Deviation
valor	Dependent	JAN2010	JUL2015	67	0	176.80000	420.20000	283.16567	63.43541

Fixed Parameters in the Model
Component	Parameter	Value
Level	Error Variance	0
Slope	Error Variance	0

Preliminary Estimates of the Free Parameters
Component	Parameter	Estimate
Irregular	Error Variance	12623
Cycle	Damping Factor	0.90000
Cycle	Period	22.00000
Cycle	Error Variance	7889.63785

Likelihood Based Fit Statistics
Statistic	Value
Diffuse Log Likelihood	-290.6
Diffuse Part of Log Likelihood	-6E-15
Non-Missing Observations Used	67
Estimated Parameters	4
Initialized Diffuse State Elements	2
Normalized Residual Sum of Squares	65
AIC (smaller is better)	589.29
BIC (smaller is better)	597.99
AICC (smaller is better)	589.96
HQIC (smaller is better)	592.73
CAIC (smaller is better)	601.99

Likelihood Optimization Algorithm Converged in 15 Iterations.

Final Estimates of the Free Parameters
Component	Parameter	Estimate	Approx Std Error	t Value	Approx Pr > |t|
Irregular	Error Variance	0.00005700	0.03952	0.00	0.9988
Cycle	Damping Factor	0.93660	0.03462	27.05	<.0001
Cycle	Period	20.48846	3.08579	6.64	<.0001
Cycle	Error Variance	302.89658	156.75276	1.93	0.0533

Fit Statistics Based on Residuals
Number of non-missing residuals used for computing the fit statistics = 65
Mean Squared Error	514.73102
Root Mean Squared Error	22.68768
Mean Absolute Percentage Error	5.72002
Maximum Percent Error	21.24555
R-Square	0.86649
Adjusted R-Square	0.85992
Random Walk R-Square	-0.02630
Amemiya's Adjusted R-Square	0.84898

Significance Analysis of Components (Based on the Final State)
Component	DF	Chi-Square	Pr > ChiSq
Irregular	1	0.00	0.9999
Level	1	171.54	<.0001
Slope	1	44.91	<.0001
Cycle	2	1.41	0.4938

Trend Information (Based on the Final State)
Name	Estimate	Standard Error
Level	199.214661	15.210541
Slope	-2.71105724	0.4045468

Summary of Cycles
Name	Type	Period	Frequency	Damping Factor	Final Amplitude	Percent Relative to Level	Cycle Variance	Error Variance
Cycle	Stationary	20.48846	0.30667	0.93660	20.43976	10.26017	2466.82898	302.89658
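
The entries in the Summary of Cycles table are linked by simple relations that can be checked by hand. This is a sketch assuming the usual stationary-cycle formulas; the symbols (lambda for frequency, rho for damping factor, sigma_nu^2 for the cycle error variance) are notation introduced here:

```latex
\begin{aligned}
\lambda &= \frac{2\pi}{\text{Period}} = \frac{2\pi}{20.48846} \approx 0.30667 \\
\operatorname{Var}(\psi_t) &= \frac{\sigma_\nu^2}{1 - \rho^2}
  = \frac{302.89658}{1 - 0.93660^2} \approx 2467 \\
\text{Percent relative to level} &= 100 \times \frac{20.43976}{199.214661} \approx 10.26
\end{aligned}
```

The small gap between 2467 and the reported cycle variance of 2466.82898 comes from using the damping factor rounded to five decimals.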

Outlier Summary
Obs	date	Break Type	Estimate	Standard Error	Chi-Square	DF	Pr > ChiSq
2	FEB2010	Additive Outlier	41.88016	13.244587	10.00	1	0.0016

Forecasts for Variable valor
Obs	date	Forecast	Standard Error	Lower 95% Limit	Upper 95% Limit
68	AUG2015	183.148775	19.90826	144.129300	222.168249
69	SEP2015	185.746213	29.90993	127.123831	244.368595
70	OCT2015	188.427390	37.57737	114.777093	262.077687
71	NOV2015	190.689092	43.23737	105.945405	275.432779
72	DEC2015	192.128238	47.05760	99.897029	284.359447
73	JAN2016	192.466437	49.30378	95.832804	289.100070
74	FEB2016	191.560125	50.35004	92.875866	290.244384
75	MAR2016	189.397129	50.62478	90.174377	288.619881
76	APR2016	186.081656	50.53206	87.040643	285.122668
77	MAY2016	181.810467	50.37530	83.076693	280.544240

Post Sample Predictions for valor
Obs	date	Actual	Forecast	Prediction Error	Sum of Squared Errors	Sum of Absolute Errors
68	AUG2015	163.2	183.1487746	-19.9487746	397.9536076	19.94877459
69	SEP2015	168.4	185.7462133	-17.3462133	698.8447248	37.29498793
70	OCT2015	197.4	188.4273904	8.972609574	779.3524473	46.2675975
71	NOV2015	206.5	190.6890917	15.81090831	1029.337269	62.07850581
72	DEC2015	207.8	192.128238	15.671762	1274.941393	77.75026782
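
The cumulative sums in the last row of the table give the hold-out accuracy directly. As a worked check over the n = 5 hold-out months (the labels RMSE and MAE for the hold-out period are introduced here; PROC UCM prints only the running sums):

```latex
\begin{aligned}
\text{MSE}_{\text{holdout}} &= \frac{1274.941393}{5} \approx 254.99 \\
\text{RMSE}_{\text{holdout}} &= \sqrt{254.99} \approx 15.97 \\
\text{MAE}_{\text{holdout}} &= \frac{77.75026782}{5} \approx 15.55
\end{aligned}
```

The hold-out RMSE of about 15.97 is below the in-sample RMSE of 22.69, and each of the five actual values lies inside the corresponding 95% confidence limits in the forecast table.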

orchestra answered 1 year ago