In: Statistics and Probability
From the following observations of annual EPS for a company, what is the first-order autocorrelation?
Time | EPS |
1 | 3.06 |
2 | 3.27 |
3 | 4.44 |
4 | 4.07 |
5 | 5.26 |
6 | 5.59 |
Note that using the CORREL() spreadsheet function will not produce the correct result. Though for a large sample it'll be really close, for a small sample such as this one the difference can be significant. This is because for the autocorrelation you use the variance of the full sample in the denominator, as well as use the full sample to find the mean for use in covariance. In contrast, the CORREL function will treat the two subsets of the time series as separate data, and calculate separate mean and standard deviation for each.
Answer should be 0.4 Please show who to get it in excel
Firstly I am pasting the excel sheet and then I will be explaining everything below the spreadsheet.
TIME | EPS | ||||||
1 | 3.06 | ||||||
2 | 3.27 | 3.06 | Number of Data points | 6 | |||
3 | 4.44 | 3.27 | |||||
4 | 4.07 | 4.44 | |||||
5 | 5.26 | 4.07 | Lag | Autocorrelation | |||
6 | 5.59 | 5.26 | 1 | 0.402522935 | |||
2 | -0.01928227 | ||||||
3 | -0.099722153 | ||||||
4 | -0.479342766 | ||||||
5 | -0.304175 |
First step is type the data given already then leaving the first row of the EPS start writing the EPS once again in the adjacent column i.e. 3.06 will go in the next row in the adjacent column and similarly for the other values as well.
Now the autocorrelation formula states
Now select the column with the heading eps and on the top Left hand side below clipboard there edit the cell number to DATA. (Trust me this is going to help so just do all these steps of editing the title as well)
Now label a cell as number of Data Points and the formula for it will be =COUNT(DATA). [here you will not have to select the data under EPS just write the term DATA and it will be automatically selected]. Also edit the title of this result as n_data_points just like we did for DATA.
Now label lag and autocorrelation
the formula for autocorrelation is
=SUMPRODUCT(OFFSET(DATA,0,0,n_data_points-G7)-AVERAGE(DATA),OFFSET(DATA,G7,0,n_data_points-G7)-AVERAGE(DATA))/DEVSQ(DATA)
Just compare this to the formula above.
First part of the numerator
OFFSET is used to return a range that is a number of rows and columns from a reference cell or range. [This is the reason we wrote the values in the adjacent column but one row below]
OFFSET(reference,rows,column,[height],[width]). So here the reference is DATA and rows and column will be 0 and 0 respectively.
also the range of the sumproduct varies from 1 to N-k so here we have written n_data_points - G7 where n_data_points is referring to number of data points and G7 in my case was cell for lag 1.
Subtracting the average data (just like in the formula)
Now for the other part of the numerator we will be doing the same process but here the rows of the offset function will be again lag 1 (in my case G7) now this is calculating for the other column we created.
the denominator DEVSQ function for sum of squares of deviations for the DATA.
And you will get the result
just drag this formula for the other lags and you will get your autocorrelations.