© 2000 John Petroff
2) Decomposition (continued)
To separate the trend from the data, an OLS regression, under either a linear or a nonlinear assumption, can be run on the seasonally adjusted data; using the adjusted series avoids distortions due to seasonality. The b coefficient gives the rate of growth (the change in the trend per time period). To simplify the calculation, time periods are coded symmetrically around zero, from about -n/2 to +n/2, where n is the total number of time periods. That is, for 144 monthly observations from January 1985 to December 1996, January 1985 would be coded about -72 and December 1996 about +72 (strictly, -71.5 and +71.5 if the codes are to be equally spaced and sum exactly to zero). This simplifies the formulas for the coefficient estimates (presented in the previous section) because the sum of t (i.e. X in the previous section) is then just zero. Coefficient estimate a* (i.e. the intercept) is simply
a* = sum(Yt) / n = E(Y)
Coefficient estimate b* is
b* = sum(t Yt) / sum(t^2)
as can be verified from the derivations presented in the appendix. The trend value Tt for each month (i.e. the fitted value on the trend line) can just as easily be calculated directly in a spreadsheet by applying the formula
Tt = a* + b* t = E(Y) + t (sum(t Yt) / sum(t^2))
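The shortcut formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample series are hypothetical (not from the text), the trend is assumed linear, and the time codes are spaced one period apart around zero so that they sum exactly to zero.

```python
def centered_time_codes(n):
    """Return n equally spaced time codes (step 1) that sum to zero."""
    return [i - (n - 1) / 2 for i in range(n)]


def fit_linear_trend(y):
    """Estimate a* and b* with the shortcut formulas for centered time codes.

    y is assumed to be the seasonally adjusted series.
    """
    n = len(y)
    t = centered_time_codes(n)
    a = sum(y) / n  # a* = sum(Yt) / n
    b = sum(ti * yi for ti, yi in zip(t, y)) / sum(ti * ti for ti in t)
    trend = [a + b * ti for ti in t]  # Tt = a* + b* t
    return a, b, trend


# Hypothetical seasonally adjusted series with a perfectly linear trend.
y = [100.0 + 2.0 * i for i in range(12)]
a, b, trend = fit_linear_trend(y)
print(a, b)
```

Because the codes sum to zero, the intercept is just the sample mean and the slope needs only two sums, which is why the same calculation is convenient in a spreadsheet.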
After dividing the original data by the seasonal indexes (St) and the trend values (Tt), what is left is any cyclical pattern plus the erratic (i.e. random) variations. The cyclical pattern is estimated with a moving average procedure similar to the one used for seasonal variations. The analyst must choose a cycle length and, unfortunately, there is no theoretical guidance for this choice. The longer the cycle length chosen, the more of the erratic variation is smoothed away.
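As a rough sketch of this step, the following Python code divides the data by the seasonal and trend components and smooths the result with a centered moving average. The exact moving average used for the seasonal step is not restated here, so the centering rule (half weights on the two end points when the window is even) is an assumption, as are the function names and the sample numbers.

```python
def centered_moving_average(values, window):
    """Centered moving average; None where the window does not fit.

    For an even window the two end points get half weight each,
    so the average stays centered on an actual observation.
    """
    n = len(values)
    half = window // 2
    out = [None] * n
    for i in range(half, n - half):
        if window % 2 == 1:
            out[i] = sum(values[i - half : i + half + 1]) / window
        else:
            inner = sum(values[i - half + 1 : i + half])
            ends = 0.5 * (values[i - half] + values[i + half])
            out[i] = (inner + ends) / window
    return out


def cyclical_and_irregular(y, seasonal, trend, window=6):
    """Split Y / (S * T) into a smoothed cyclical index and an erratic remainder."""
    ratio = [yi / (si * ti) for yi, si, ti in zip(y, seasonal, trend)]
    cycle = centered_moving_average(ratio, window)
    irregular = [r / c if c is not None else None for r, c in zip(ratio, cycle)]
    return ratio, cycle, irregular


# Hypothetical data: 12 months with no cycle and no noise.
seasonal = [1.1, 0.9] * 6
trend = [100.0 + i for i in range(12)]
y = [s * t for s, t in zip(seasonal, trend)]
ratio, cycle, irregular = cyclical_and_irregular(y, seasonal, trend, window=6)
```

A longer `window` averages over more observations, which is the sense in which a longer assumed cycle removes more of the erratic variation. The first and last `window // 2` values are left undefined because the window does not fit there.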
After removing the trend from the deseasonalized sales data, the residual that remains is shown in Graph G-5.5 below.
[Graph G-5.5: residual after removing the trend from the deseasonalized sales data]
Graph G-5.5 shows that the unusual spikes in March 97 and February 98 are outliers. An analyst may want to remove such outliers from the original data, replace them with the seasonally adjusted averages for those months, and undertake a new decomposition to further improve forecast quality. But even without removing outliers, forecasts can be acceptable. In our example, the cyclical component is removed from the data under the assumption of a six-month pattern.
The final step is to show the forecasted values, which are obtained by multiplying time by the estimated b* and adding a* to build the trend values, then multiplying by the seasonal indices and the cyclical indices. The calculation of the forecasted values is presented in Appendix 5b. The forecasted values are plotted in Graph G-5.6 below.
[Graph G-5.6: actual and forecasted values, with forecasts for January through December 2000 shown in green]
Graph G-5.6 clearly reveals that the forecasted values within the data sample match the actual values quite well. It is reasonable to consider the forecasted values beyond the sample (shown in green in the graph for January through December 2000) also to be reliable predictions.
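The recomposition described in this final step can be sketched as follows. The function name and the numbers are hypothetical, and the code assumes that the seasonal and cyclical indices simply repeat over their respective lengths, starting at the first time code supplied; Appendix 5b remains the reference for the actual calculation.

```python
def forecast(a, b, t_codes, seasonal_idx, cyclical_idx):
    """Forecast = (a* + b* t) x seasonal index x cyclical index.

    seasonal_idx and cyclical_idx are short lists assumed to repeat
    (e.g. 12 monthly seasonal indices, 6 cyclical indices), with their
    first entries aligned to the first time code in t_codes.
    """
    out = []
    for k, t in enumerate(t_codes):
        trend_value = a + b * t  # trend value for period t
        s = seasonal_idx[k % len(seasonal_idx)]
        c = cyclical_idx[k % len(cyclical_idx)]
        out.append(trend_value * s * c)
    return out


# Hypothetical coefficients and indices, for illustration only.
values = forecast(100.0, 2.0, [0, 1, 2], [1.0, 0.5], [1.0])
print(values)
```

For periods beyond the sample, the time codes simply continue past the last in-sample code, so the trend line is extrapolated before the indices are applied.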
The traditional decomposition method gives very good estimated values within the sample period because all estimates are determined with the sample itself. Beyond the sample period, projections are still good for a month or two. The accuracy is indeed so good that the US government uses this procedure (called Census II) to make forecasts of income and consumption. Beyond one year, it would be dangerous to rely on forecasts generated solely by this procedure.
See review question Q-5F5.1.