Example 4: Seasonal and Non-seasonal Exponential Smoothing
Example 2 discusses the
analysis of a data set from the classic book on ARIMA by Box and Jenkins
(1976). The data are monthly passenger totals (measured in thousands)
in international air travel, for twelve consecutive years: 1949-1960 (see
Box and Jenkins, 1976, page 531, "Series G"). The Series_G.sta
data file is partially listed below. Open this data file via the File - Open Examples
menu; it is in the Datasets folder.

The ARIMA analysis required a good deal of preparatory work during the
identification stage. In fact, it usually requires a lot of experience
and familiarity not only with ARIMA but also with the nature of the data,
in order to identify satisfactory models. Often, the purpose of ARIMA
is mostly to derive forecasts, and the interpretation of the nature of
the model (i.e., the number and types of parameters) is only of secondary
interest. In those cases, exponential smoothing provides a much easier
alternative, one that usually produces forecasts of equal or better quality
(see the Introductory
Overview for a discussion of this point).
In this example, exponential smoothing will be performed on the same
series used in Example 2 and
the forecasts derived by the two methods will be compared.
Choosing a Model.
Even though exponential smoothing is, in a way, a simpler method than
ARIMA, some choices still have to be made. Select Time
Series/Forecasting from the Statistics - Advanced Linear/Nonlinear
Models menu to display the Time
Series Analysis Startup Panel. Then, click the Variables
button to display the standard variable
selection dialog. Here, select the variable Series_G
(if Series_G.sta is the currently open data file, Series_G
is selected automatically when the Time Series Analysis dialog
opens, because it is the only variable in that file). Click the OK
button on the variable selection dialog to return to the Startup Panel.
Now, click the Exponential smoothing
& forecasting button to proceed to the Seasonal
and Non-Seasonal Exponential Smoothing dialog.

As described in the Introductory
Overview, there are different exponential smoothing models available.
In general, in all models the smoothed or forecasted values are computed
as a weighted average of the preceding values. The difference between
the models listed in the Model
box is whether a trend component, a seasonal component, or both are smoothed with
additional smoothing parameters. Examine the differences between models by
looking at the results of smoothing with the different techniques and
parameters. However, first plot the series. Because Series_G.sta
contains dates in the case names, use them to label the horizontal (x) axis
in line plots. Click on the Review series tab and select the
Case names option button. Then
select the Scale X axis in plots manually
check box and specify Min = 1
and Step = 12 (as there are 12
months in a year). Now, click the Plot
button next to the Review highlighted
variable button.

The data in this series are easily matched up with the general "model
shapes" shown on the icons in the Model
box of the Seasonal and Non-Seasonal Exponential Smoothing
dialog - Advanced tab.
First, there is clearly a trend, which is more or less linear. Second, there
is seasonal fluctuation; that is, every year the number of airline passengers
follows an almost identical pattern (e.g., most travel occurs during the
summer vacation months). This seasonality is multiplicative rather than
additive in nature: The higher the overall level of the series the greater
is the seasonal fluctuation. Put another way, the increase in airline
passenger loads during the summer months each year can best be expressed
by a factor; for example, each summer the passenger load increases by
a factor of 1.1, or 10%. Thus, the Winters
model (Linear trend, Multiplicative)
is probably the best exponential smoothing model to use for this series.
However, first look at some other models.
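The difference between multiplicative and additive seasonality can be illustrated with a few lines of arithmetic. This is only a sketch: the baseline levels and the additive offset are hypothetical values chosen for illustration; only the factor of 1.1 comes from the text above.

```python
# Hypothetical baseline (deseasonalized) levels at three points in time.
levels = [100, 200, 400]
factor = 1.1   # multiplicative seasonal factor used as the example in the text
offset = 10    # hypothetical additive seasonal effect, for comparison

multiplicative_peaks = [level * factor for level in levels]
additive_peaks = [level + offset for level in levels]

# Size of the seasonal swing (peak minus baseline) under each assumption.
mult_swings = [peak - level for peak, level in zip(multiplicative_peaks, levels)]
add_swings = [peak - level for peak, level in zip(additive_peaks, levels)]

print("multiplicative swings:", mult_swings)  # grows with the level
print("additive swings:      ", add_swings)   # stays constant
```

Under the multiplicative assumption the swing grows in proportion to the level, which is the pattern visible in the plot of Series_G.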
Simple Exponential Smoothing. The Forecast
box shows 10 cases
by default; change this to 12
to forecast one full year. Then, accept all other defaults and click
the Summary: Exponential smoothing
button. Shown below is the plot of the original and smoothed series, and
the residuals.

Two things are immediately apparent. First, the smoothed series traces
the general linear trend but fails to follow the seasonal cycles. Second,
all forecasts are the same. In fact, if you look back over the description
of simple exponential smoothing in the Introductory
Overview, this could be expected: Each smoothed value or forecast
S_t is computed
as S_t = S_{t-1} + α * e, where
e is the error, that is, the observed minus the
predicted (smoothed or forecasted) value. When there are no observed values
available (e.g., when computing forecasts), then e
is assumed to be 0 (zero). Thus, all forecasts are the same from then
on.
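The recursion above can be sketched in a few lines of Python. This is a minimal illustration, not the STATISTICA implementation: the function name, the choice to initialize the first smoothed value with the first observation, and the toy data are all assumptions.

```python
def simple_exp_smooth(series, alpha, horizon=0):
    """Return (smoothed, forecasts) for simple exponential smoothing.

    Implements S_t = S_{t-1} + alpha * e, where e is the observed minus
    the predicted value. The initial smoothed value is set to the first
    observation (an assumption; other initializations are possible).
    """
    level = series[0]
    smoothed = []
    for x in series:
        smoothed.append(level)          # prediction made before seeing x
        error = x - level               # e = observed minus predicted
        level = level + alpha * error   # update the smoothed value
    # Beyond the data there are no observations, so e is taken as 0
    # and the smoothed value never changes again.
    forecasts = [level] * horizon
    return smoothed, forecasts


toy = [112, 118, 132, 129, 121, 135]    # illustrative values only
smoothed, forecasts = simple_exp_smooth(toy, alpha=0.1, horizon=12)
print(len(set(forecasts)))              # number of distinct forecast values
```

Because the error term is zero for every forecast step, all 12 forecasts are identical, which is the flat line seen in the plot.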
The α
(alpha) parameter. Now look at the effect
of the smoothing parameter α (alpha).
Looking at the formula above, it is clear that as α approaches 0
(zero), all smoothed values will become very similar; when α approaches
1, then the smoothed values should very closely follow the actual observed
data. Set the Alpha parameter
to .900 on the
Advanced tab and select the original
Series_G variable in the active
work area.

Click the Summary: Exponential smoothing
button again.

Now the smoothed values follow the observed values very closely; it
almost looks like the original series is simply moved by one observation
to the right. Indeed, if you were to set the α (Alpha)
parameter to 1, then each smoothed
value would be equal to the previous observation. Therefore, in a sense,
the α parameter can be considered a stiffness parameter. The smaller
the α the "stiffer" the smoothed line; that is, the smoothed
line will not be affected as much by the random observation-to-observation
variability. The larger the α, the more flexible the smoothed line;
that is, the more closely it will follow the fluctuations in the observed
values. This is generally true for all exponential smoothing models, and
this principle applies equally to the seasonality and trend smoothing
parameters of the more complex models (see below).
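The two extremes of α can be checked directly. The sketch below reuses the same recursion under the same assumptions as before (first smoothed value equal to the first observation); the helper name and the toy values are illustrative.

```python
def smooth(series, alpha):
    """One-step-ahead smoothed values, S_t = S_{t-1} + alpha * e."""
    level = series[0]
    out = []
    for x in series:
        out.append(level)
        level += alpha * (x - level)
    return out


toy = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]

stiff = smooth(toy, 0.0)     # alpha = 0: the smoothed line never moves
shifted = smooth(toy, 1.0)   # alpha = 1: each value is the previous observation

print(stiff[:4])
print(shifted[:4])
```

With α = 0 the smoothed series stays at its starting value; with α = 1 it is the original series moved one observation to the right.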
Exponential Smoothing
with Linear Trend. Now select Holt
Linear trend smoothing
(without seasonality). In this model, a trend component is independently
smoothed with parameter γ (Gamma).
If γ is set to 0, then a constant slope will be included in the computation
of smoothed values and forecasts. If γ is set to 1, then the slope
is recomputed at each observation from the respective immediately preceding
smoothed value; thus, the slope is allowed to change as much as necessary
from observation to observation, in order to approximate the observed
values. Shown below is the summary plot for two smoothing trials, the
first with α = 0.1 and γ
= 0.1, the second with α
= 0.9 and γ = 0.9.
(Be sure to select the original Series_G
variable in the active work area before you produce the summary plots.)


As expected, the smoothed values follow the observed values more closely
in the second graph shown above. However, looking at the forecasts it
is evident that in this model (without seasonality), the predicted values
simply consist of a straight line. Thus, using the Holt
two-parameter (Linear trend)
model, you would "miss" the significant seasonal increase of
airline passengers during the summer months.
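A common textbook formulation of Holt's two-parameter method makes it clear why the forecasts fall on a straight line. The sketch below is an assumption about the general form, not a statement of how STATISTICA computes it; the initialization (level from the first observation, slope from the first difference) and the toy data are illustrative choices.

```python
def holt_linear(series, alpha, gamma, horizon):
    """Holt's linear trend smoothing (textbook form).

    alpha smooths the level, gamma smooths the slope.
    Initialization is an assumption: level = first observation,
    slope = difference between the first two observations.
    """
    level = series[0]
    trend = series[1] - series[0]
    fitted = [series[0]]
    for x in series[1:]:
        predicted = level + trend                   # one-step-ahead prediction
        fitted.append(predicted)
        new_level = alpha * x + (1 - alpha) * predicted
        # gamma = 0 keeps the slope constant; gamma = 1 recomputes it
        # entirely from the two most recent smoothed levels.
        trend = gamma * (new_level - level) + (1 - gamma) * trend
        level = new_level
    # Forecasts extend the last level along the last slope: a straight line.
    forecasts = [level + h * trend for h in range(1, horizon + 1)]
    return fitted, forecasts


toy = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]
fitted, forecasts = holt_linear(toy, alpha=0.1, gamma=0.1, horizon=12)
steps = [b - a for a, b in zip(forecasts, forecasts[1:])]
print(max(steps) - min(steps))   # spread of the forecast increments
```

The forecast increments are all equal (up to rounding), so no seasonal pattern can appear in the forecasts of this model.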
Now look at the model that seems most appropriate here, that is, the
Winters three-parameter model
with Linear trend and Multiplicative
seasonality.
Triple Exponential
Smoothing: Winters' Method. In this method, a third parameter δ
(Delta) is added to the model
to smooth the multiplicative seasonal component. Again, if δ is 0
(zero), then a constant stable seasonal component is included in the computation
of the smoothed values and forecasts; if δ is set to 1, then the
seasonal component is recomputed from observation to observation. Shown
below are the summary plots for α = 0.1,
δ = 0.1, and γ = 0.1, and for α = 0.9,
δ = 0.9, and γ = 0.9.


In this case, there is hardly any difference between the two summary
plots. The reason for this is that the series indeed consists of a stable
linear trend, strong stable seasonal fluctuation, and only a little random
fluctuation. Therefore, even though by setting δ and γ to 0.9, you "allow" the seasonal
and trend components to be modified substantially from observation to
observation, no such modification is required. In fact, the automatic
search for the best parameters discussed below will arrive at the same
conclusion.
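The three-parameter method can be sketched as follows, using a common textbook form of the multiplicative Winters recursions with α for the level, δ for the seasonal factors, and γ for the slope. Everything about the initialization (level and seasonal factors from the first season, slope from the difference between the first two season means) is an assumption and may differ from what STATISTICA does. The data are the first 24 monthly values of Series G as commonly published, typed in by hand; verify them against Series_G.sta before relying on them.

```python
def winters_multiplicative(series, period, alpha, delta, gamma, horizon):
    """Winters' method with linear trend and multiplicative seasonality.

    Requires at least two full seasons of data. Initialization is a
    simple assumption: level = mean of the first season, slope = change
    in season means divided by the period, seasonal factors = first
    season divided by its mean.
    """
    first = series[:period]
    second = series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) / period - level) / period
    seasonal = [x / level for x in first]
    fitted = []
    for t, x in enumerate(series):
        s = seasonal[t % period]
        fitted.append((level + trend) * s)            # one-step-ahead prediction
        new_level = alpha * x / s + (1 - alpha) * (level + trend)
        trend = gamma * (new_level - level) + (1 - gamma) * trend
        # delta = 0 keeps the seasonal factor fixed; delta = 1 recomputes it
        # from the current observation alone.
        seasonal[t % period] = delta * x / new_level + (1 - delta) * s
        level = new_level
    n = len(series)
    forecasts = [(level + h * trend) * seasonal[(n + h - 1) % period]
                 for h in range(1, horizon + 1)]
    return fitted, forecasts


series_g_head = [
    112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,  # 1949
    115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140,  # 1950
]
fitted, forecasts = winters_multiplicative(
    series_g_head, period=12, alpha=0.1, delta=0.1, gamma=0.1, horizon=12)
print([round(f, 1) for f in forecasts])
```

Unlike the Holt forecasts, these forecasts carry the seasonal factors forward, so the summer months are predicted to be higher than the late autumn months.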
Parameter Grid Search.
As discussed in the Introductory
Overview, in practice, when you want to compute forecasts, you are
best advised to estimate optimum smoothing parameters from the data (e.g.,
see Gardner, 1985). This can be done in two ways. One common method is
to perform a grid search of the parameter space. Thus, click on the Grid search tab. STATISTICA
will increment each parameter from the minimum (Start
parameter at) by the value specified in the Increment
by column, up to the value specified in the Stop
at column.
For each combination of parameter values, STATISTICA
will compute the Sums of Squares (SS) for the residuals
(observed values minus smoothed values). By default, the Display
parameters for 10 smallest mean squares check box is selected,
so the "best" 10 solutions (that is, the combinations of parameters
that yield the smallest residual variability) will be displayed in a spreadsheet.
Accept all defaults and examine that spreadsheet (see below) by clicking
the Perform grid search button.
(Be sure to select the original Series_G
variable in the active work area before you produce the spreadsheet.)
As suspected, the best-fitting models are those with parameter values
for δ and γ near 0 (zero), that is, models with constant stable
linear trend and seasonality.
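The idea of a grid search can be sketched independently of the dialog. To keep the example short, the sketch scores the two-parameter Holt model rather than the three-parameter Winters model; the start, increment, and stop values of 0.1, 0.1, and 0.9 are assumptions rather than documented defaults, and the function names and toy data are illustrative.

```python
from itertools import product


def holt_sse(series, alpha, gamma):
    """Sum of squared one-step-ahead residuals for Holt's linear trend model."""
    level, trend = series[0], series[1] - series[0]
    sse = 0.0
    for x in series[1:]:
        predicted = level + trend
        sse += (x - predicted) ** 2
        new_level = alpha * x + (1 - alpha) * predicted
        trend = gamma * (new_level - level) + (1 - gamma) * trend
        level = new_level
    return sse


def grid_search(series, start=0.1, step=0.1, stop=0.9, keep=10):
    """Evaluate every (alpha, gamma) pair on the grid; return the best `keep`."""
    n_steps = round((stop - start) / step)
    values = [round(start + i * step, 10) for i in range(n_steps + 1)]
    results = [(holt_sse(series, a, g), a, g)
               for a, g in product(values, repeat=2)]
    return sorted(results)[:keep]      # smallest residual SS first


toy = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]
best = grid_search(toy)
for sse, alpha, gamma in best:
    print(f"alpha={alpha:.1f} gamma={gamma:.1f} SS={sse:.1f}")
```

Adding a third parameter only means passing `repeat=3` to `product` and scoring a three-parameter model; the cost grows with the cube of the number of grid values.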

Note that in addition to the Sums
of Squares and Mean Squares,
several other indices of goodness of fit are listed in this spreadsheet.
All of these are discussed in the Introductory
Overview; of particular interest is often the Mean
absolute percentage error (MAPE). This value expresses the average
(absolute) difference between the observed and smoothed (predicted) values
relative to the observed values. For example, for the first model with
α = 0.8, δ = 0.1, and γ = 0.1,
the MAPE value is 2.97.
This means that on average the smoothed (predicted) values computed by
this model only deviated 2.97% from the actual observed values.
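The definition of MAPE given above translates directly into code. This is a minimal sketch with made-up numbers; it assumes no observed value is zero.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, expressed in percent.

    Averages |observed - predicted| / |observed| over all cases.
    Assumes that no observed value is zero.
    """
    ratios = [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Made-up values: every prediction is off by 3% of the observed value.
observed = [100, 200, 400]
predicted = [97, 206, 388]
print(round(mape(observed, predicted), 2))
```

Because each error is scaled by its own observed value, MAPE is comparable across series measured in different units.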
Automatic Parameter
Search. The second way to determine the optimum parameters for
smoothing is to minimize the Sums of Squares of the residuals (or some
similar index of goodness of fit) numerically, rather than by enumerating a grid. This can be done in the Time Series
module by using a general nonlinear function minimization algorithm (the
so-called quasi-Newton method; the same method used to estimate the ARIMA
parameters). Now, click on the Automatic search tab. This tab
contains several technical parameters that pertain to the quasi-Newton
method; those parameters are described in detail in the Automatic search tab help. The
Lack of fit indicator group shows
the different quantities that can be minimized.

As Parameter start values,
specify 0.1 for all three parameters
(it is always a good idea to start the minimization procedure with small
parameter values). By default, the Unconstrained
parameter estimation check box is selected. This means that in
the course of the function minimization, you may see parameter values
that become smaller than 0 (which is not permissible for the smoothing
parameters). However, before the final results are reported, the parameters
will automatically be set to the closest valid value, so you do not have
to change this setting (again, refer to the description of this option
in the Automatic search tab help, for
a more detailed discussion). Now click the Automatic
estimation button and after the
iterative parameter search procedure finishes, look at the summary graph
for the best parameter values.
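The general idea of an iterative search that starts from small parameter values can be sketched without any optimization library. Note the substitution: the sketch below uses a simple derivative-free step-halving (pattern) search, not the quasi-Newton method named above, and it clips every trial value to the valid range [0, 1] as it goes rather than only before reporting results. The Holt model, the function names, and the toy data are illustrative assumptions.

```python
def holt_sse(series, alpha, gamma):
    """Sum of squared one-step-ahead residuals for Holt's linear trend model."""
    level, trend = series[0], series[1] - series[0]
    sse = 0.0
    for x in series[1:]:
        predicted = level + trend
        sse += (x - predicted) ** 2
        new_level = alpha * x + (1 - alpha) * predicted
        trend = gamma * (new_level - level) + (1 - gamma) * trend
        level = new_level
    return sse


def pattern_search(objective, start, step=0.1, tol=1e-3):
    """Minimize `objective` by trying +/- step on each parameter in turn.

    A trial is accepted only if it lowers the objective; when no trial
    helps, the step is halved. Trial values are clipped to [0, 1].
    """
    best = list(start)
    best_val = objective(best)
    while step > tol:
        improved = False
        for i in range(len(best)):
            for move in (step, -step):
                trial = list(best)
                trial[i] = min(1.0, max(0.0, trial[i] + move))
                val = objective(trial)
                if val < best_val:
                    best, best_val, improved = trial, val, True
        if not improved:
            step /= 2
    return best, best_val


toy = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]
start = [0.1, 0.1]   # small start values, as recommended in the text
best, best_val = pattern_search(lambda p: holt_sse(toy, p[0], p[1]), start)
print([round(p, 3) for p in best], round(best_val, 1))
```

A search of this kind can only move downhill from its starting point, so the result is never worse than the start values, but it may stop at a local minimum; comparing it against a coarse grid search is a reasonable sanity check.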

Once again, as in the grid search, the best model is one that contains
a stable constant linear trend and seasonality. The remaining random variability
is best smoothed with a very "flexible" (large) α parameter
value (0.72) that allows the
smoothed values to follow the observed values closely.
Final Results.
Now return to the original goal, namely to compare the forecasts produced
by exponential smoothing with those from ARIMA. Lock the exponentially
smoothed values and then compute the ARIMA analysis as described in Example 2. After you complete
the ARIMA analysis, go to the Transformations
of Variables - Review & plot tab
and plot the exponentially smoothed series with the ARIMA forecasts by
clicking the Plot two var lists with
different scales button.

If you select the Display/plot subset
of cases only check box on the Review
& plot tab, you can specify
cases from 1 through 12 so that only the
forecasts are plotted.

Note that the default x-axis labels in this plot have been changed to
reflect the future (forecast) dates. Even though the two lines diverge
slightly, looking back at the full plot of all cases, that divergence
is relatively minor.
See also, Time Series Analysis Index.