GLZ Quick Specs - Advanced Tab

Select the Advanced tab of the GLZ Quick Specs dialog box to access the options described here. For alternative ways of specifying designs in GLZ, see Methods for Specifying Designs.

Note: During the forward step of stepwise model building, if two or more effects have p-values that are so small as to be virtually indistinguishable from 0, STATISTICA will select the effect with the largest score statistic if the degrees of freedom for all effects in question are equal. If the effects differ with respect to the degrees of freedom, the score statistics are normalized using the Wilson-Hilferty transformation, and the effect with the largest transformed value is entered into the model. For the backward step, if the p-values for two or more effects are virtually indistinguishable from 1, STATISTICA will remove the effect with the smallest Wald statistic in the case of equal degrees of freedom and the smallest normalized value in the case of unequal degrees of freedom.
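The tie-breaking rule in the note above can be sketched in Python. This is only an illustration under stated assumptions: it uses the standard Wilson-Hilferty cube-root approximation, which maps a chi-square statistic with a given number of degrees of freedom to an approximate standard-normal deviate. The function names and the example statistics are hypothetical, and the exact computation inside STATISTICA is not documented in this topic.

```python
import math

def wilson_hilferty(stat, df):
    """Approximate standard-normal deviate for a chi-square statistic
    (standard Wilson-Hilferty cube-root approximation)."""
    if df <= 0:
        raise ValueError("df must be positive")
    c = 2.0 / (9.0 * df)
    return ((stat / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)

def pick_effect_to_enter(candidates):
    """candidates maps an effect name to (score_statistic, df).

    Hypothetical helper: with equal df, compare raw score statistics;
    with unequal df, compare the normalized values instead.
    """
    dfs = {df for _, df in candidates.values()}
    if len(dfs) == 1:
        return max(candidates, key=lambda name: candidates[name][0])
    return max(candidates, key=lambda name: wilson_hilferty(*candidates[name]))

# Illustrative values only: B has the larger raw statistic but many more
# degrees of freedom, so A wins after normalization.
print(pick_effect_to_enter({"A": (100.0, 2), "B": (105.0, 20)}))
```

For the backward step, the same normalization would be applied to the Wald statistics, with the smallest value selected for removal.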

Parametrization.

Sigma-restricted. Select the Sigma-restricted option button to compute the design matrix for categorical predictors in the model based on sigma-restricted coding. The sigma-restricted model is the default parameterization, except for models that involve nested effects; see the GLM Introductory Overview - Sigma-Restricted and Overparameterized Model for additional details.

Overparameterized. Select this option button to use the overparameterized model.

Ref. Select the Ref. option button to compute the design matrix for categorical predictors in the model based on reference coding. The parameter estimate for each level of a reference-coded categorical predictor estimates the difference in effect between that level and the reference level. For example, a categorical predictor with three levels will be coded as:

             Design matrix columns
Level            1            2
  A              1            0
  B              0            1
  C              0            0

Here, C is the reference level.
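The same coding can be reproduced with a short Python sketch. The function name reference_code is hypothetical and is not part of STATISTICA; it simply builds one 0/1 column for each non-reference level.

```python
def reference_code(levels, reference):
    """Return {level: row}, with one 0/1 column per non-reference level."""
    if reference not in levels:
        raise ValueError("reference must be one of the levels")
    columns = [lv for lv in levels if lv != reference]
    return {lv: [1 if lv == col else 0 for col in columns] for lv in levels}

coding = reference_code(["A", "B", "C"], reference="C")
for level, row in coding.items():
    print(level, row)
```

The reference level is the only row that consists entirely of zeros, so each parameter estimate is read as a contrast against that level.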

Set reference level. Click this button to display a dialog box where you can specify the reference category of each categorical factor.

Estimation. The options in the Estimation group box control various technical details of the iterative estimation procedure.

No intercept. Select the No intercept check box to exclude the intercept from the model. This option is not available, and no intercept is the default, for mixture models (see also Experimental Design for a discussion of designs for mixtures).

User-def. start values. Select the User-def. start values check box and then click OK in the GLZ Quick Specs dialog box to display the Select Start Values dialog box, in which you can specify start values for the iterative estimation procedure, for each parameter.

Sweep delta. Enter the negative exponent for a base-10 constant delta (delta = 10^(-sdelta)) in the Sweep delta field; the default value is 7. Delta is used in sweeping to detect redundant columns in the design matrix.

Max. iterations. Enter the maximum number of iterations for the iterative estimation procedure in the Max. iterations field.

Converge. Enter the convergence criterion value in the Converge field. This value is used to determine whether the iterative estimation procedure has converged. Specifically, the integer entered into this field is used as the (negative) exponent of a base-10 constant (e.g., if the default value 7 is used, the constant evaluates to 10^(-7), i.e., 1E-7). This constant is then compared to the absolute value of the difference in the log-likelihood function between two successive iterations.
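As a minimal sketch of this check, assuming that convergence is declared when the absolute change in the log-likelihood falls below the constant (the topic does not say whether the comparison is strict), the test might look as follows in Python. The function name has_converged is hypothetical.

```python
def has_converged(loglik_previous, loglik_current, converge=7):
    """Compare the absolute change in log-likelihood with 10**(-converge)."""
    tolerance = 10.0 ** (-converge)  # e.g., converge = 7 gives 1e-07
    return abs(loglik_current - loglik_previous) < tolerance

print(has_converged(-100.0, -99.9))          # change of about 0.1
print(has_converged(-100.0, -100.0 + 1e-9))  # change of about 1e-09
```

A larger value in the Converge field therefore means a stricter criterion and, typically, more iterations before the procedure stops.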

Offset. Click the Offset button to display the standard variable selection dialog box, which contains options to specify a variable that contains offset values (constants) that are to be applied to the response variable. The offset variable can be considered a predictor variable with a known coefficient (parameter) value of 1 (see McCullagh and Nelder, 1989). Offsets are commonly used in log-rate problems, where the model requires the log of the offset variable. In this case, you need to compute the log of the offset variable explicitly, because STATISTICA does not take the log of the offset automatically.
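The role of a logged offset in a log-rate problem can be illustrated with a small Python sketch. It assumes a log link and a hypothetical exposure variable (for example, person-years); the function name and the numbers are illustrative only.

```python
import math

def expected_count(linear_predictor, exposure):
    """Expected count under a log link with offset = log(exposure).

    The offset enters the linear predictor with a fixed coefficient of 1,
    so exp(eta + log(t)) equals t * exp(eta): a rate multiplied by exposure.
    """
    offset = math.log(exposure)  # the log must be computed explicitly
    return math.exp(linear_predictor + offset)

# Hypothetical rate of 0.02 events per unit of exposure, 150 units observed.
print(expected_count(math.log(0.02), 150.0))
```

In practice, this means creating a new variable that holds the log of the exposure values and selecting that variable as the offset.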

Model building. Use the options in the Model building group box to specify the model building method that you want to use. If one of the forward/backward model building methods is used, STATISTICA will perform the predictor selection process and proceed to perform the analysis on the model identified as the best model by the model building process. Note: For Beta regression models, Forward stepwise, Backward stepwise, Forward entry, Backward removal, and Best subsets with Likelihood score will not be available for model building.

All effects. Select the All effects option button to enter all effects specified in the current design (see Between Effects) into the regression equation.

Forward stepwise, Backward stepwise, Forward entry, Backward removal. Select these option buttons to perform stepwise selection of predictor variables and effects. Forward selection moves variables into the model; backward selection starts with all predictor variables and effects in the model and then removes them. The Forward entry and Backward removal options only allow variables or effects to be entered or removed, respectively. With the Forward stepwise and Backward stepwise options, STATISTICA considers at each step both the addition and the removal of a variable or effect, based on the current settings of the p1, enter and p2, remove fields. See the p1, enter, p2, remove, and Max. steps options below for additional details.

For example, if Forward stepwise is selected, STATISTICA will at each step consider both a step "forward", i.e., entry of another variable or effect into the model (based on p1, enter), and a step "backward", i.e., removal of a previously entered variable or effect from the model (based on p2, remove). The Forward stepwise method usually adds rather than removes variables or effects (i.e., it is a forward selection method) because p1, enter must be set to a smaller value than p2, remove, which guarantees that significant predictor variables or effects are entered into the model and not removed. Most of the widely used algorithms for stepwise selection use the Forward stepwise and Backward stepwise methods.

Best subsets. Select the Best subsets option button to perform a search of all possible subsets of effects specified in the current design (see Between Effects). When this option button is selected, various additional options will become available for steering the search for the best subset; see the Max. subsets, Likelihood score, Likelihood, and Akaike IC options below for details. As discussed in the Introductory Overview, the total number of all possible subsets (that need to be reviewed by STATISTICA) can become excessively large when there are many effects in the model and many large subset sizes are being considered.
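To give a sense of how quickly the search space grows, the following Python sketch counts the non-empty subsets of a set of effects, optionally limited to a maximum subset size. The function name n_subsets is hypothetical, and whether STATISTICA counts subsets in exactly this way is an assumption.

```python
from math import comb

def n_subsets(n_effects, max_size=None):
    """Number of non-empty subsets of n_effects effects, up to max_size."""
    if max_size is None:
        max_size = n_effects
    return sum(comb(n_effects, k) for k in range(1, max_size + 1))

for n in (5, 10, 20):
    print(n, n_subsets(n))
```

With all subset sizes allowed, the count is 2^n - 1, so each additional effect roughly doubles the number of models to be reviewed.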

Note: p1, enter / p2, remove / Max. steps options. These options are only available when either Forward stepwise (effects can be entered or removed), Backward stepwise (effects can be removed or entered), Forward entry (effects can only be entered, and never be removed), or Backward removal (effects can only be removed, and once removed, never be re-entered into the model) is selected in the Model building group box. These options enable you to steer the stepwise selection procedure; for a description of stepwise model building procedures, refer to the Introductory Overview.

p1, enter / p2, remove. The stepwise entry or removal of effects into or out of the model is guided by the significance levels (p-values) specified in the p1, enter and p2, remove text boxes. In Forward stepwise selection, the score statistic is used to select new (significant) effects, while the Wald statistic is used during backward steps (i.e., when effects are selected for removal from the model). Specifically, an effect will be entered into the model if the statistical significance of its contribution to the prediction is better than (i.e., p less than) p1, enter; an effect will be removed from the model if the statistical significance of its contribution is worse than (i.e., p greater than) p2, remove. Thus, in Forward stepwise and Backward stepwise selection, where at each step effects can be entered into or removed from the model, p1, enter must be less than p2, remove, so that effects that are entered are not automatically removed in the next step, or vice versa. The p2, remove value is ignored when Forward entry is selected; the p1, enter value is ignored when Backward removal is selected.
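The entry and removal rules can be written down as a short Python sketch. The function names and the threshold values 0.05 and 0.10 are hypothetical and chosen only for illustration; in STATISTICA the p-value for entry comes from the score statistic and the p-value for removal from the Wald statistic, which this sketch does not compute.

```python
def validate_thresholds(p1_enter, p2_remove):
    """Stepwise methods require p1, enter to be less than p2, remove."""
    if not p1_enter < p2_remove:
        raise ValueError("p1, enter must be less than p2, remove")

def should_enter(p_value, p1_enter):
    """Enter an effect when its p-value is less than p1, enter."""
    return p_value < p1_enter

def should_remove(p_value, p2_remove):
    """Remove an effect when its p-value is greater than p2, remove."""
    return p_value > p2_remove

p1_enter, p2_remove = 0.05, 0.10  # illustrative values only
validate_thresholds(p1_enter, p2_remove)
print(should_enter(0.03, p1_enter), should_remove(0.03, p2_remove))
print(should_enter(0.20, p1_enter), should_remove(0.20, p2_remove))
```

Because p1, enter is less than p2, remove, no single p-value can satisfy both conditions, which is what keeps an effect from being entered and then immediately removed.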

Max. steps. In this text box, enter the maximum number of steps to be performed in the stepwise selection of effects.

Note: Max. subsets / Likelihood score / Likelihood / Akaike IC options. These options are only available when Best subsets regression is selected in the Model building group box. Note that, as discussed in the Introductory Overview, the total number of all possible subsets (that need to be reviewed by STATISTICA) can become excessively large when there are many effects in the model.

Max. subsets. In this text box, enter the value to determine the number of subsets that will be displayed in the GLZ Results dialog box. For example, if you specify 10 in this field, then in the Model building spreadsheet (available via the Model building button on the GLZ Results - Summary tab) you can later review the 10 best subsets according to the chosen criterion (see the Likelihood score, Likelihood, and Akaike IC options below).

Likelihood score / Likelihood / Akaike IC. The best subset search method can be based on three different test statistics. Select the Likelihood score option button to use the score statistic. Select the Likelihood option button to use the overall model likelihood. Finally, select the Akaike IC option button to use the Akaike information criterion (AIC). Since the evaluation of the score statistic does not require iterative computations, best subset selection based on the score statistic is computationally faster, while the selection based on the other two statistics usually provides more accurate results.
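As a rough illustration of ranking subsets by AIC, the sketch below uses the textbook definition AIC = 2k - 2 ln L, where k is the number of estimated parameters and ln L is the maximized log-likelihood. The function names, the fitted values, and the assumption that STATISTICA reports AIC in exactly this form are all hypothetical.

```python
def aic(log_likelihood, n_params):
    """Textbook AIC: 2k - 2 ln L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood

def best_subsets(fits, max_subsets):
    """fits: list of (effects, log_likelihood, n_params) tuples.

    Returns the max_subsets entries with the smallest AIC.
    """
    ranked = sorted(fits, key=lambda fit: aic(fit[1], fit[2]))
    return ranked[:max_subsets]

# Hypothetical fitted subsets, for illustration only.
fits = [
    (("A",), -120.0, 2),
    (("A", "B"), -110.0, 3),
    (("A", "B", "C"), -109.8, 4),
]
for effects, loglik, k in best_subsets(fits, max_subsets=2):
    print(effects, aic(loglik, k))
```

Here the three-effect subset has the highest likelihood, but the penalty for its extra parameter places it behind the two-effect subset.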

Cross-validation. Click the Cross-validation button to display the Cross-Validation dialog box for specifying a categorical variable and a (code) value to identify observations that should be included in the computations for fitting the model (the analysis sample); all other observations with valid data for all predictor variables and the dependent (response) variable will automatically be classified as belonging to the validation sample (see the Results dialog box topic for a description of the available residual statistics for observations in the validation sample).

See also, GLZ - Index.