General best-subset and stepwise discriminant analysis; builds a linear
discriminant function model for continuous and categorical predictor variables,
using ANCOVA-like designs. The parameters in Statistica allow
full access to the GDA syntax for specifying ANCOVA-like models, and for
controlling the parameters for stepwise and best-subset selection of predictor
effects (for categorical and continuous predictor variables). Note that
the algorithm for stepwise and best subset selection of categorical factor
effects ensures that complete (possibly multiple-degrees-of-freedom) effects
are moved into and out of the model.

The General Discriminant Analysis module provides functionality that makes
this technique a general tool for classification and data mining. However,
most - if not all - textbook treatments of discriminant function analysis
are limited to simple and stepwise analyses with single degree of freedom
continuous predictors. No 'experience' (in the literature) exists regarding
issues of robustness and effectiveness of these techniques when they
are generalized in the manner provided in this very powerful module. The
use of best-subset methods, in particular when used in conjunction with
categorical predictors, should be considered a heuristic search method,
rather than a statistical analysis technique.

**General**

**Detail of computed results reported**. Specifies the level of detail of the computed results;
if Minimal level of detail is requested, the output contains Chi-square
tests of roots, discriminant (canonical) function coefficients, factor
structure coefficients, and classification function coefficients. If All
results is requested, Statistica will also report various
descriptive statistics and classification summary statistics. Classification
statistics for each case can be requested separately as an option.

**Analysis syntax**. Analysis syntax string for General Discriminant
Function Analysis (GDA) models; you can specify here the complete syntax,
as, for example, copied from a Statistica analysis. Set this
string to empty, or just GDA; to create the syntax from the specific options
selected below.

**Design**. Required; specifies the between-group (ANCOVA-like) design
(categorical and continuous predictors); default is NONE.

Use the syntax:

DESIGN = Design specifications

Example 1.

DESIGN = GROUP | GENDER | TIME | PAID; {makes a full factorial design}

Example 2.

DESIGN = SEQUENCE +

Example 3.

DESIGN = MULLET |

Example 4.

DESIGN =

Example 5.

DESIGN = BLOCK + DEGREES +
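
To illustrate what the full factorial design in Example 1 implies, the following minimal Python sketch enumerates every main effect and interaction among the four factors. It is not Statistica syntax; the function name `full_factorial_effects` and the use of `*` to join factor names in an interaction are illustrative assumptions.

```python
from itertools import combinations


def full_factorial_effects(factors):
    """List all main effects and interactions of a full factorial design.

    Hypothetical helper for illustration; "*" as the interaction separator
    is an assumption, not a statement about GDA syntax.
    """
    effects = []
    # Order 1 gives main effects, order 2 two-way interactions, and so on.
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            effects.append("*".join(combo))
    return effects


effects = full_factorial_effects(["GROUP", "GENDER", "TIME", "PAID"])
print(len(effects))  # number of effects in the full factorial design
for effect in effects:
    print(effect)
```

With four factors, the design contains 4 main effects, 6 two-way, 4 three-way, and 1 four-way interaction.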

**Model building method**. Specifies a model building method.

**Priors**.
Set the prior classification probabilities for classifying observations.
The default specification is Estimated; use this option to set the prior
classification probabilities proportional to the observed group (class)
N's; use the Equal option to assign equal probabilities to each group
or class specified in the categorical dependent variable.
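
The difference between the two settings can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not Statistica code; the function name `prior_probabilities` and the sample labels are hypothetical.

```python
from collections import Counter


def prior_probabilities(labels, method="estimated"):
    """Return prior class probabilities for a list of class labels.

    "estimated": proportional to the observed class N's.
    "equal":     the same probability for every class.
    (Hypothetical helper; names are assumptions for illustration.)
    """
    counts = Counter(labels)
    if method == "estimated":
        total = len(labels)
        return {group: n / total for group, n in counts.items()}
    if method == "equal":
        return {group: 1.0 / len(counts) for group in counts}
    raise ValueError("method must be 'estimated' or 'equal'")


# Made-up class labels: 6 cases in A, 3 in B, 1 in C.
labels = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
estimated = prior_probabilities(labels, "estimated")
equal = prior_probabilities(labels, "equal")
print(estimated)
print(equal)
```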

**Case statistics**.
Creates and reports selected case statistics.

**Sweep delta 1.E-**. Specifies the negative exponent for a base-10 constant
Delta (delta = 10^-sdelta); the default value is 7. Delta is used (1)
in sweeping, to detect redundant columns in the design matrix, and (2)
for evaluating the estimability of hypotheses; specifically a value of
2*delta is used for the estimability check.

**Inverse delta 1.E-**. Specifies the negative exponent for a base-10 constant
Delta (delta = 10^-idelta); the default value is 12. Delta is used to
check for matrix singularity in matrix inversion calculations.
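
As a quick check of the arithmetic, the following Python sketch computes the tolerances that result from the default exponents given above (7 for the sweep delta, 12 for the inverse delta). The helper name `tolerance` is hypothetical, and the sketch shows only how the constants are derived, not how Statistica applies them internally.

```python
def tolerance(exponent):
    """Convert a negative-exponent setting into delta = 10^-exponent."""
    return 10.0 ** -exponent


sweep_delta = tolerance(7)             # used to detect redundant design columns
estimability_tol = 2.0 * sweep_delta   # 2*delta, used for the estimability check
inverse_delta = tolerance(12)          # used for the matrix singularity check

print(sweep_delta, estimability_tol, inverse_delta)
```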

**Generates data source, if N for input less than**. Generates a data source
for further analyses with other Data Miner nodes if the input data source
has fewer than k observations, as specified in this edit field; note that
parameter k (number of observations) will be evaluated against the number
of observations in the input data source, not the number of valid or selected
observations.

**Parameters for stepwise selection**

**Stepwise selection criterion**. Specifies the criterion to use for stepwise
selection of predictors. Note that the F statistic (criterion) is only
available for analysis problems with continuous (single degree of freedom)
predictors; for ANCOVA-like designs with factor effects for categorical
predictors, only the Probability criterion is applicable.

**p to enter**.
Specifies

**p to remove**.
Specifies

**F to enter**.
Specifies

**F to remove**.
Specifies

**Maximum number of steps**. Specifies the maximum number of steps for stepwise
selection of variables.
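
To show how the enter and remove thresholds and the step limit interact, here is a generic forward-stepwise sketch in Python using the Probability criterion. It is not the algorithm implemented in Statistica: the function `stepwise_select`, the callback `p_value(effect, model)`, the threshold defaults, and the toy p-values are all assumptions made for illustration. Each effect is treated as one unit, so a multiple-degrees-of-freedom effect enters or leaves the model as a whole.

```python
def stepwise_select(effects, p_value, p_enter=0.05, p_remove=0.05, max_steps=100):
    """Generic stepwise selection on p-values (illustrative sketch only).

    p_value(effect, model) is a hypothetical callback returning the p-value
    of `effect` given the other effects in `model`.
    """
    model = []
    for _ in range(max_steps):
        changed = False

        # Forward part: add the most significant effect not yet in the model.
        candidates = [e for e in effects if e not in model]
        if candidates:
            best = min(candidates, key=lambda e: p_value(e, model))
            if p_value(best, model) < p_enter:
                model.append(best)
                changed = True

        # Backward part: drop the least significant effect if it fails p_remove.
        if model:
            def p_without(e):
                return p_value(e, [m for m in model if m != e])
            worst = max(model, key=p_without)
            if p_without(worst) > p_remove:
                model.remove(worst)
                changed = True

        if not changed:
            break  # nothing entered or left, so the search has converged
    return model


# Toy p-values that ignore the current model (made-up numbers).
toy_p = {"GROUP": 0.001, "GENDER": 0.20, "TIME": 0.03}
selected = stepwise_select(list(toy_p), lambda e, model: toy_p[e])
print(selected)
```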

**Parameters for best-subset selection**

**Best subsets measure**. Specifies the
selection criterion for best subset selection of predictors. To use cross-validation
misclassification rates, a cross-validation variable (learning sample)
must be specified.

**Start for best subsets**. Specifies the smallest number of predictors to
be included in the model chosen via best subset selection, i.e., the start
of the search for the best subset of predictors.

**Stop for best subsets**. Specifies the maximum number of predictors to
be included in the model chosen via best subset selection.

**Number of subsets to display**. Specifies the number of subsets
to display in the results; Statistica will keep a log of
the best k predictor models of any given size, using k as specified by
this parameter.

**Number of variables to force**. Specifies the number of predictors
to force into the model, i.e., to select into all models considered during
the best-subset selection of predictors. Statistica will
force the first k predictors in the list of continuous predictors into
the model, with k as specified here by you.
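
The way the start, stop, display, and force parameters combine can be sketched as a plain exhaustive search in Python. This is an illustration under stated assumptions, not Statistica's implementation: the function `best_subsets`, the `score` callback (higher is assumed to be better), and the convention that the subset size counts the forced predictors are all hypothetical.

```python
from itertools import combinations
import heapq


def best_subsets(predictors, score, start, stop, n_display, n_force=0):
    """Keep the n_display best subsets for each size from start to stop.

    The first n_force predictors in the list appear in every subset.
    score(subset) is a hypothetical callback; higher means better here.
    """
    forced = tuple(predictors[:n_force])
    free = predictors[n_force:]
    results = {}
    for size in range(max(start, n_force), stop + 1):
        # Every candidate is the forced predictors plus a choice of free ones.
        candidates = [forced + combo
                      for combo in combinations(free, size - n_force)]
        results[size] = heapq.nlargest(n_display, candidates, key=score)
    return results


# Toy score: sum of made-up per-predictor weights.
weights = {"X1": 0.1, "X2": 0.5, "X3": 0.3, "X4": 0.2}
subsets = best_subsets(list(weights), lambda s: sum(weights[p] for p in s),
                       start=2, stop=3, n_display=2, n_force=1)
for size, models in subsets.items():
    print(size, models)
```

In this toy run, X1 is forced into every candidate, and the two best models of sizes 2 and 3 are logged.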

**Deployment**. Deployment is
available if the Statistica installation is licensed for this feature.

**Generates C/C++ code**. Generates C/C++ code for deployment of the predictive
model.

**Generates SVB code**. Generates Statistica Visual
Basic code for deployment of the predictive model.

**Generates PMML code**. Generates PMML (Predictive Models Markup Language)
code for deployment of predictive model. This code can be used via the
Rapid Deployment options to efficiently compute predictions for (score)
large data sets.

**Saves C/C++ code**. Saves C/C++ code for deployment of the predictive model.

**File name for C/C++ code**. Specifies the name and location of the file in which
to save the (C/C++) deployment code information.

**Saves SVB code**. Saves Statistica Visual Basic code
for deployment of the predictive model.

**File name for SVB code**. Specifies the name and location of the file in which
to save the (SVB/VB) deployment code information.

**Saves PMML code**. Saves PMML (Predictive Models Markup Language) code for
deployment of predictive model. This code can be used via the Rapid Deployment
options to efficiently compute predictions for (score) large data sets.

**File name for PMML (XML) code**. Specifies the name and location of the file
in which to save the (PMML/XML) deployment code information.