GC&RT Results - Classification Tab

Select the Classification tab of the GC&RT Results dialog box to access options for reviewing plots and spreadsheets of observed and predicted values for each observation. The tab also provides information about the prior probabilities used in the analysis.

Sample. Select an option button in the Sample group box to specify the type of sample for which to compute the predicted and residual statistics (classifications). If an option is not available, no valid cases were found in the respective type of sample.

Analysis. Select this option button to display and plot predicted and residual values for all observations that were used to compute the current results.

Test set. Select this option button to display and plot all observations that were not used to compute the current results, but have valid data for all predictor and dependent variables.

Prediction. Select this option button to display and plot all cases that have valid data for the predictor variables, but missing data for the dependent variable.

Surrogate. Select this option button to display and plot only those cases with missing data for at least one predictor variable used for splitting, provided those cases were included in the analysis by using surrogate splits (see, for example, the description of the Number of surrogates option on the Quick specs dialog box - Advanced tab).

Predicted vs. observed by classes. This button is available only for classification-type problems. Click the Predicted vs. observed by classes button to produce a spreadsheet and a 3-D histogram of the predicted-by-observed classification frequencies.
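The predicted-by-observed frequencies described above amount to a cross-tabulation of two class labels per case. The following is a minimal sketch, not the program's own implementation; the function name crosstab, the sample data, and the choice of rows for predicted classes and columns for observed classes are all assumptions made for illustration.

```python
from collections import Counter

def crosstab(observed, predicted):
    """Count cases for each (predicted class, observed class) pair."""
    classes = sorted(set(observed) | set(predicted))
    counts = Counter(zip(predicted, observed))
    # Assumed layout: rows are predicted classes, columns are observed classes.
    table = [[counts[(p, o)] for o in classes] for p in classes]
    return classes, table

# Hypothetical data, for illustration only.
observed = ["A", "A", "B", "B", "B", "C"]
predicted = ["A", "B", "B", "B", "C", "C"]
classes, table = crosstab(observed, predicted)

for label, row in zip(classes, table):
    print(label, row)
```

Cells on the diagonal count correctly classified cases; off-diagonal cells count misclassifications.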

Priors. Click the Priors button to display the spreadsheet containing the prior probabilities and the corresponding n for each class (group) in the dependent variable.

Adjusted priors. Click this button to produce a spreadsheet containing a priori probabilities for each class of the dependent variable, adjusted for the User-specified misclassification costs and the corresponding class ns. Note that this option is available only when the User-specified misclassification costs option is used in the analysis.
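As an illustration of how priors can be adjusted for misclassification costs, the sketch below uses a formulation common in the classification-tree literature: each prior is weighted by the total cost of misclassifying that class and the results are rescaled to sum to 1. Whether the program uses exactly this formula is an assumption; the function name adjusted_priors and the example values are hypothetical.

```python
def adjusted_priors(priors, cost):
    """Reweight class priors by misclassification costs.

    priors  -- prior probability for each class
    cost    -- cost[i][j] is the cost of classifying a case observed
               in class j (column) as class i (row)
    """
    k = len(priors)
    # Total cost of misclassifying class j: column sum without the diagonal.
    col_cost = [sum(cost[i][j] for i in range(k) if i != j) for j in range(k)]
    weighted = [p * c for p, c in zip(priors, col_cost)]
    total = sum(weighted)
    # Rescale so the adjusted priors sum to 1.
    return [w / total for w in weighted]

# Hypothetical two-class example: misclassifying class 0 as class 1 costs 3.
priors = [0.5, 0.5]
cost = [[0, 1],
        [3, 0]]
adjusted = adjusted_priors(priors, cost)
print(adjusted)
```

With equal costs for every class, this formulation leaves the priors unchanged.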

Misclassification cost matrix. Click the Misclassification cost matrix button to display the spreadsheet containing the user-specified costs of misclassifying cases or objects in each observed class of the dependent variable (columns) as another class (rows). All cost values are 1 by default, that is, unless altered by the user.

Terminal nodes. Click this button to produce a spreadsheet containing summary information for the terminal nodes only.

For classification problems (categorical dependent variable), the spreadsheet shows the number of cases or objects in each observed class that are sent to the node; a Gain value is also reported. By default (with Profit equal to 1.0 for each dependent variable class), the gain value is simply the total number of observations (cases) in the respective node. If separate Profit values are specified for each dependent variable class, then the Gain value is computed as the total profit (number of cases times respective profit values).

For regression problems (continuous dependent variable), the spreadsheet shows the number of cases or objects that are sent to the node, and the respective node mean and variance.
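The Gain value described above for classification problems can be sketched as follows. This is a minimal illustration under the stated rule (number of cases times the respective profit values, summed over classes); the function name node_gain, the class labels, and the profit values are hypothetical.

```python
def node_gain(class_counts, profit=None):
    """Total profit for a terminal node.

    class_counts -- number of cases per observed class in the node
    profit       -- profit value per class; defaults to 1.0 for every class
    """
    if profit is None:
        profit = {c: 1.0 for c in class_counts}
    return sum(n * profit[c] for c, n in class_counts.items())

# Hypothetical node with 30 "good" and 10 "bad" cases.
counts = {"good": 30, "bad": 10}

default_gain = node_gain(counts)                               # all profits 1.0
custom_gain = node_gain(counts, {"good": 2.0, "bad": -1.0})    # assumed profits
print(default_gain, custom_gain)
```

With the default profits, the gain equals the total number of cases in the node, as stated above.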

Options for gain. Gain is a measure of importance of a node that helps identify desired terminal nodes. Gain is calculated as the number or percent of cases in the selected category or as the average profit for each terminal node. The options here can be used to determine how the gain should be computed for each node. Note that these options are available only when performing classification analyses.  

Average profit. Select this option button to compute the gain as the weighted average of the number of cases in each category for the terminal node. All weights are 1 by default, but can be specified via the Profit option (see below).

Profit. Click this button to display the Enter values of profit for the categories of dialog box. In this dialog box you can specify profit values (weights) to be used when computing the gain for each node. This button is only active if the Average profit option has been selected. By default, the profit values for all categories are equal to 1.

Percent of cases. Select this option button to compute the gain as the percent of cases in a selected category of the response variable.

Category. From this list, select the category on which to base the percent-of-cases computations. This option becomes available when Percent of cases is selected.
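A percent-of-cases gain for a selected category can be sketched as below. The sketch assumes the percentage is taken relative to all cases in the terminal node; the text does not state the denominator, so this reading, the function name percent_of_cases, and the node data are assumptions for illustration.

```python
def percent_of_cases(class_counts, category):
    """Percent of a node's cases that fall in the selected category."""
    total = sum(class_counts.values())
    if total == 0:
        return 0.0  # empty node: no cases to base a percentage on
    return 100.0 * class_counts.get(category, 0) / total

# Hypothetical terminal nodes with per-class case counts.
nodes = {
    "node 4": {"good": 30, "bad": 10},
    "node 5": {"good": 5, "bad": 15},
}
gains = {name: percent_of_cases(counts, "good") for name, counts in nodes.items()}
print(gains)
```

Ranking terminal nodes by this value highlights the nodes richest in the selected category.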