Classification Trees Startup Panel - Stopping Options Tab

Select the Stopping options tab of the Classification Trees Startup Panel to specify the stopping rule and the stopping parameters for the analysis.

Stopping rule. In computing classification trees, the stopping rule determines the procedures used for selecting the right-sized classification tree.

Prune on misclassification error. Select this option button to prune on the misclassification error. Pruning methods are somewhat similar to the backward stepwise elimination method available in the Discriminant Analysis module. That is, sets of "branches" are successively "pruned" from the complete classification tree, just as sets of predictors are successively eliminated from the prediction equations when backward stepwise elimination is used in Discriminant Analysis. The "right-sized" classification tree is then selected from the "pruned" trees using the specified SE rule.

Prune on deviance. Select this option button to prune on deviance. (See Ripley, Breiman, et al.)

FACT-style direct stopping. Select the FACT-style direct stopping option button to use a quite different approach. In this approach, the complete classification tree including all splits is considered to be the "right-sized" tree. You can control when split selection stops with the value specified in the Fraction of objects box (see below). Splitting on the predictor variables continues until each terminal node in the classification tree is "pure" (i.e., has no misclassified cases or objects) or has no more than the minimum number of cases computed from the specified fraction of objects for one or more classes.

Stopping parameters. Use these options to control when split selection stops, and, if a pruning method is selected as a Stopping rule (see above), when pruning begins and which pruned tree is selected as the "right-sized" tree.

Minimum n. If a pruning method is selected in the Stopping rule group box, i.e., the Prune on misclassification error or Prune on deviance option button is selected (see above), enter a value in the Minimum n box to control when split selection stops and pruning begins. Splitting on the predictor variables continues until all terminal nodes in the classification tree are "pure" (i.e., have no misclassifications) or have no more than the specified minimum number of cases. Pruning of the tree begins once this criterion is met for all terminal nodes.
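The Minimum n condition can be illustrated with a short Python sketch. This is not the program's own code; the function name should_stop_splitting and the representation of a node as a list of observed class labels are assumptions made purely for illustration.

```python
from typing import Sequence


def should_stop_splitting(node_classes: Sequence[str], minimum_n: int) -> bool:
    """Return True if a node becomes terminal under the Minimum n condition.

    node_classes: observed class labels of the cases in the node
                  (hypothetical representation, for illustration only).
    minimum_n:    the value entered in the Minimum n box.
    """
    # A node is "pure" when all of its cases belong to a single class,
    # so no case in the node is misclassified.
    is_pure = len(set(node_classes)) <= 1
    # A node is also terminal when it holds no more than Minimum n cases.
    is_small = len(node_classes) <= minimum_n
    return is_pure or is_small


# A mixed node with 10 cases keeps splitting when Minimum n is 5 ...
print(should_stop_splitting(["a", "b"] * 5, minimum_n=5))
# ... but a mixed node with only 3 cases does not.
print(should_stop_splitting(["a", "b", "a"], minimum_n=5))
```

In this sketch, pruning would begin only after every terminal node of the tree satisfies the condition.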

Standard error rule. If a pruning method is selected in the Stopping rule group box, i.e., the Prune on misclassification error or Prune on deviance option button is selected (see above), the value entered in the Standard error rule box is used in the selection of the "right-sized" classification tree from the sequence of pruned trees.

The Standard error rule is applied as follows. Find the pruned tree in the tree sequence with the smallest CV cost. Call this value Min. CV, and call the standard error of the CV cost for this tree Min. Standard error. Then select as the "right-sized" tree the pruned tree in the tree sequence with the fewest terminal nodes that has a CV cost no greater than Min. CV plus the Standard error rule times Min. Standard error. A smaller (closer to zero) value for the Standard error rule generally results in the selection of a "right-sized" tree that is only slightly "simpler" (in terms of the number of terminal nodes) than the minimum CV cost tree. A larger (much greater than zero) value for the Standard error rule generally results in the selection of a "right-sized" tree that is much "simpler" (in terms of the number of terminal nodes) than the minimum CV cost tree.

Thus, cost/complexity pruning, as implemented in the selection of the right-sized tree, makes use of the basic scientific principles of parsimony and replication: Choose as the best theory the simplest theory (i.e., the pruned tree with the fewest terminal nodes) that is consistent with (i.e., has a CV cost no greater than Min. CV plus the Standard error rule times Min. Standard error) the theory best supported by independent tests (i.e., the pruned tree with the smallest CV cost).
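The selection procedure described above can be sketched in Python as follows. The PrunedTree record, the function name select_right_sized_tree, and the CV costs in the example sequence are hypothetical and serve only to illustrate the rule; they are not taken from the program or from any real analysis.

```python
from typing import NamedTuple, Sequence


class PrunedTree(NamedTuple):
    terminal_nodes: int  # number of terminal nodes in the pruned tree
    cv_cost: float       # cross-validation (CV) cost of the tree
    cv_cost_se: float    # standard error of the CV cost


def select_right_sized_tree(trees: Sequence[PrunedTree],
                            se_rule: float) -> PrunedTree:
    """Pick the "right-sized" tree from a sequence of pruned trees."""
    # Tree with the smallest CV cost: supplies Min. CV and Min. Standard error.
    best = min(trees, key=lambda t: t.cv_cost)
    # Largest CV cost still regarded as consistent with the best tree.
    threshold = best.cv_cost + se_rule * best.cv_cost_se
    # Among the trees within the threshold, take the one with the
    # fewest terminal nodes.
    candidates = [t for t in trees if t.cv_cost <= threshold]
    return min(candidates, key=lambda t: t.terminal_nodes)


# Made-up tree sequence, for illustration only.
sequence = [
    PrunedTree(terminal_nodes=12, cv_cost=0.30, cv_cost_se=0.04),
    PrunedTree(terminal_nodes=7, cv_cost=0.28, cv_cost_se=0.04),
    PrunedTree(terminal_nodes=4, cv_cost=0.31, cv_cost_se=0.04),
    PrunedTree(terminal_nodes=2, cv_cost=0.45, cv_cost_se=0.05),
]

for rule in (0.0, 1.0, 5.0):
    chosen = select_right_sized_tree(sequence, se_rule=rule)
    print(rule, chosen.terminal_nodes)
```

With these made-up values, a Standard error rule of 0 keeps the minimum CV cost tree itself, while larger values admit progressively simpler trees.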

Fraction of objects. If FACT-style direct stopping is selected as the Stopping rule (see above), the value in the Fraction of objects box is used to control the classification tree selected as the "right-sized" tree. For FACT-style direct stopping, splitting on the predictor variables continues until all terminal nodes in the classification tree are "pure" (i.e., have no misclassifications) or have no more than the minimum number of cases or objects computed from the specified fraction of objects for the predicted class for the node. Click the Class minimum objects button on the Classification Trees Results dialog box - Predicted Classes tab to display a spreadsheet containing the minimum number of cases or objects for each class on the dependent variable computed from this fraction.
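The following Python sketch illustrates how a per-class minimum could drive FACT-style direct stopping. It rests on two assumptions that this topic does not confirm: that each class minimum is the fraction multiplied by the number of cases in that class (rounded down, and at least 1), and that the predicted class for a node is simply its most frequent class. The function names and class sizes are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def class_minimum_objects(class_sizes: Dict[str, int],
                          fraction: float) -> Dict[str, int]:
    """Minimum number of cases per class for a given fraction of objects."""
    # ASSUMPTION: minimum = floor(fraction * class size), never below 1.
    # The exact formula used by the program is not stated in this topic.
    return {c: max(1, math.floor(fraction * n)) for c, n in class_sizes.items()}


def fact_should_stop(node_classes: Sequence[str],
                     minimums: Dict[str, int]) -> bool:
    """Return True if a node becomes terminal under direct stopping."""
    counts = Counter(node_classes)
    # A pure node (one class only) is always terminal.
    if len(counts) <= 1:
        return True
    # ASSUMPTION: the predicted class is the most frequent class in the node.
    predicted = counts.most_common(1)[0][0]
    # Terminal when the node holds no more cases than the minimum
    # computed for its predicted class.
    return len(node_classes) <= minimums[predicted]


# Hypothetical class sizes, for illustration only.
minimums = class_minimum_objects({"A": 100, "B": 40}, fraction=0.25)
print(minimums)
print(fact_should_stop(["A"] * 20 + ["B"] * 4, minimums))
print(fact_should_stop(["A"] * 30 + ["B"] * 4, minimums))
```

The values actually used in an analysis are those shown in the spreadsheet produced by the Class minimum objects button, which should be preferred over any figure derived from this sketch.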