FSL Results (Feature Selection and Variable Screening Results)


Click the OK button in the Feature Selection and Variable Screening dialog box to run the computations and display the FSL Results dialog box, which contains one tab, Quick.

Summary. Click the Summary button to display a spreadsheet with the best or most important predictors for each dependent variable, using the criterion of importance specified in the Criterion for selecting predictors group box (see description below).

Cancel. Click the Cancel button to return to the Feature Selection and Variable Screening dialog box.

Options. Click the Options button to display the Options menu.

By Group. Click the By Group button to display the By Group specification dialog box.

Quick Tab

Criterion for selecting predictors. Use the options in this group box to specify the criterion for selecting predictors for each dependent variable. See Computational Details for a discussion of the criteria used for variable screening, and the Feature Selection and Variable Screening Overview, in particular the section on Capitalizing on Chance, for additional details.

Display _ best predictors. Select this option button to display the best k predictors. For regression-type problems (for continuous dependent variables), the k predictors with the largest F values will be chosen; for classification-type problems, the k predictors with the largest Chi-square values will be chosen.
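The selection rule described above can be sketched in a few lines of Python. This is only an illustration of "keep the k largest statistics", not Statistica's implementation; the predictor names, the values, and the helper best_k_by_statistic are hypothetical.

```python
# Hypothetical screening results: (predictor name, F or Chi-square value).
# The numbers are invented for illustration only.
scores = [("Age", 4.2), ("Income", 18.7), ("Region", 9.1), ("Tenure", 12.3)]


def best_k_by_statistic(scores, k):
    """Return the k predictors with the largest statistic, largest first."""
    ranked = sorted(scores, key=lambda item: item[1], reverse=True)
    return ranked[:k]


top2 = best_k_by_statistic(scores, 2)
print(top2)
```

Ranking by the raw statistic is only meaningful when all statistics share the same degrees of freedom, which is why the p-based options below exist.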

Display best predictors with p <. Select this option button to display the list of best predictors for which the p value is less than the value specified in the adjacent edit field. The list of predictors will be sorted in ascending order by p. For regression-type problems (for continuous dependent variables), p will be computed from the respective F values; for classification-type problems, p will be computed from the respective Chi-square values. Choose this option when the predictor list consists of continuous and categorical variables with different numbers of classes (groups); in that case, the F values or Chi-square values will be computed for different degrees of freedom and are not easily comparable. On the other hand, use this option with caution if there is a large number of observations in the input file, in which case the p values for many F or Chi-square statistics can become effectively 0 (zero). In that case, the predictor variables cannot be unambiguously sorted by p (many of the p values will be almost 0), and the resulting ordering of predictors (or the set of predictors selected for further analyses) may not be unique.
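The caution about very small p values can be illustrated with a short Python sketch. It assumes Chi-square statistics with 1 degree of freedom, for which the upper-tail p value equals erfc(sqrt(x/2)); the predictor names, the values, and the helper chi2_p_one_df are hypothetical, and the sketch says nothing about how Statistica itself stores or compares p values.

```python
import math


def chi2_p_one_df(chi2):
    """Upper-tail p value of a Chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical Chi-square values; X3 and X4 mimic a very large input file.
stats = {"X1": 2.0, "X2": 8.0, "X3": 2000.0, "X4": 3000.0}
p_values = {name: chi2_p_one_df(value) for name, value in stats.items()}

# Keep predictors with p below the threshold, sorted in ascending order by p.
threshold = 0.05
selected = sorted(
    (name for name, p in p_values.items() if p < threshold),
    key=lambda name: p_values[name],
)

for name in selected:
    print(name, stats[name], p_values[name])
```

In double-precision arithmetic the p values for X3 and X4 both underflow to exactly zero, so their relative order in the sorted list is decided by the input order rather than by the data, even though their Chi-square values differ clearly.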

Display _ best predictors sorted by. Select this option button to select the k best predictors based on the probability (p) criterion. As with the Display best predictors with p < criterion, this option is useful when the predictor list consists of continuous and categorical variables with different numbers of classes (groups), which makes the respective F or Chi-square statistics for different predictors not directly comparable. With this option, you can still extract a fixed number of predictors, but using the probability values (p) as the selection criterion.
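To see why ranking by p can differ from ranking by the raw statistic when degrees of freedom differ, consider the following Python sketch. It uses the closed-form upper-tail probability of the Chi-square distribution that holds for even degrees of freedom; the predictors, their values, and the helper chi2_p_even_df are hypothetical and serve only as an illustration.

```python
import math


def chi2_p_even_df(chi2, df):
    """Upper-tail p value of a Chi-square statistic (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = chi2 / 2.0
    term, total = 1.0, 1.0
    # Sum (x/2)^j / j! for j = 0 .. df/2 - 1, building each term incrementally.
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


# Hypothetical predictors: (name, Chi-square value, degrees of freedom).
results = [("A", 12.0, 2), ("B", 14.0, 6), ("C", 11.0, 2)]

k = 2
by_statistic = sorted(results, key=lambda r: r[1], reverse=True)[:k]
by_p = sorted(results, key=lambda r: chi2_p_even_df(r[1], r[2]))[:k]

print([r[0] for r in by_statistic])
print([r[0] for r in by_p])
```

Predictor B has the largest Chi-square value but also the most degrees of freedom, so its p value is the largest of the three; selecting the best two by p therefore keeps A and C, while selecting by the raw statistic keeps B and A.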

Summary: Best k predictors (features). Click this button to display a spreadsheet with the best or most important predictors for each dependent variable, using the criterion of importance specified in the Criterion for selecting predictors group box.

Histogram of importance for best k predictors. Click this button to display a histogram of the best or most important predictors for each dependent variable, using the criterion of importance specified in the Criterion for selecting predictors group box. Note that the value of Importance in this histogram (plotted along the vertical y axis) will always be the respective F or Chi-square values, regardless of the chosen Criterion for selecting predictors.

Report of best k predictors (features). Click this button to display a Statistica Report window with the selected variable numbers; you can copy the list of variable numbers and paste them directly into other variable selection dialogs, for example, to select variables for subsequent analyses.

Save predictors to variable bundle. Click this button to save the selected predictor variables to the input spreadsheet as a variable bundle. A dialog box will be displayed confirming that the predictors have been saved as a bundle. Click the OK button in the dialog box to proceed.