SANN - Results - Predictions Tab

Neural Networks

Select the Predictions tab of the SANN - Results dialog box to access the options described here. For information on the options that are common to all tabs, see SANN - Results.

Predictions spreadsheet.

Predictions type. Select the set of predictions to be included in the predictions spreadsheet. Note that the ensemble options are only available if more than one network is displayed in the Active neural network grid.

Standalones. Select this option button to include the predictions for the individual networks shown in the Active neural network grid.

Ensemble. Select this option button to include the predictions for an ensemble of all networks that are displayed in the Active neural network grid.

Standalones and ensemble. Select this option button to include predictions for the individual networks and an ensemble of all networks.

Predictions. Click this button to generate a spreadsheet of model predictions. The details shown are controlled by the options selected in the Predictions type, Include, and Samples group boxes. When creating spreadsheets, SANN presents the results in as compact a form as possible. For example, when you create a predictions spreadsheet for several active networks, SANN normally places all the predictions in a single spreadsheet. This is only possible when the networks use the same train/test/validation samples, which is always the case for the Automatic Network Search (ANS) and Custom Neural Networks (CNN) network building strategies. Subsampling, by contrast, uses a different permutation of the data set to create the train/test/validation samples for each individual network, so a training case for one network may be a test or validation case (or none of these) for another, and the network outputs can no longer be presented consistently in a single spreadsheet. Therefore, whenever your active networks were created using different samples, SANN creates one predictions spreadsheet for each individual network, plus one more for the ensemble if you request ensemble outputs. Otherwise, all network and ensemble predictions are placed in one spreadsheet.
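The single-spreadsheet versus per-network rule described above can be sketched as follows. This is an illustration only, not SANN's implementation: the function name plan_spreadsheets, the network names, and the representation of each network's sample assignment as a list of per-case labels are all hypothetical.

```python
from typing import Dict, List


def plan_spreadsheets(sample_assignments: Dict[str, List[str]],
                      include_ensemble: bool) -> List[List[str]]:
    """Return one list of prediction sources per spreadsheet.

    sample_assignments maps a (hypothetical) network name to the sample
    label of every case, e.g. ["train", "test", "validation", ...].
    """
    names = list(sample_assignments)
    assignments = list(sample_assignments.values())

    # Networks can share a spreadsheet only if every case belongs to the
    # same sample (train/test/validation) for all of them.
    same_samples = all(a == assignments[0] for a in assignments)

    if same_samples:
        # One compact spreadsheet holding every network (and the ensemble).
        return [names + (["ensemble"] if include_ensemble else [])]

    # Different samples (e.g. subsampling): one spreadsheet per network,
    # plus one more for the ensemble if it was requested.
    sheets = [[name] for name in names]
    if include_ensemble:
        sheets.append(["ensemble"])
    return sheets
```

For two networks that share the same samples, the sketch returns a single group; for two networks built on different samples, it returns one group per network and a separate group for the ensemble.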

Include. Use the options in this group box to (optionally) include additional information in the predictions spreadsheet. Available options are dependent on the analysis type and network type.

Inputs. Select this check box to include the input variables in the Predictions spreadsheet.

Targets. Select this check box to include the target variables in the Predictions spreadsheet.

Output. Select this check box to include the actual predictions in the Predictions spreadsheet.

Residuals. Select this check box to include the residual values in the Predictions spreadsheet. The residual is calculated as observed - predicted (i.e., target - output). This option only applies to regression type analyses.

Accuracy. Select this check box to include a column that indicates, for each case, whether the predicted category [Output] matches the actual one [Target]; the value shown is either Correct or Incorrect. This option only applies to classification type analyses.

Confidence levels. Select this check box to include the confidence level associated with the selected (or predicted) level of the target. This option only applies to classification type analyses.

Absolute res. Select this check box to include the absolute value of the residual in the Predictions spreadsheet, i.e., |observed-predicted|. This option only applies to regression type analyses.

Square res. Select this check box to include the squared residual value in the Predictions spreadsheet, i.e., (observed-predicted)². This option only applies to regression type analyses.

Standard res. Select this check box to include the standardized residuals in the Predictions spreadsheet. Standardized residuals are calculated as (observed-predicted)/(std. error of the prediction). This option only applies to regression type analyses.

Variables. Select this check box to include an arbitrary selection of variables from the data set in the Predictions spreadsheet. If you click the Predictions button when this check box is selected, a variable selection dialog will be displayed, enabling you to select the variables to include in the spreadsheet. Note that this option is only available when there are variables in the data set that have not been used as either inputs or targets.
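As an illustration of the residual-based columns for regression analyses, the following sketch computes them from the formulas given above. The function and column names are hypothetical, and the standard error of the prediction is passed in as a single value by the caller, because this page does not describe how SANN estimates it.

```python
def residual_columns(observed, predicted, std_error):
    """Build the residual columns for a list of cases.

    observed  -- target values
    predicted -- network outputs
    std_error -- standard error of the prediction (supplied by the
                 caller; how SANN estimates it is not specified here)
    """
    rows = []
    for obs, pred in zip(observed, predicted):
        res = obs - pred                      # Residuals: target - output
        rows.append({
            "residual": res,
            "absolute_res": abs(res),         # |observed - predicted|
            "square_res": res ** 2,           # (observed - predicted)^2
            "standard_res": res / std_error,  # residual / std. error
        })
    return rows


rows = residual_columns([3.0, 5.0], [2.0, 7.0], std_error=2.0)
```

Here the first case has a residual of 1.0 and a standardized residual of 0.5, and the second has a residual of -2.0 and a squared residual of 4.0.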

Note: STATISTICA Neural Networks (SANN) solves classification problems by assigning true probabilities of class membership to instances of the input (independent) variables. An input vector (data case) is assigned to one of the classes found in the target (dependent) variable. For example, in a binary classification task (a classification problem whose target variable has two categorical levels), if the class probabilities for a given input are 0.6 and 0.4 for classes A and B, respectively, the input is assigned to category A (since A has the highest probability). Occasionally, however, the class membership probabilities are equal, in which case no classification is possible. This occurs mainly in binary classification (as opposed to multi-class classification) tasks, and only when the network is poorly trained or the data set has no clear-cut or well-defined boundaries between its clusters. In such cases, the corresponding cell of the predictions spreadsheet displays "unknown" and is highlighted in red.
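The assignment rule in this note, together with the Accuracy and Confidence levels columns, can be sketched as below. This is a minimal illustration under stated assumptions, not SANN's code: the function name prediction_row is hypothetical, the confidence level is assumed to be the probability of the predicted class, ties are detected by exact equality, and leaving Accuracy and Confidence empty for an "unknown" case is a choice made for the example.

```python
def prediction_row(target, probabilities):
    """Classify one case from its class membership probabilities.

    target        -- the actual class of the case
    probabilities -- dict mapping class label to membership probability
    """
    top = max(probabilities.values())
    winners = [cls for cls, p in probabilities.items() if p == top]

    if len(winners) > 1:
        # Equal top probabilities: no classification is possible.
        # Accuracy and confidence are left empty (an assumption).
        return {"output": "unknown", "accuracy": None, "confidence": None}

    output = winners[0]
    return {
        "output": output,
        # Correct when the predicted class (output) equals the target.
        "accuracy": "Correct" if output == target else "Incorrect",
        # Assumed: confidence is the probability of the predicted class.
        "confidence": top,
    }


clear_case = prediction_row("A", {"A": 0.6, "B": 0.4})
tied_case = prediction_row("A", {"A": 0.5, "B": 0.5})
```

With probabilities 0.6 and 0.4 the case is assigned to class A, as in the example above; with 0.5 and 0.5 the output is "unknown".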