SANN - Data Selection - Quick Tab

Select the Quick tab of the SANN - Data selection dialog box to access the options described here.

Variables. Click the Variables button to display the variable selection dialog box. The types and numbers of variables that you can select here depend on the Analysis type selected in the Startup Panel. For more information on when to select a specific Analysis type, see the Startup Panel topic. This button is not available in deployment mode (see the SANN - Analysis/Deployment Startup Panel topic) since the selection of the analysis variables is completely determined by the PMML code.

Regression. If you have selected Regression, you will be able to select Continuous targets and Continuous and/or Categorical inputs.

Classification. If you have selected Classification, you will be able to select one Categorical target and Continuous and/or Categorical inputs.

Time series (regression). If you have selected Time series (regression), you will be able to select Continuous targets and Continuous and/or Categorical inputs. Note that you should only perform time series analysis when your data involve lagged (over time) predictions.

Time series (classification). If you have selected Time series (classification), you will be able to select one Categorical target and Continuous and/or Categorical inputs. Note that you should only perform time series analysis when your data involve lagged (over time) predictions.

Note: For time series analysis, selecting input variables is optional. If you select no inputs, a STATISTICA SANN time series analysis (whether regression or classification) maps future values of the target variable onto its past measurements. In other words, SANN predicts future values of the target variable from its own past values.
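The idea of predicting a target from its own past can be illustrated with a minimal sketch. It is not SANN's implementation; the function name make_lagged_pairs and the parameter n_lags are hypothetical names chosen for this example. Each training case pairs the n_lags values that precede a time point with the value at that time point.

```python
def make_lagged_pairs(series, n_lags):
    """Pair each value with the n_lags values that precede it.

    Hypothetical helper for illustration only; SANN builds its own
    lagged inputs internally.
    """
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(list(series[t - n_lags:t]))  # past measurements
        targets.append(series[t])                  # value to predict
    return inputs, targets


series = [10, 12, 15, 19, 24]
inputs, targets = make_lagged_pairs(series, n_lags=2)
for past, future in zip(inputs, targets):
    print(past, "->", future)
```

With two lags, the five-point series yields three training cases, because the first two values have no complete history.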

Cluster Analysis. If you have selected Cluster Analysis, you will be able to select Continuous and/or Categorical inputs (predictors). Target variables are not used in clustering problems.
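The variable rules for the five analysis types can be summarized as a small lookup table. The sketch below is only an illustration of those rules, not part of STATISTICA; RoleRule, RULES, and check_selection are hypothetical names. It also assumes that supervised analyses need at least one target and that inputs are required everywhere except in time series analysis, which the text above does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoleRule:
    target_type: Optional[str]  # "continuous", "categorical", or None (no targets)
    max_targets: Optional[int]  # None means no limit is stated in the text
    inputs_optional: bool       # True only for time series (see the Note)


# Hypothetical summary of the rules described above.
RULES = {
    "Regression": RoleRule("continuous", None, False),
    "Classification": RoleRule("categorical", 1, False),
    "Time series (regression)": RoleRule("continuous", None, True),
    "Time series (classification)": RoleRule("categorical", 1, True),
    "Cluster Analysis": RoleRule(None, 0, False),
}


def check_selection(analysis_type, n_targets, n_inputs):
    """Return a list of problems with a selection (empty if it looks valid)."""
    rule = RULES[analysis_type]
    problems = []
    if rule.target_type is None:
        if n_targets > 0:
            problems.append("targets are not used for this analysis type")
    elif n_targets == 0:
        # Assumption: supervised analyses need at least one target.
        problems.append(f"select at least one {rule.target_type} target")
    elif rule.max_targets is not None and n_targets > rule.max_targets:
        problems.append(
            f"select at most {rule.max_targets} {rule.target_type} target(s)"
        )
    if n_inputs == 0 and not rule.inputs_optional:
        # Assumption: inputs are required outside time series analysis.
        problems.append("select at least one input")
    return problems


print(check_selection("Classification", n_targets=2, n_inputs=3))
print(check_selection("Time series (regression)", n_targets=1, n_inputs=0))
```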

Analysis variables (present in the dataset). This group box displays the variables currently selected for the analysis.

Strategy for creating predictive models. SANN provides three strategies for building neural network models: Automated Network Search (ANS), Custom Neural Networks (CNN), and Subsampling (random, bootstrap). The options available to you during training depend on the selection you make here.

Automated network search (ANS). Select this option button to enable the Automated Network Search (ANS), which creates neural networks with various settings and configurations with minimal effort on your part. ANS helps you create and test neural networks for your data analysis and prediction problems. It designs a number of networks to solve the problem and then selects those networks that best represent the relationship between the input and target variables. Note that the ANS strategy is not available for cluster analysis. ANS is also not applicable when deploying models since no training of neural networks is required.

Custom neural networks (CNN). Select this option button to use the Custom neural networks (CNN) strategy. In contrast to ANS, the CNN tool enables you to specify individual network architectures and training algorithms exactly. You can use CNN to train multiple neural network models with exactly the same design specifications but with different random initializations of the weights. As a result, each network will find one of the possible solutions that networks of that architecture and configuration can reach.
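The effect of different random initializations can be seen in a minimal sketch. This is not SANN code; init_weights, the seed values, and the uniform range of -0.5 to 0.5 are assumptions made for the example. The architecture stays fixed while each seed produces a different starting point, so training each network may converge to a different solution.

```python
import random


def init_weights(layer_sizes, seed):
    """Create one weight matrix per layer pair, drawn from a seeded generator.

    Hypothetical helper: the uniform(-0.5, 0.5) range is an assumption,
    not SANN's initialization scheme.
    """
    rng = random.Random(seed)
    return [
        [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    ]


architecture = [3, 4, 1]  # 3 inputs, 4 hidden units, 1 output
networks = {seed: init_weights(architecture, seed) for seed in (0, 1, 2)}

for seed, weights in networks.items():
    shapes = [(len(layer), len(layer[0])) for layer in weights]
    print(f"seed {seed}: layer shapes {shapes}, first weight {weights[0][0][0]:.4f}")
```

All three networks share the same layer shapes but start from different weights. Reusing a seed reproduces the same starting point.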

Subsampling (random, bootstrap). Select this option button to use the Subsampling (random, bootstrap) strategy. This tool enables you to create an ensemble of neural networks based upon multiple subsamples of the original data set. Options for the Subsampling (random, bootstrap) strategy are available on the Subsampling tab.
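As a rough illustration of the two subsampling schemes named above, the sketch below draws case indices either with replacement (bootstrap) or without replacement (random). The function names, the 80% fraction, and the ensemble size of five are assumptions for the example and do not reflect SANN's actual options or defaults.

```python
import random


def bootstrap_indices(n_cases, rng):
    """Draw n_cases indices with replacement (some cases repeat, some are left out)."""
    return [rng.randrange(n_cases) for _ in range(n_cases)]


def random_subsample_indices(n_cases, fraction, rng):
    """Draw a fraction of the cases without replacement."""
    k = max(1, round(n_cases * fraction))
    return sorted(rng.sample(range(n_cases), k))


rng = random.Random(42)  # fixed seed so the sketch is repeatable
n_cases = 10

# One subsample per ensemble member; each would train its own network.
bootstrap_samples = [bootstrap_indices(n_cases, rng) for _ in range(5)]
random_samples = [random_subsample_indices(n_cases, 0.8, rng) for _ in range(5)]

for i, sample in enumerate(bootstrap_samples, start=1):
    print(f"bootstrap member {i}: {sample}")
```

Each ensemble member sees a different view of the data, which is what makes the combined predictions of the ensemble differ from those of any single network.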

Note: When the Results dialog box is displayed, you can switch between the ANS, CNN, and Subsampling strategies. Therefore, all sampling/subsampling specifications should be configured at this stage of your analysis. Use the Sampling (CNN and ANS) tab to configure the train, test, and validation samples that are used for training, testing, and validating models built with ANS and CNN. Similarly, use the Subsampling tab to define the train, test, and validation samples that are used to train neural network models with the Subsampling strategy.
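The partition into train, test, and validation samples mentioned in the note can be sketched as follows. The function name split_cases and the 70/15/15 percentages are assumptions for the example only; the actual proportions are whatever you set on the sampling tabs.

```python
import random


def split_cases(n_cases, train_pct=70, test_pct=15, seed=0):
    """Randomly assign case indices to train, test, and validation samples.

    Percentages are integers; the validation sample receives the remainder.
    Hypothetical helper, not SANN's sampling routine.
    """
    if train_pct < 0 or test_pct < 0 or train_pct + test_pct > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    n_train = n_cases * train_pct // 100
    n_test = n_cases * test_pct // 100
    return {
        "train": sorted(indices[:n_train]),
        "test": sorted(indices[n_train:n_train + n_test]),
        "validation": sorted(indices[n_train + n_test:]),
    }


samples = split_cases(20)
for name, cases in samples.items():
    print(f"{name}: {len(cases)} cases")
```

Every case lands in exactly one sample, so the networks are never tested or validated on cases they were trained on.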