Rapid Deployment of Predictive Models

Ribbon bar. Select the Data Mining tab. In the Deployment group, click Rapid Deployment to display the Rapid Deployment of Predictive Models dialog box.

Classic menus. On the Data Mining menu, select Rapid Deployment of Predictive Models (PMML) to display the Rapid Deployment of Predictive Models dialog box.

This dialog box contains five tabs: Quick, Lift chart, Save results, Profit Chart/ROC Curve, and Confusion matrix. The Lift chart options are applicable only for predictive classification when observed classifications are available.

Variables. Click the Variables button to display a four-variable list selection dialog box. Select either a continuous or categorical dependent (outcome) variable, and select continuous and/or categorical predictor variables. This designation of variable lists is explained in greater detail in Getting Started with Data Mining. The Variables option is available only if the variables for the analysis are not taken directly from the input PMML files (that is, when the Variable selection via PMML check box is cleared). The variables you select here are then "mapped" to the respective models specified in the PMML files in the order in which they are selected, regardless of whether the names of the selected variables match those specified in the PMML input files. Hence, by selecting variables "manually," you can compute predictions from data even if the variable names differ from those used when the respective models were trained.

Note: Codes (classes) for categorical variables. Unlike in some other modules of STATISTICA, the Rapid Deployment of Predictive Models module will attempt to match the text labels for the codes or classes of categorical (predictor or dependent) variables to those found in the respective variables in the data file. Therefore, when using categorical predictors or dependent variables (with observed classifications), ensure that any text labels match those in the respective PMML model files, i.e., those that were used when training the respective models.
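
A quick check of the text labels before scoring can catch mismatches early. The following Python sketch is illustrative only and is not part of STATISTICA; the function name and the sample labels are hypothetical, and it assumes an exact, case-sensitive comparison, which may be stricter than the matching the program performs.

```python
def unmatched_labels(data_labels, model_labels):
    """Return the labels found in the data that the model does not know, sorted."""
    return sorted(set(data_labels) - set(model_labels))


# Hypothetical class labels used when the model was trained.
model_labels = ["Low", "Medium", "High"]
# Hypothetical labels found in the data file; "medium" differs in capitalization.
data_labels = ["Low", "medium", "High", "Low"]

print(unmatched_labels(data_labels, model_labels))  # prints ['medium']
```

An empty list indicates that every label in the data has a counterpart in the model.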

Variable selection via PMML. Select this check box to use the variables specified in the respective PMML model files. The program will then attempt to match the variable names in the current active data file with those specified in the PMML files (matching by name). Clear this check box to manually select the variables for which to compute predictions or predicted classifications. In this case, the selected variables are "mapped" to the respective models specified in the PMML files in the order in which they are selected (matching by position), regardless of whether their names match those specified in the PMML input files.
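
The two matching modes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the program's implementation: the function names, the variable names, and the error raised when names or counts do not line up are all hypothetical, since this topic does not describe how STATISTICA handles those cases.

```python
def map_by_name(model_fields, data_columns):
    """Pair each model field with the data column that has the same name."""
    missing = [f for f in model_fields if f not in data_columns]
    if missing:
        # Assumption: a missing name is treated as an error in this sketch.
        raise KeyError(f"no data column named: {missing}")
    return {f: f for f in model_fields}


def map_by_order(model_fields, selected_columns):
    """Pair model fields with the selected columns by position, ignoring names."""
    if len(model_fields) != len(selected_columns):
        # Assumption: a count mismatch is treated as an error in this sketch.
        raise ValueError("select exactly one variable per model field")
    return dict(zip(model_fields, selected_columns))


# Hypothetical field names from a PMML file and from a differently named data file.
model_fields = ["AGE", "INCOME"]
selected = ["age_years", "annual_income"]

print(map_by_order(model_fields, selected))
# prints {'AGE': 'age_years', 'INCOME': 'annual_income'}
```

With positional mapping, the order of selection carries all the meaning: selecting the same two variables in reverse order would feed income values into the AGE field.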

Load models from disk. Click this button to display the Open PMML files dialog box, where you can browse to and open the PMML input model files. You can load more than one model file; in that case, all models are applied to the current active data set.
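
Because PMML files are XML, you can inspect the variable names and class labels a model expects before loading it. The Python sketch below reads the DataDictionary element defined by the PMML standard using only the standard library. The embedded PMML fragment, its field names, and the helper names are hypothetical, and files written by a particular application may contain additional elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal PMML fragment; real model files also contain a model element.
PMML_TEXT = """<PMML version="4.4" xmlns="http://www.dmg.org/PMML-4_4">
  <DataDictionary numberOfFields="2">
    <DataField name="AGE" optype="continuous" dataType="double"/>
    <DataField name="RISK" optype="categorical" dataType="string">
      <Value value="Low"/>
      <Value value="High"/>
    </DataField>
  </DataDictionary>
</PMML>
"""


def local_name(tag):
    """Strip the XML namespace so the code works across PMML versions."""
    return tag.rsplit("}", 1)[-1]


def data_fields(pmml_text):
    """Return (name, optype, labels) for every DataField in the PMML text."""
    root = ET.fromstring(pmml_text)
    fields = []
    for elem in root.iter():
        if local_name(elem.tag) == "DataField":
            labels = [v.get("value") for v in elem if local_name(v.tag) == "Value"]
            fields.append((elem.get("name"), elem.get("optype"), labels))
    return fields


for name, optype, labels in data_fields(PMML_TEXT):
    print(name, optype, labels)
```

To inspect a file on disk, read its contents into a string and pass it to data_fields in place of PMML_TEXT.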

Load models from Enterprise. Click this button to display the Select Deployed Model dialog box, where you can browse to and open deployed models from Statistica Enterprise.

Summary. Click this button to compute predicted values or predicted classifications and other statistics for the loaded model(s) from the current active data set. When observed values for the outcome variable exist, Statistica also computes, for each prediction model, the average squared error (residual) in regression-type problems and the overall error rate in classification-type problems.
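
The two summary statistics can be illustrated with a short Python sketch. It assumes that the average squared error is the mean of the squared differences between observed and predicted values, and that the overall error rate is the proportion of misclassified cases; the exact formulas the program uses (for example, the handling of missing data) are not given in this topic, and the function names and sample values are hypothetical.

```python
def average_squared_error(observed, predicted):
    """Mean of squared residuals (observed minus predicted) over all cases."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def overall_error_rate(observed, predicted):
    """Proportion of cases whose predicted class differs from the observed class."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    return sum(o != p for o, p in zip(observed, predicted)) / len(observed)


# Regression example: residuals are 1.0 and -2.0, so the result is (1 + 4) / 2.
print(average_squared_error([3.0, 5.0], [2.0, 7.0]))  # prints 2.5

# Classification example: one of four cases is misclassified.
print(overall_error_rate(["a", "b", "a", "b"], ["a", "a", "a", "b"]))  # prints 0.25
```

When several models are loaded, computing these values once per model gives a simple basis for comparing them on the same data.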

Cancel. Click the Cancel button to close the Rapid Deployment of Predictive Models dialog box without reviewing any other results.

Options. See Options Menu for descriptions of the commands on this menu.

Model(s). Click this button to display the Model(s) dialog box, where the loaded model(s) are displayed and you can rename the model(s) if desired.

Open Data. Click the Open Data button to display the Select Data Source dialog box, which contains options to choose the spreadsheet on which to perform the analysis. The Select Data Source dialog box contains a list of the spreadsheets that are currently active.

Select Cases. Click the Select Cases button to display the Analysis/Graph Case Selection Conditions dialog box, which contains options to create conditions that determine which cases are included in (or excluded from) the current analysis. More information is available in the case selection conditions overview, syntax summary, and dialog box description.

STATISTICA Enterprise. You can create automated analyses (analysis workflows) that can be accessed by others in your enterprise (based on their roles and permissions) to automatically apply the model and compute model predictions (perform "scoring") against new data. Because analysis configurations can be run automatically using the Statistica Enterprise platform, these options enable you to create advanced automated (model-based) process monitoring and (predictive) SPC analyses, or to automatically "score" new data based on a specific data mining model.

Deploy. Click the Deploy button to associate the model (PMML) code with an (SVB) analysis configuration in Statistica Enterprise. This option is available only if Statistica Enterprise is installed and you have permission both to access the respective data configurations and to create analysis configurations.