Data Mining - Goodness of Fit, Prediction, Classification

Ribbon bar. Select the Data Mining tab. In the Deployment group, click Goodness of Fit...

Classic menus. From the Data Mining menu, select Goodness of Fit, Classification, Prediction...

...to display the Goodness of Fit, Prediction, Classification Startup Panel. The STATISTICA Goodness of Fit module computes various goodness of fit statistics for continuous and categorical response variables (that is, for regression and classification problems). The module is designed specifically for data mining applications, in particular for "competitive evaluation of models" projects, where it serves as a tool for choosing the best solution.

The program takes as input the predicted values or classifications computed by any of the STATISTICA modules for regression and classification, and computes a wide selection of fit statistics as well as graphical summaries for each fitted response or classification. Goodness of fit statistics for continuous responses include least squares deviation (LSD), average deviation, relative squared error, relative absolute error, and the correlation coefficient. For classification problems (categorical response variables), the program computes Chi-square, G-square (maximum likelihood chi-square), percent disagreement (misclassification rate), quadratic loss, and information loss statistics.
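To make some of these statistics concrete, the following Python sketch computes several of them from paired observed and predicted values. It is an illustration only and does not reproduce STATISTICA's code: the function names are hypothetical, and the formulas are common textbook definitions that are assumed here (in particular, least squares deviation is taken to be the mean of the squared deviations). The exact definitions used by the module may differ. The Chi-square, G-square, quadratic loss, and information loss statistics are not covered.

```python
import math


def regression_fit_stats(observed, predicted):
    """Fit statistics for a continuous response.

    Assumed textbook definitions, not necessarily those used by STATISTICA.
    Raises ZeroDivisionError if the observed or predicted values are constant.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    # Total squared and absolute deviations of predictions from observations
    sq_err = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    abs_err = sum(abs(o - p) for o, p in zip(observed, predicted))

    # Baseline errors from always predicting the observed mean
    sq_base = sum((o - mean_obs) ** 2 for o in observed)
    abs_base = sum(abs(o - mean_obs) for o in observed)

    # Pieces of the Pearson correlation between observed and predicted
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)

    return {
        "least_squares_deviation": sq_err / n,   # assumed: mean squared deviation
        "average_deviation": abs_err / n,        # mean absolute deviation
        "relative_squared_error": sq_err / sq_base,
        "relative_absolute_error": abs_err / abs_base,
        "correlation": cov / math.sqrt(sq_base * var_pred),
    }


def percent_disagreement(observed, predicted):
    """Misclassification rate, as a percentage, for a categorical response."""
    wrong = sum(1 for o, p in zip(observed, predicted) if o != p)
    return 100.0 * wrong / len(observed)


if __name__ == "__main__":
    print(regression_fit_stats([1, 2, 3, 4], [1, 2, 3, 5]))
    print(percent_disagreement(["a", "b", "a", "b"], ["a", "b", "b", "b"]))
```

Under these definitions, the relative squared error and relative absolute error compare a model with the trivial model that always predicts the observed mean, so values below 1 indicate an improvement over that baseline. When several models are compared, the same functions would be applied to each model's predictions for the same observed cases.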

See also, Data Mining with STATISTICA Data Miner.