Multiple Regression - Computational Approach

Select the Advanced tab of the Multiple Regression Results dialog box to access options to review more detailed results of the specified analyses. Use the options on the Residuals/assumptions/prediction tab to perform residual analysis and to make predictions for the dependent variable.

Summary: Regression results. Click the Summary: Regression results button to produce two spreadsheets. The Summary Statistics spreadsheet displays the summary statistics for the regression analysis (e.g., R, R-square, etc.). These values will be identical to the values displayed in the Summary box of the Multiple Regression Results dialog box (see Multiple Regression Results dialog for a detailed description of the statistics that are displayed).

The Regression Summary for Dependent Variable spreadsheet displays the standardized (Beta) and non-standardized (B) regression coefficients (weights), their standard errors, and statistical significance. The Beta coefficients are the coefficients you would have obtained had you first standardized all of your variables to a mean of 0 and a standard deviation of 1. Thus, the magnitude of these Beta coefficients allows you to compare the relative contribution of each independent variable in the prediction of the dependent variable. The summary statistics for the regression analysis (e.g., R, R-square, etc.) will also be displayed in the headers of this spreadsheet.
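The relationship between the B and Beta coefficients can be illustrated with a short sketch (hypothetical NumPy code with simulated data; this is not STATISTICA's own computation): refitting the model on z-scored variables yields the Beta coefficients, which equal the raw B coefficients rescaled by the ratio of standard deviations.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 2)) @ np.array([[1.0, 0.4], [0.0, 1.0]])  # correlated predictors
y = 2.0 + 1.5 * X[:, 0] - 0.7 * X[:, 1] + rng.normal(size=n)

# Raw (B) coefficients from a model with an intercept
A = np.column_stack([np.ones(n), X])
b = np.linalg.lstsq(A, y, rcond=None)[0][1:]

# Beta coefficients: refit after z-scoring every variable ...
Z = (X - X.mean(0)) / X.std(0)
zy = (y - y.mean()) / y.std()
beta = np.linalg.lstsq(np.column_stack([np.ones(n), Z]), zy, rcond=None)[0][1:]

# ... which is equivalent to rescaling each B by sd(x_j) / sd(y)
assert np.allclose(beta, b * X.std(0) / y.std())
```

Because the Beta coefficients share a common (standardized) scale, their magnitudes are directly comparable across predictors, while the B coefficients are not.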

ANOVA (Overall goodness of fit). Click the ANOVA (Overall goodness of fit) button to produce a spreadsheet with a complete Analysis of Variance table for the current regression equation (for more information, see ANOVA).

Covariance of coefficients. Click the Covariance of coefficients button to produce 1) a spreadsheet with the correlations of the regression coefficients and 2) a spreadsheet with the variances and covariances of the regression coefficients.

Current sweep matrix. Click the Current sweep matrix button to display a spreadsheet with the current sweep matrix. Matrix inversion in multiple regression is accomplished via sweeping. The sweep matrix of all independent variables that are currently in the regression equation is also -1 times the inverse of the correlation matrix of those variables (that is, the sign of each element in the matrix is reversed). The diagonal elements for those variables that are not in the equation can be interpreted as the (1 - R-square) values, treating the respective variable as the dependent variable, and using all current independent variables.
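The sweep operator itself is simple to state. The following sketch (a hypothetical NumPy implementation; the function name and sign convention are assumptions, not STATISTICA internals) sweeps a correlation matrix one pivot at a time and checks the two properties described above: sweeping every variable gives -1 times the inverse correlation matrix, and the diagonal element for a variable not yet swept equals 1 - R-square for that variable regressed on the swept variables.

```python
import numpy as np

def sweep(A, k):
    """Sweep symmetric matrix A on pivot k; with this sign convention,
    sweeping every pivot yields -1 times the inverse of A."""
    A = A.copy()
    d = A[k, k]
    row = A[k, :].copy()
    col = A[:, k].copy()
    A -= np.outer(col, row) / d   # update the non-pivot block
    A[k, :] = row / d             # rescale the pivot row and column
    A[:, k] = col / d
    A[k, k] = -1.0 / d
    return A

# A small correlation matrix for three candidate predictors
R = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.4],
              [0.3, 0.4, 1.0]])

# Sweeping all variables reproduces -1 times the inverse correlation matrix
full = sweep(sweep(sweep(R, 0), 1), 2)
assert np.allclose(full, -np.linalg.inv(R))

# With only variables 0 and 1 "in the equation", the diagonal entry for
# variable 2 equals 1 - R-square of variable 2 regressed on variables 0 and 1
partial = sweep(sweep(R, 0), 1)
r = R[:2, 2]
r2 = r @ np.linalg.inv(R[:2, :2]) @ r
assert np.isclose(partial[2, 2], 1.0 - r2)
```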

Note: variance inflation factor. The diagonal elements of the inverse correlation matrix (i.e., -1 times the diagonal elements of the sweep matrix displayed via this option) for variables that are in the equation are also sometimes called variance inflation factors (VIF; e.g., see Neter, Wasserman, and Kutner, 1985). This terminology denotes the fact that the variances of the standardized regression coefficients can be computed as the product of the residual variance (for the correlation transformed model) times the respective diagonal elements of the inverse correlation matrix. If the predictor variables are uncorrelated, then the diagonal elements of the inverse correlation matrix are equal to 1.0; thus, for correlated predictors, these elements represent an "inflation factor" for the variance of the regression coefficients, due to the redundancy of the predictors.
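Equivalently, VIF for predictor j equals 1 / (1 - R-square), where the R-square comes from regressing predictor j on all remaining predictors. A brief sketch (hypothetical NumPy code with made-up data; not STATISTICA's routine) verifies that the diagonal of the inverse correlation matrix matches this definition:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)   # deliberately correlated with x1
x3 = rng.normal(size=n)                    # roughly independent predictor
X = np.column_stack([x1, x2, x3])

# VIFs as the diagonal of the inverse correlation matrix of the predictors
R = np.corrcoef(X, rowvar=False)
vif = np.diag(np.linalg.inv(R))

# Cross-check: VIF_j = 1 / (1 - R^2_j), regressing predictor j on the others
for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(n), others])
    beta = np.linalg.lstsq(A, X[:, j], rcond=None)[0]
    resid = X[:, j] - A @ beta
    r2 = 1 - resid.var() / X[:, j].var()
    assert np.isclose(vif[j], 1 / (1 - r2))
```

With this data, the VIFs for the two correlated predictors exceed 1, while the independent predictor's VIF stays near 1.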

Partial correlations. Click the Partial correlations button to display spreadsheets with:

The Beta in (the standardized regression coefficient for the respective variable if it were to enter the regression equation as an independent variable);

The partial correlation (between the respective variable and the dependent variable, after controlling for all other independent variables in the equation);

The semi-partial (part) correlation (the correlation between the unadjusted dependent variable and the respective variable after controlling for all independent variables in the equation; matrices of partial and semi-partial (or part) correlations can be computed in the General Linear Model (GLM) and General Regression Models (GRM) modules);

The tolerance for the respective variable (defined as 1 minus the squared multiple correlation between the respective variable and all independent variables in the regression equation);

The R-square (between the current variable and all other variables in the regression equation);

The t-value associated with these statistics for the respective variable; and

The statistical significance of the t-value.

These statistics are displayed separately, first for the variables not currently in the regression equation, and then for the variables in the regression equation (if any).
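The partial correlation, semi-partial (part) correlation, and tolerance listed above can all be expressed through residuals. A minimal sketch (hypothetical NumPy code with simulated data; variable names are assumptions): the partial correlation correlates the residuals of both the predictor and the dependent variable after removing the other predictors, while the semi-partial correlation residualizes only the predictor.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
X = rng.normal(size=(n, 3))
X[:, 1] += 0.5 * X[:, 0]                     # induce some predictor overlap
y = X @ np.array([1.0, -0.5, 0.3]) + rng.normal(size=n)

def resid(target, preds):
    """Residuals of target regressed (with intercept) on preds."""
    A = np.column_stack([np.ones(len(target)), preds])
    return target - A @ np.linalg.lstsq(A, target, rcond=None)[0]

j, others = 0, X[:, 1:]
e_x = resid(X[:, j], others)                 # x_j purged of the other predictors
e_y = resid(y, others)                       # y purged of the other predictors

partial = np.corrcoef(e_x, e_y)[0, 1]        # both sides residualized
semipartial = np.corrcoef(e_x, y)[0, 1]      # unadjusted y vs. residualized x_j
tolerance = e_x.var() / X[:, j].var()        # = 1 - R^2 of x_j on the others
```

By construction the semi-partial correlation can never exceed the partial correlation in magnitude, and the tolerance always falls between 0 and 1.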

Redundancy. Click the Redundancy button to display a spreadsheet with various indicators of the redundancy of independent variables (currently included or not included in the equation). Specifically, for each variable, the spreadsheet will show 1) the tolerance (defined as 1 - R-square for the respective variable with all other variables currently in the equation), 2) the R-square (between the current variable and all other variables in the regression equation), 3) the partial correlation (between the respective variable and the dependent variable, after controlling for all other independent variables in the equation), and 4) the semi-partial (part) correlation (the correlation between the unadjusted dependent variable and the respective variable after controlling for all independent variables in the equation).

Stepwise regression summary. Click the Stepwise regression summary button to display a spreadsheet containing the summary of the stepwise regression. Note that this option is available only if 1) you selected Forward stepwise or Backward stepwise regression on the Model Definition - Quick tab, or 2) the previous regression analysis used the same dependent variable and its independent variable list was a subset of the current independent variable list (or vice versa), as specified on the Model Definition dialog box. In the latter case, you can evaluate the change in R-square caused by removing or entering several variables in a single step (hierarchical analysis).

For example, if in the first analysis, variables 1 through 5 are selected as independent variables, and in the subsequent analysis with the same dependent variable you specify variables 1 through 3 as the independent variables, then choosing this option will display a spreadsheet with R-square, the R-square decrease, F to remove, and the number of variables removed in the single step (two in this example). If Forward stepwise or Backward stepwise regression is specified, the spreadsheet will contain the R-square increment (or decrease) at each step. Note that you must specify variables using the Model Definition dialog box; this option is not available if you only select variables on the Multiple Linear Regression Startup Panel.
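The quantities in that spreadsheet can be reproduced by fitting the two nested models directly. A sketch (hypothetical NumPy code with simulated data; the variable names are assumptions) computes the R-square decrease and the block F statistic for removing two variables in a single step:

```python
import numpy as np

def r2(X, y):
    """R-square of y regressed (with intercept) on the columns of X."""
    A = np.column_stack([np.ones(len(y)), X])
    resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return 1 - resid.var() / y.var()

rng = np.random.default_rng(3)
n = 100
X = rng.normal(size=(n, 5))
y = X[:, :3] @ np.array([1.0, 0.5, -0.8]) + rng.normal(size=n)

r2_full = r2(X, y)             # model with variables 1 through 5
r2_reduced = r2(X[:, :3], y)   # model with variables 1 through 3 only

q = 2                          # number of variables removed in the block
df_resid = n - 5 - 1           # residual df of the full model
F_to_remove = ((r2_full - r2_reduced) / q) / ((1 - r2_full) / df_resid)
```

A large F-to-remove indicates that the removed block of variables contributed meaningfully to the fit; here the R-square decrease is small because variables 4 and 5 carry no signal in the simulated data.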

ANOVA adjusted for mean. Click the ANOVA adjusted for mean button if you want STATISTICA to compute the ANOVA table, including the sums of squares and R-square value, based on the proportion of variability around the mean for the dependent variable, explained by the predictor variables. This option is available only if the intercept is Set to zero for the current regression model. In that case, you can compute the multiple R-square value either based on the variability around the origin (zero), or based on the variability around the mean. The default R-square value reported in the summary box pertains to the former, that is, it is the proportion of variability of the dependent variable around 0 (zero) that is accounted for by the predictor variables. If you click this button, however, the R-square value is based on the variability around the mean. For various alternative ways of computing the R-square value, refer to Kvalseth (1985).
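The two R-square definitions differ only in the total sum of squares used as the denominator. A minimal sketch (hypothetical NumPy code with simulated data; not STATISTICA's computation) fits a regression through the origin and computes both versions:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 150
X = rng.normal(loc=1.0, size=(n, 2))          # predictors with nonzero means
y = X @ np.array([0.9, 0.4]) + rng.normal(size=n)

# No-intercept (through-the-origin) fit
b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b
sse = resid @ resid

# Default: proportion of variability around 0 accounted for
r2_origin = 1 - sse / (y @ y)

# "ANOVA adjusted for mean": proportion of variability around the mean
dy = y - y.mean()
r2_mean = 1 - sse / (dy @ dy)
```

Because the total sum of squares around the mean is never larger than the total sum of squares around zero, the mean-adjusted R-square is always the smaller of the two, and it can even be negative when the origin-constrained model fits worse than simply predicting the mean of the dependent variable.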