# Residual Analysis - Scatterplots Tab

Multiple Regression - Computational Approach

Select the Scatterplots tab of the Residual Analysis dialog to access options to review standardized scatterplots of predicted and residual values. The number of cases shown in any residual plot or spreadsheet is either unlimited or capped at a user-specified value, as set in the Maximum number of rows (cases) in a single results Spreadsheet or Graph box on the Residual Analysis dialog - Advanced tab. Note that when Pairwise deletion is selected in the MD deletion group box in the Startup Panel, the program will replace any missing values with the respective variable means when computing predicted and residual values (and related statistics).
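The mean substitution described above can be illustrated with a minimal Python sketch. This is not the program's own code: the function name `predict_with_mean_substitution`, the coefficient values, and the choice to compute each mean from the non-missing values of that variable are all assumptions made for illustration.

```python
import numpy as np

def predict_with_mean_substitution(X, b0, b):
    """Predicted values with missing predictor values (NaN) replaced by column means.

    Hypothetical helper: X is an (n_cases, n_predictors) array, b0 the intercept,
    b the vector of regression coefficients.
    """
    X = np.asarray(X, dtype=float)
    # Assumption: each mean is taken over the non-missing values of that variable.
    col_means = np.nanmean(X, axis=0)
    X_filled = np.where(np.isnan(X), col_means, X)
    return b0 + X_filled @ b

# Made-up data: the second case is missing its value on the second predictor.
X = np.array([[1.0, 10.0],
              [2.0, np.nan],
              [3.0, 30.0]])
pred = predict_with_mean_substitution(X, b0=1.0, b=np.array([2.0, 0.1]))
```

In this sketch the missing value is replaced by 20.0 (the mean of 10.0 and 30.0) before the predicted value for the second case is computed.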

Predicted vs. residuals. Click the Predicted vs. residuals button to display a scatterplot of the raw predicted values (on the x-axis) versus the raw residuals (on the y-axis). This plot is useful for checking the assumption that the relationship between the independent variables and the dependent variable is linear. Specifically, if the relationship is linear, then the residuals can be expected to form a homogeneous "cloud" around the center line. However, if non-linearity is present, then systematic patterns may emerge. For example, if the true relationship between variables is curvilinear rather than linear, the residuals may form an inverted U around the center line, indicating that predictions are consistently too high on the extreme ends of the scale, but too low in the central region of the scale. In that case, one may try to include second or third order polynomial transformations of the original independent variables in the regression (that is, X², X³, etc.). This can be accomplished via the Fixed Nonlinear Regression module.
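The inverted-U pattern can be reproduced with a small Python sketch using NumPy. The data are made up for illustration (a linear trend plus a negative squared term), and the variable names are hypothetical; the sketch only shows the general idea, not what the program computes internally.

```python
import numpy as np

# Made-up data: the true relationship contains a squared term.
x = np.linspace(-3.0, 3.0, 61)
y = 10.0 + 2.0 * x - x**2

# Straight-line fit: y ~ b0 + b1*x.
X_lin = np.column_stack([np.ones_like(x), x])
b_lin, *_ = np.linalg.lstsq(X_lin, y, rcond=None)
pred_lin = X_lin @ b_lin
resid_lin = y - pred_lin          # raw residual = observed - predicted

# Plotting pred_lin against resid_lin would show an inverted U:
# positive residuals in the central region, negative ones at the extremes.
center = np.abs(x) < 1.0
ends = np.abs(x) > 2.5

# Adding the second-order term X² to the regression removes the pattern.
X_quad = np.column_stack([np.ones_like(x), x, x**2])
b_quad, *_ = np.linalg.lstsq(X_quad, y, rcond=None)
resid_quad = y - X_quad @ b_quad
```

Negative residuals at the extremes mean that the observed values fall below the predictions there, that is, the predictions are too high at the ends and too low in the center.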

Predicted vs. squared residuals. Click the Predicted vs. squared residuals button to display a scatterplot of the raw predicted values (on the x-axis) versus the squared raw residuals (on the y-axis).

Predicted vs. observed. Click the Predicted vs. observed button to display a scatterplot of the predicted versus the observed values. This plot is particularly useful for identifying potential clusters of cases that are not well predicted. For example, suppose one were to analyze job performance data using some employment test as the predictor variable. It could happen that the test predicts success well for the majority of employees, but that there is a distinct group of employees for whom the prediction based on the employment test is consistently too low (a common problem in tests that rely heavily on verbal skills). Those employees would be unfairly discriminated against if the test were used as the sole criterion for selection.

Observed vs. residuals. Click the Observed vs. residuals button to display a scatterplot of the observed versus the residual values. This plot is useful for detecting outliers or groups of observations that are consistently over- or under-predicted.

Observed vs. squared residuals. Click the Observed vs. squared residuals button to display a scatterplot of the observed values versus the squared residuals. This plot is also useful for detecting outliers or groups of observations that are poorly predicted. Because squaring discards the sign of the residual, the plot shows the size of the prediction error but not whether a case is over- or under-predicted.

Residuals vs. deleted residuals. Click the Residuals vs. deleted residuals button to display a scatterplot of the residuals versus the deleted residuals. Deleted residuals are the residuals that one would obtain if the respective case were excluded from the estimation of the multiple regression (i.e., from the computation of the regression coefficients). Thus, if there are large discrepancies between the deleted residuals and the regular residuals, then we can conclude that the regression coefficients are not very stable, that is, they are greatly affected by the exclusion of single cases (see also Residuals and Predicted Values).
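As a rough illustration of what a deleted residual is, the Python sketch below uses the standard least-squares identity that the deleted residual equals the ordinary residual divided by (1 - h), where h is the leverage of the case. This avoids refitting the model once per case. The function name `deleted_residuals` and the data are hypothetical, and the sketch is not meant to describe how the program itself computes these values.

```python
import numpy as np

def deleted_residuals(X, y):
    """Return (raw residuals, deleted residuals) for a least-squares fit of y on X.

    X must already contain a column of ones for the intercept.
    """
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    # Leverage of each case: diagonal of the hat matrix X (X'X)^-1 X'.
    h = np.einsum("ij,ji->i", X, np.linalg.pinv(X))
    return resid, resid / (1.0 - h)

# Made-up data: the last case lies far from the others on x and off their trend.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 20.0])
X = np.column_stack([np.ones_like(x), x])

resid, deleted = deleted_residuals(X, y)
# Case with the largest gap between its regular and its deleted residual.
most_influential = int(np.argmax(np.abs(deleted - resid)))
```

In a plot of `resid` against `deleted`, cases with little influence fall close to the identity line, while a case such as the last one here stands well apart from it.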

Bivariate correlations. Click the Bivariate correlations button to display a scatterplot of any two variables in the data file, regardless of whether or not they are included in the regression analysis. When you click this button, the standard variable selection dialog will first be displayed, in which you select the two variables to plot.

Partial residual plot (T). Click the Partial residual plot (T) button to display a partial residual plot for any variable currently in the regression equation. When you click this button, the standard variable selection dialog will first be displayed, in which you select the variable to plot. In a partial residual plot, the residual plus the contribution of the respective independent variable to the regression (i.e., residual + b_i*x_i, where b_i is the regression coefficient for the independent variable x_i) is plotted against the values for that independent variable. This plot is discussed in detail by Larsen and McCleary (1972).
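The quantity plotted on the vertical axis of a partial residual plot can be sketched in Python as follows. The simulated data, the coefficient values used to generate them, and the helper name `partial_residuals` are assumptions made for illustration only.

```python
import numpy as np

def partial_residuals(resid, b_i, x_i):
    """Residual plus the contribution of one independent variable: resid + b_i * x_i."""
    return resid + b_i * x_i

# Simulated data with two independent variables (values are arbitrary).
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.3, size=n)

# Multiple regression of y on x1 and x2 (with intercept).
X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b

# Partial residuals for x1; these would be plotted against x1.
pr1 = partial_residuals(resid, b[1], x1)

# A straight line fitted through the plot has slope equal to the coefficient of x1.
slope = np.polyfit(x1, pr1, 1)[0]
```

Because the least-squares residuals are uncorrelated with each independent variable, the line fitted through the partial residual plot reproduces the regression coefficient, so curvature in the plot points to a non-linear effect of that particular variable.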