Predicting Recovery from Injury. This example is based on a data set reported in Neter, Wasserman, and Kutner (1985, page 649). Suppose a hospital administrator wanted to explore the relationship between the chances for long-term recovery of severely injured patients and the number of days spent in the hospital. The data file Patients.sta contains data for 15 patients; specifically, the file contains information on the number of days that each patient was hospitalized (in the variable Days) and an index of the prognosis for long-term recovery for each patient (in the variable Prognosis; larger values reflect a better prognosis).

Open the Patients.sta data file:

Ribbon bar. Select the Home tab. In the File group, click the Open arrow and from the menu, select Open Examples. The Open a STATISTICA Data File dialog box is displayed. Patients.sta is located in the Datasets folder.

Classic menus. From the File menu, select Open Examples to display the Open a STATISTICA Data File dialog box; Patients.sta is located in the Datasets folder.

Specifying the analysis. Start the Fixed Nonlinear Regression module:

Ribbon bar. Select the Statistics tab. In the Advanced/Multivariate group, click Advanced Models and from the menu, select Fixed Nonlinear Regression to display the Fixed Nonlinear Regression Startup Panel.

Classic menus. From the Statistics - Advanced Linear/Nonlinear Models submenu, select Fixed Nonlinear Regression to display the Fixed Nonlinear Regression Startup Panel.

Click the Variables button to display a standard variable selection dialog box. Select both the DAYS and PROGNOSIS variables for use in the model, and then click the OK button.

Note that at this point in the analysis it is not necessary to specify which variables will be the dependent or independent variables in the model.

Select the Review descriptive statistics, correlation matrix check box. Selecting this option will provide opportunities to review statistics and correlations later in the analysis.

Now, click the OK button to display the Non-linear Components Regression dialog box. You can select up to 10 transformations to be applied to each of the designated variables. Note that a transformation can be applied to a case only if the data value lies within the range that is valid for that transformation; invalid cases are eliminated from the analysis. When you click the OK button in this dialog box, an additional variable is created in memory for each combination of variable and transformation. For this example, select the X**2, X**3, and LN(X) check boxes.
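Note that the effect of these three transformations, including the elimination of invalid cases, can be sketched outside of STATISTICA. The following Python fragment is only an illustration: the function name transform_cases and the input values are hypothetical, and it does not reproduce how STATISTICA creates the variables internally.

```python
import math

def transform_cases(values):
    """Return (x, x**2, x**3, ln(x)) for each valid case.

    LN(X) is defined only for x > 0, so a case with a non-positive
    value is skipped entirely (it is not kept with a missing entry).
    """
    rows = []
    for x in values:
        if x <= 0:
            continue  # outside the valid range for LN(X)
        rows.append((x, x ** 2, x ** 3, math.log(x)))
    return rows

# Hypothetical values, not taken from Patients.sta
rows = transform_cases([2.0, 0.0, 10.0, -1.0])
for row in rows:
    print(row)
```

In this sketch, the cases with the values 0.0 and -1.0 are dropped because the logarithm is undefined for them, so only two transformed cases remain.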

Now, click the OK button to display the Review Descriptive Statistics dialog box. The summary box at the top of the dialog box indicates that the specified transformations were successfully applied to all cases in the data set.

Reviewing the transformed variables. On the Quick tab, click the Correlations button to produce a spreadsheet of correlations between all combinations of the original variables and their respective transformations. In this spreadsheet, note that the correlation between DAYS (V1) and PROGNOSIS (V2) is strongest (r = -0.977) when PROGNOSIS is logarithmically transformed (LN-V2).
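Note that a logarithmic transformation strengthens the correlation whenever the dependent variable decays roughly exponentially with the predictor. The following Python sketch illustrates this with hypothetical data that follow an exact exponential decay; the values are assumptions chosen for illustration and are not the contents of Patients.sta.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data following an exact exponential decay
days = [2, 10, 20, 30, 40, 50, 60]
prognosis = [55.0 * math.exp(-0.04 * d) for d in days]

r_raw = pearson(days, prognosis)
r_log = pearson(days, [math.log(p) for p in prognosis])

print(f"r(DAYS, PROGNOSIS)     = {r_raw:.3f}")
print(f"r(DAYS, LN(PROGNOSIS)) = {r_log:.3f}")
```

Because LN(PROGNOSIS) is exactly linear in DAYS for these artificial data, the second correlation is -1 up to rounding error, while the first is weaker in magnitude.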

Performing the analysis. Click the OK button in the Review Descriptive Statistics dialog box to proceed with the analysis.

On the Model Definition dialog box - Quick tab, click the Variables button to display a standard variable selection dialog box. Select LN-V2 from the Dependent variables list and variable DAYS from the Independent variables list, and then click the OK button.

Now, click the OK button in the Model Definition dialog box to calculate the model and display the Multiple Regression Results dialog box.

The model fits the data very well: roughly 95% of the variability in LN(PROGNOSIS) is explained by the model (see the adjusted R2 value in the summary box).
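Note that the figure of roughly 95% can be checked from the correlation reported earlier. With a single predictor, R2 is the square of the correlation between the predictor and the dependent variable, and the usual adjustment for n cases and p predictors is 1 - (1 - R2)(n - 1)/(n - p - 1). The following Python sketch applies this to r = -0.977 and n = 15; because r is rounded to three decimals, the result is approximate and may differ slightly from the value shown in the summary box.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n cases and p predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

r = -0.977   # correlation between DAYS and LN-V2, as reported (rounded)
n = 15       # number of patients in the data file
p = 1        # one predictor: DAYS

r2 = r ** 2
adj = adjusted_r2(r2, n, p)

print(f"R2          = {r2:.4f}")
print(f"adjusted R2 = {adj:.4f}")
```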

Now, click the Summary: Regression results button to display a spreadsheet of model parameters and their associated statistics.

Using the B values for Intercept and DAYS from the spreadsheet, the model can be expressed as:

PROGNOSIS = exp(4.037159 - 0.037974*DAYS)
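Note that this equation can be evaluated directly to obtain a predicted prognosis index for a given length of stay. The following Python sketch uses the two B values quoted above; the function name predict_prognosis and the chosen values of DAYS are hypothetical and serve only as an illustration.

```python
import math

B_INTERCEPT = 4.037159   # B value for Intercept
B_DAYS = -0.037974       # B value for DAYS

def predict_prognosis(days):
    """Predicted prognosis index for a stay of the given number of days."""
    return math.exp(B_INTERCEPT + B_DAYS * days)

# Illustrative lengths of stay (assumed values)
for d in (0, 10, 30, 60):
    print(d, round(predict_prognosis(d), 2))
```

Because the model is linear in LN(PROGNOSIS), each additional day multiplies the predicted prognosis by exp(-0.037974), which is approximately 0.963, that is, a decrease of about 3.7% per day.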

Reviewing the residual statistics. In the Multiple Regression Results dialog box, select the Residuals/assumptions/prediction tab. Click the Perform residual analysis button to display the Residual Analysis dialog box.

Select the Residuals tab. Under Type of residual, select the Raw residuals option button, and click the Histogram of residuals button. The resulting plot shows that the residuals, though few in number, are approximately normally distributed.

Note that you can use the Interactive Graphics Controls at the bottom of the graph window to adjust the transparency of the plot areas.

Finally, in the Residual Analysis dialog box on the Scatterplots tab, click the Predicted vs. observed button to produce a scatterplot of the predicted and observed values of the dependent variable.

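The resulting plot shows that the predictions are good in a general sense, especially for higher LN(PROGNOSIS) values. Because the coefficient for DAYS is negative, lower LN(PROGNOSIS) values correspond to longer hospital stays, so the prognosis is predicted less accurately for patients whose hospital stay was longer.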
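Note that the quantities behind the residual histogram and the predicted vs. observed plot are simple to compute: the predicted value is the fitted line evaluated at each case, and the raw residual is the observed value minus the predicted value, both on the LN(PROGNOSIS) scale. The following Python sketch fits the line by ordinary least squares; the data values and the function name fit_line are hypothetical and are not taken from Patients.sta, so the printed numbers will not match the STATISTICA output.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = b0 + b1*x; returns (b0, b1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          / sum((x - mx) ** 2 for x in xs))
    return my - b1 * mx, b1

# Hypothetical data for illustration only
days = [2, 10, 20, 30, 40, 50, 60]
prognosis = [52, 38, 26, 17, 13, 7, 6]

ln_observed = [math.log(p) for p in prognosis]
b0, b1 = fit_line(days, ln_observed)

ln_predicted = [b0 + b1 * d for d in days]
residuals = [o - p for o, p in zip(ln_observed, ln_predicted)]

for d, o, p, e in zip(days, ln_observed, ln_predicted, residuals):
    print(f"DAYS={d:2d}  observed={o:.3f}  predicted={p:.3f}  residual={e:+.3f}")
```

Plotting ln_predicted against ln_observed gives the kind of scatterplot described above, and a histogram of residuals gives the corresponding residual plot.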

See also, Fixed Nonlinear Regression - Index.