Noncentrality-Based Indices of Fit - General Theoretical Orientation

When attempting to assess how well a model fits a particular data set, one must realize at the outset that the classic hypothesis-testing approach is inappropriate. Consider common factor analysis. When maximum likelihood estimation became a practical reality, the Chi-square "goodness-of-fit" statistic was originally employed in a sequential testing strategy. According to this strategy, one first picked a small number of factors and tested the null hypothesis that this factor model fit the population covariance matrix Σ perfectly. If this hypothesis was rejected, the model was assumed to be too simple (i.e., to have too few common factors) to fit the data. The number of common factors was then increased by one, and the preceding procedure repeated. The sequence continued until the hypothesis test failed to reject the hypothesis of perfect fit.
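The sequential strategy can be sketched in code. The discrepancy values and sample size below are purely hypothetical illustrations, and the degrees-of-freedom formula is the standard one for an exploratory factor model with p observed variables and k factors; in practice the discrepancy values would come from a maximum likelihood factor-analysis routine.

```python
from scipy.stats import chi2

def sequential_factor_test(fml_by_k, n, p, alpha=0.05):
    """Sketch of the classic sequential testing strategy.

    fml_by_k: dict mapping number of factors k to the ML discrepancy
    F_ml from fitting a k-factor model (hypothetical values here).
    """
    for k in sorted(fml_by_k):
        df = ((p - k) ** 2 - (p + k)) // 2      # df of the k-factor model
        t = (n - 1) * fml_by_k[k]               # Chi-square statistic
        p_value = chi2.sf(t, df)
        if p_value > alpha:                     # fail to reject: stop here
            return k, p_value
    return None, None                           # no model "accepted"

# Hypothetical discrepancies for p = 10 variables, n = 200:
k_selected, p_val = sequential_factor_test({1: 0.60, 2: 0.25, 3: 0.08},
                                           n=200, p=10)
```

The loop stops at the first model the Chi-square test fails to reject, which is exactly the decision rule criticized in the following paragraphs.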

Steiger and Lind (1980) pointed out that this logic was essentially flawed because, for any population covariance matrix Σ (other than one constructed as a numerical example directly from the common factor model!), the a priori probability is essentially 1 that the common factor model will not fit perfectly so long as the degrees of freedom for the Chi-square statistic are positive.

In essence, then, population fit for a covariance structure model with positive degrees of freedom is never really perfect. Testing whether it is perfect makes little sense. It is what statisticians sometimes call an "accept-support" hypothesis test, because accepting the null hypothesis supports what is generally the experimenter's point of view, i.e., that the model does fit.

Accept-support hypothesis tests are subject to a host of problems. In particular, of course, the traditional priorities between Type I and Type II error are reversed. If the proponent of a model simply performs the Chi-square test with low enough power, the model can be supported. As a natural consequence of this, hypothesis-testing approaches to the assessment of model fit should make some attempt at power evaluation. Steiger and Lind (1980) demonstrated that the performance of statistical tests in common factor analysis could be predicted from a noncentral Chi-square approximation. A number of papers dealing with the theory and practice of power evaluation in covariance structure analysis have been published (Matsueda & Bielby, 1986; Satorra & Saris, 1985; Steiger, Shapiro, & Browne, 1985). Unfortunately, power estimation in the analysis of a multivariate model is a difficult, somewhat arbitrary procedure, and such power estimates have not, in general, been reported in published studies.
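The noncentral Chi-square approximation makes a basic power calculation straightforward to sketch. Under an assumed population discrepancy F0, the test statistic is approximately noncentral Chi-square with noncentrality λ = (n − 1)F0; the values of F0, df, and n below are hypothetical illustrations, not recommendations.

```python
from scipy.stats import chi2, ncx2

def power_of_fit_test(f0, df, n, alpha=0.05):
    """Approximate power of the Chi-square test of perfect fit, assuming
    the statistic is noncentral Chi-square with noncentrality (n-1)*F0."""
    lam = (n - 1) * f0                      # population noncentrality
    crit = chi2.ppf(1 - alpha, df)          # rejection threshold under H0
    return ncx2.sf(crit, df, lam)           # P(reject | misfit F0)

# Hypothetical example: the same modest misfit at two sample sizes
low_n_power = power_of_fit_test(f0=0.05, df=18, n=100)
high_n_power = power_of_fit_test(f0=0.05, df=18, n=1000)
```

Because the noncentrality grows with n, the same population misfit that is nearly undetectable in a small sample is detected almost surely in a large one, which is precisely why a low-power test can "support" a model.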

The main reason for evaluating power is to gain some understanding of precision of estimation in a particular situation, to guard against the possibility that a model is "accepted" simply because of insufficient power. An alternative (and actually more direct) approach to the evaluation of precision is to construct a confidence interval on the population noncentrality parameter (or some particularly useful function of it). This approach, first suggested in the context of covariance structure analysis by Steiger and Lind (1980), offers two worthwhile pieces of information at the same time. It allows one, for a particular model and data set, to express (1) how bad the fit is in the population, and (2) how precisely the population badness-of-fit has been determined from the sample data.
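Such a confidence interval can be computed by inverting the noncentral Chi-square distribution: the observed statistic T is treated as a fixed quantile, and one solves for the noncentrality λ at each confidence limit. The sketch below uses hypothetical values of T, df, and n, and assumes scipy is available; the final line converts the limits to the root-mean-square metric of Steiger and Lind, one "particularly useful function" of the noncentrality parameter.

```python
from math import sqrt
from scipy.stats import chi2, ncx2
from scipy.optimize import brentq

def noncentrality_interval(t, df, confidence=0.90):
    """Confidence interval on lambda, found by solving
    ncx2.cdf(t, df, lam) = 1 - alpha/2 (lower) and alpha/2 (upper)."""
    alpha = 1 - confidence

    def solve(target):
        f = lambda lam: ncx2.cdf(t, df, lam) - target
        return brentq(f, 1e-10, t + 100.0)   # cdf decreases as lam grows

    # A limit is 0 when even lam = 0 cannot push the cdf up to the target.
    lo = solve(1 - alpha / 2) if chi2.cdf(t, df) > 1 - alpha / 2 else 0.0
    hi = solve(alpha / 2) if chi2.cdf(t, df) > alpha / 2 else 0.0
    return lo, hi

# Hypothetical fit statistic T = 45 on df = 20, with n = 200:
lam_lo, lam_hi = noncentrality_interval(45.0, 20)
# One useful function of lambda: RMS error of approximation
rmsea_lo, rmsea_hi = (sqrt(lam / (20 * 199)) for lam in (lam_lo, lam_hi))
```

The interval's width conveys directly how precisely the population badness-of-fit has been pinned down by the sample, which is the second piece of information mentioned above.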