# General ANOVA/MANOVA and GLM Notes - Random Effects

Some research questions require the assessment of the significance of random effects. For example, you may want to test the hypothesis that there are significant differences in students' algebra skills across different high schools in the state (i.e., the high school factor would represent a random effect). Note that if your hypothesis were more specific (e.g., algebra skills in rural high schools are better than in urban schools) the high school factor would be a fixed effect (the terms fixed effect and random effect were first used by Eisenhart, 1947). However, the hypothesis as it stands requires that you draw a random sample of high schools from the population of high schools in the state. Then a sample of students within each selected high school would be tested. Thus, each level of the high school factor (i.e., each high school that was selected into the random sample of high schools) does not represent a distinct level of the factor, but instead represents one possible level chosen from a population of levels, and you are interested in generalizing from the chosen levels to the population of levels (i.e., the hypothesis pertains to any differences between high schools in the state).

As a rule, a factor should probably be treated as a random factor if you would not choose the same levels of the factor for a replication of the study. In the present example, if you wanted to replicate the study, you would not test students in exactly the same high schools; rather, you would draw a new sample of high schools, that is, a new sample of levels of the high school factor. Thus, the high school factor is a random effect. On the other hand, if you wanted to replicate a learning experiment where subjects had to memorize lists of either 10 words or 20 words (level 1 and level 2, respectively, of a between-groups factor) you would again expose subjects to exactly the same levels, i.e., to lists of 10 words or lists of 20 words. Thus, this factor is a fixed effect. Refer to ANOVA textbooks (e.g., Lindman, 1974; Winer, 1962, 1971) if you are not familiar with the distinction between fixed and random effects.
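The high school example can be illustrated with a minimal sketch, assuming a balanced design (the same number of students tested in every sampled school) and the classical ANOVA (method-of-moments) estimator of the between-school variance component. The function name, the simulated effect sizes, and the sample sizes are illustrative assumptions, not part of any module described here.

```python
import numpy as np

def one_way_variance_components(scores):
    """Balanced one-way random-effects ANOVA.

    scores: 2-D array, one row per sampled school, one column per student.
    """
    scores = np.asarray(scores, dtype=float)
    k, n = scores.shape                      # k schools, n students per school
    school_means = scores.mean(axis=1)
    grand_mean = scores.mean()

    # Mean square between schools and mean square within schools
    ms_between = n * np.sum((school_means - grand_mean) ** 2) / (k - 1)
    ms_within = np.sum((scores - school_means[:, None]) ** 2) / (k * (n - 1))

    # E(MS_between) = sigma2_within + n * sigma2_school, so solve for sigma2_school;
    # negative estimates are truncated at zero
    var_school = max((ms_between - ms_within) / n, 0.0)

    return {
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,   # df = (k - 1, k * (n - 1))
        "var_school": var_school,
    }

# Hypothetical data: 12 randomly sampled schools, 25 students each
rng = np.random.default_rng(0)
school_effects = rng.normal(0.0, 5.0, size=12)           # random school effects
scores = 70.0 + school_effects[:, None] + rng.normal(0.0, 10.0, size=(12, 25))
result = one_way_variance_components(scores)
print(result)
```

In a replication, a new set of school effects would be drawn; the quantity of interest is the variance of those effects in the population of schools rather than the particular schools sampled.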

Comparing Experiments. Another example of when you can specify a factor as random is in comparisons across different experiments. Sometimes, you may want to compare the results of different experiments that used identical designs. In that case, it is advisable to treat the between-groups factor "Experiment" as a random effect, because you are interested in drawing inferences about the entire "population of experiments" that could have been performed.

Specifying the Design. Note that designs with random factors can be analyzed via the General Linear Models (GLM) or Variance Components and Mixed Model ANOVA/ANCOVA modules. In order to specify a random factor, first specify all (between-groups and within-subjects) factors. Then click the Random factors button on the GLM Quick Specs Dialog - Options tab and select the factors in the design that should be treated as random factors.

Components of Variance, Denominator Synthesis. In fixed-effect designs, between effects are always tested using the mean squared residual as the error term. In mixed-model (random effects) designs, between effects are tested using relevant error terms based on the covariation of random sources of variation in the design. Specifically, this is done using Satterthwaite's method of denominator synthesis (Satterthwaite, 1946), which finds the linear combinations of sources of random variation that serve as appropriate error terms for testing the significance of the respective effect of interest. A basic discussion of these types of designs, and of methods for estimating variance components for the random effects, can also be found in the documentation of the Variance Components and Mixed Model ANOVA/ANCOVA module.
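The core arithmetic of denominator synthesis can be sketched as follows. This is a minimal illustration of the Satterthwaite approximation for the degrees of freedom of a linear combination of mean squares, not the implementation used by any particular module. The function name and the numeric mean squares are hypothetical. The example combination, MS(AB) + MS(AC) - MS(ABC), is the textbook error term for a fixed factor A crossed with two random factors B and C.

```python
def satterthwaite(coefs, mean_squares, dfs):
    """Synthesized error term and its approximate degrees of freedom.

    coefs:        coefficients of the linear combination
    mean_squares: the mean squares being combined
    dfs:          degrees of freedom of each mean square
    """
    terms = [c * ms for c, ms in zip(coefs, mean_squares)]
    synthesized_ms = sum(terms)
    # Satterthwaite (1946): df' = (sum a_i MS_i)^2 / sum((a_i MS_i)^2 / df_i)
    approx_df = synthesized_ms ** 2 / sum(t ** 2 / df for t, df in zip(terms, dfs))
    return synthesized_ms, approx_df

# Hypothetical mean squares for AB, AC, and ABC with their degrees of freedom
error_ms, error_df = satterthwaite(
    coefs=[1.0, 1.0, -1.0],
    mean_squares=[12.0, 9.0, 5.0],
    dfs=[6, 4, 12],
)
print(error_ms, error_df)
```

The effect of interest is then tested with F = MS(effect) / error_ms, using error_df (generally not an integer) as the denominator degrees of freedom. Note that a combination with negative coefficients can produce a negative synthesized mean square in small samples, in which case the approximation is unreliable.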

Overparameterized and Sigma-Restricted Models. In one line of literature, the analysis of multi-factor ANOVA designs is generally discussed in terms of the so-called Sigma-restricted model. In short, the ANOVA parameters are constrained to sum to zero; in this manner, given k levels of a factor, the k-1 parameters (corresponding to the k-1 degrees of freedom) can readily be estimated (e.g., Lindman, 1974; Snedecor and Cochran, 1989, p. 322). Another tradition discusses ANOVA in the context of the unconstrained (and, thus, over-parameterized) general linear model (e.g., Kirk, 1968). The results for mixed (random and fixed effect) models can differ between the two approaches. For this discussion, suppose you had a two-way mixed model design: Subject (random) by Treatment (fixed). It turns out that if you start with the Sigma-restricted model, the expected mean square for the random effect (i.e., Subject) does not contain the two-way interaction; however, without the Sigma restriction, it does (compare tables 4.6 and 4.7 in Searle, Casella, & McCulloch, 1992; the derivations of the expected mean squares in the two cases are also discussed in that reference).
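For a balanced Subject by Treatment design, the difference can be written out as a sketch. The notation is introduced here for illustration and is not taken from the cited tables: a subjects, b treatments, n observations per cell, with S denoting Subject and T denoting Treatment. The variance components are not defined identically under the two parameterizations, so the same symbol does not denote the same quantity in both lines.

$$
\begin{aligned}
\text{Sigma-restricted:} \quad & E(MS_{S}) = \sigma^2_{e} + b\,n\,\sigma^2_{S} \\
\text{Over-parameterized:} \quad & E(MS_{S}) = \sigma^2_{e} + n\,\sigma^2_{ST} + b\,n\,\sigma^2_{S} \\
\text{Both:} \quad & E(MS_{ST}) = \sigma^2_{e} + n\,\sigma^2_{ST}, \qquad E(MS_{e}) = \sigma^2_{e}
\end{aligned}
$$

Under the Sigma-restricted model, the Subject effect is therefore tested against the residual mean square, whereas under the over-parameterized model it is tested against the Subject by Treatment interaction mean square. The Treatment effect is tested against the interaction mean square in both cases.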

The next question, of course, is which expected mean square is right; the answer to this question is "it depends." Searle, Casella, and McCulloch (1992) give a detailed discussion of the advantages and disadvantages of the two approaches (they conclude that the question "has no definitive, universally acceptable answer," p. 126). However, Searle et al. also point out that when the parameters of the fixed effects "are being taken as realized values of random variables, it is not realistic to have them summing to zero" (Searle et al., 1992, p. 123). Moreover, for the case of unbalanced data, this restriction is usually not even considered. Therefore, most general linear model routines (including the Variance Components and Mixed Model ANOVA/ANCOVA module) that estimate expected mean squares for mixed models will usually use the solution for the over-parameterized model.
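The practical consequence for the Subject (random) by Treatment (fixed) design can be seen in a small sketch, assuming balanced data with replication in every cell. The function name, the array layout, and the simulated values are hypothetical. The code only computes the two candidate F ratios for the Subject effect and does not reproduce the output of any particular routine.

```python
import numpy as np

def two_way_mean_squares(y):
    """Mean squares for a balanced two-way design.

    y has shape (a, b, n): a subjects, b treatments, n replicates per cell.
    """
    y = np.asarray(y, dtype=float)
    a, b, n = y.shape
    grand = y.mean()
    subj = y.mean(axis=(1, 2))       # subject means
    trt = y.mean(axis=(0, 2))        # treatment means
    cell = y.mean(axis=2)            # cell means

    ss_subj = b * n * np.sum((subj - grand) ** 2)
    ss_trt = a * n * np.sum((trt - grand) ** 2)
    ss_int = n * np.sum((cell - subj[:, None] - trt[None, :] + grand) ** 2)
    ss_err = np.sum((y - cell[:, :, None]) ** 2)

    return {
        "subject": ss_subj / (a - 1),
        "treatment": ss_trt / (b - 1),
        "interaction": ss_int / ((a - 1) * (b - 1)),
        "error": ss_err / (a * b * (n - 1)),
    }

# Hypothetical data: 8 subjects, 3 treatments, 2 replicates per cell
rng = np.random.default_rng(1)
subject_effects = rng.normal(0.0, 2.0, size=(8, 1, 1))
treatment_effects = np.array([0.0, 1.0, 2.0]).reshape(1, 3, 1)
interaction_effects = rng.normal(0.0, 1.0, size=(8, 3, 1))
y = (10.0 + subject_effects + treatment_effects + interaction_effects
     + rng.normal(0.0, 1.0, size=(8, 3, 2)))

ms = two_way_mean_squares(y)

# Sigma-restricted model: Subject is tested against the residual
f_subject_restricted = ms["subject"] / ms["error"]
# Over-parameterized model: Subject is tested against the interaction
f_subject_unrestricted = ms["subject"] / ms["interaction"]
# Treatment is tested against the interaction under both models
f_treatment = ms["treatment"] / ms["interaction"]

print(f_subject_restricted, f_subject_unrestricted, f_treatment)
```

When the interaction variance is not negligible, the interaction mean square tends to exceed the residual mean square, so the over-parameterized test of the Subject effect is typically the more conservative of the two.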