Why Compare Individual Sets of Means? Usually, experimental hypotheses are stated in terms that are more specific than simply main effects or interactions. We may have the specific hypothesis that a particular textbook will improve math skills in males, but not in females, while another book would be about equally effective for both genders, but less effective overall for males. In general terms, we are predicting an interaction here: the effectiveness of the book is modified (qualified) by the student's gender. However, we also have a particular prediction concerning the nature of the interaction: we expect a significant difference between genders for one book, but not for the other. This type of specific prediction is usually tested via contrast analysis.

Contrast Analysis. Briefly, contrast analysis allows us to test the statistical significance of predicted specific differences in particular parts of our complex design. It is a major and indispensable component of the analysis of every complex ANOVA design. ANOVA/MANOVA has a uniquely flexible contrast analysis facility that allows you to specify and analyze practically any type of desired comparison (see Notes for a description of how to specify contrasts).
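To make the idea concrete, the following is a minimal sketch of a planned contrast for the textbook-by-gender example, written in Python with only the standard library. It is not the ANOVA/MANOVA facility itself. The scores are made up purely for illustration, the function name contrast_t is hypothetical, and the sketch assumes a between-groups design in which all cells share a common error variance. A contrast is a weighted sum of cell means whose weights sum to zero; its standard error is built from the pooled within-cell variance.

```python
import math
import statistics

def contrast_t(groups, weights):
    """Return (estimate, t statistic, error df) for a contrast among cell means.

    Assumes independent groups with a common error variance (pooled MSE).
    """
    if abs(sum(weights)) > 1e-12:
        raise ValueError("contrast weights must sum to zero")
    n_total = sum(len(g) for g in groups)
    df_error = n_total - len(groups)
    # Pooled within-cell variance (the error mean square of the design).
    ss_within = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    mse = ss_within / df_error
    # Contrast estimate: weighted sum of the cell means.
    estimate = sum(w * statistics.mean(g) for w, g in zip(weights, groups))
    # Standard error of the contrast.
    se = math.sqrt(mse * sum(w * w / len(g) for w, g in zip(weights, groups)))
    return estimate, estimate / se, df_error

# Hypothetical math scores (illustration only), one list per design cell.
cells = [
    [78, 82, 85, 80, 84],  # book A, males
    [70, 73, 69, 72, 71],  # book A, females
    [68, 71, 66, 70, 69],  # book B, males
    [69, 70, 67, 71, 68],  # book B, females
]

if __name__ == "__main__":
    # Predicted: a gender difference for book A ...
    print("book A, males vs. females:", contrast_t(cells, [1, -1, 0, 0]))
    # ... but no gender difference for book B.
    print("book B, males vs. females:", contrast_t(cells, [0, 0, 1, -1]))
```

Each t statistic would be compared with the critical value of Student's t on the error degrees of freedom (about 2.12 for 16 degrees of freedom at the two-sided 5% level). With these made-up numbers the first contrast is large relative to its standard error, while the second is close to zero, which is the pattern the hypothesis predicts.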

Post-hoc Comparisons. Sometimes we find effects in our experiment that were not expected. Even though in most cases a creative experimenter will be able to explain almost any pattern of means, it would not be appropriate to analyze and evaluate that pattern as if one had predicted it all along. The problem here is one of capitalizing on chance when performing multiple tests post-hoc, that is, without a priori hypotheses. To illustrate this point, let us consider the following "experiment."

Imagine we were to write down a number between 1 and 10 on each of 100 pieces of paper. We then put all of those pieces into a hat and draw 20 samples of 5 pieces each, and compute the mean of the numbers written on the pieces in each sample. How likely do you think it is that we will find two sample means that are significantly different from each other? It is very likely! Selecting the extreme means obtained from 20 samples is very different from taking only 2 samples from the hat in the first place, which is what the test via the contrast analysis implies. Without going into further detail, there are several so-called post-hoc tests that are explicitly based on the first scenario (taking the extremes from 20 samples); that is, they are based on the assumption that we have chosen for our comparison the most extreme (most different) means out of k total means in the design. Those tests apply "corrections" that are designed to offset the advantage of post-hoc selection of the most extreme comparisons. ANOVA/MANOVA offers a wide selection of those tests. Whenever you find unexpected results in an experiment, you should use those post-hoc procedures to test their statistical significance.
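The hat "experiment" is easy to simulate. The sketch below, in Python with only the standard library, repeats it many times and counts how often an ordinary two-sample t test (pooled variance, two-sided 5% level, 8 degrees of freedom, critical value 2.306) declares a difference in two situations: comparing two samples chosen in advance, and comparing the two samples with the most extreme means. The function names, the number of repetitions, and the random seed are assumptions made for illustration, and no post-hoc correction is applied here.

```python
import math
import random
import statistics

T_CRIT_DF8 = 2.306  # two-sided 5% critical value of Student's t, 8 df

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (equal group sizes)."""
    n = len(a)
    diff = statistics.mean(a) - statistics.mean(b)
    pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
    if pooled_var == 0:
        # Both samples are constant: t is zero or unbounded.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(2 * pooled_var / n)

def simulate(n_experiments=2000, seed=1):
    """Return (rate for two pre-chosen samples, rate for the two extremes)."""
    rng = random.Random(seed)
    planned_hits = 0
    extreme_hits = 0
    for _ in range(n_experiments):
        # 100 pieces of paper, each with a number between 1 and 10.
        hat = [rng.randint(1, 10) for _ in range(100)]
        rng.shuffle(hat)
        # Draw 20 samples of 5 pieces each.
        samples = [hat[i:i + 5] for i in range(0, 100, 5)]
        # Scenario 1: two samples picked before looking at the data.
        if abs(pooled_t(samples[0], samples[1])) > T_CRIT_DF8:
            planned_hits += 1
        # Scenario 2: the samples with the smallest and largest means.
        ordered = sorted(samples, key=statistics.mean)
        if abs(pooled_t(ordered[0], ordered[-1])) > T_CRIT_DF8:
            extreme_hits += 1
    return planned_hits / n_experiments, extreme_hits / n_experiments

if __name__ == "__main__":
    planned, extreme = simulate()
    print(f"two pre-chosen samples significant: {planned:.1%}")
    print(f"two most extreme samples significant: {extreme:.1%}")
```

Because every piece of paper comes from the same hat, each "significant" result is a false alarm. The rate for the pre-chosen pair should stay near the nominal 5%, while the rate for the hand-picked extremes should be far higher; that gap is what post-hoc corrections are designed to remove.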

See ANOVA/MANOVA Notes, Methods for Analysis of Variance, General Linear Models (GLM), General Regression Models (GRM), Variance Components and Mixed Model ANOVA/ANCOVA, and Experimental Design (DOE); to analyze nonlinear models, see Generalized Linear/Nonlinear Models (GLZ).

See also A-Priori Comparisons of Least Squares Means vs. Post-hoc Comparisons of Means, GLM, GRM, and ANOVA More Results - Post-hoc Tab, and GLM Hypothesis Testing - Post-hoc Comparisons for additional details.