Statistics in Crosstabulations

Crosstabulations generally allow us to identify relationships between the crosstabulated variables. The following table shows an example of a very strong relationship between two variables: Age (Adult vs. Child) and Cookie preference (A vs. B).

 

              COOKIE: A    COOKIE: B    Total
AGE: ADULT           50            0       50
AGE: CHILD            0           50       50
Total                50           50      100

All adults chose cookie A, while all children chose cookie B. In this case there is little doubt about the reliability of the finding, because it is hardly conceivable that one would obtain such a pattern of frequencies by chance alone; that is, without the existence of a "true" difference between the cookie preferences of adults and children. In real life, however, relations between variables are typically much weaker, so the question arises of how to measure those relationships and how to evaluate their reliability (statistical significance). The following review covers the most common measures of relationships between two categorical variables; that is, measures for two-way tables. The techniques used to analyze simultaneous relations between more than two variables in higher-order crosstabulations are discussed in the context of the Log-Linear Analysis module and the Correspondence Analysis module.
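As a rough illustration of what "evaluating reliability" can mean for a two-way table, the following minimal Python sketch computes the Pearson Chi-square statistic for the cookie table above. The function names (pearson_chi_square, p_value_df1) are made up for this example and do not come from any module described here. The p-value shortcut is valid only for 1 degree of freedom (a 2 x 2 table), and the sketch assumes every row and column total is nonzero. The second table, weaker_table, is hypothetical data added only for contrast with the perfect pattern.

```python
import math


def pearson_chi_square(table):
    """Return (chi-square, degrees of freedom) for a two-way frequency table.

    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def p_value_df1(chi2):
    """Upper-tail probability of a Chi-square value with 1 degree of freedom only."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Rows: AGE (ADULT, CHILD); columns: COOKIE (A, B), as in the table above.
cookie_table = [[50, 0], [0, 50]]
cookie_chi2, cookie_df = pearson_chi_square(cookie_table)
cookie_p = p_value_df1(cookie_chi2)

# Hypothetical weaker relationship, invented for illustration only.
weaker_table = [[30, 20], [20, 30]]
weaker_chi2, weaker_df = pearson_chi_square(weaker_table)
weaker_p = p_value_df1(weaker_chi2)

print("cookie table:", cookie_chi2, cookie_df, cookie_p)
print("weaker table:", weaker_chi2, weaker_df, weaker_p)
```

For the cookie table the statistic is 100 with 1 degree of freedom, and the probability of such a pattern arising by chance is vanishingly small. The hypothetical weaker table gives a statistic of 4, which corresponds to a p-value of roughly 0.05 and is far less conclusive.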

Crosstabulation tables with up to 6 variables (6-way tables) can be generated automatically. Higher-way tables of practically unlimited order can be produced using the case selection conditions option. All measures of relations between crosstabulated variables are reported for two-way tables, even if they represent only "slices" of a larger multi-way table (see the description of the Crosstabulation Tables Results dialog).
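The idea of a two-way "slice" of a larger multi-way table can be sketched in a few lines of Python. This is only an illustration of the concept, not the way the module builds its tables; the case records, the third variable (gender), and the helper two_way_slice are all hypothetical.

```python
from collections import Counter

# Hypothetical cases, each coded as (gender, age, cookie).
cases = [
    ("F", "ADULT", "A"), ("F", "ADULT", "A"), ("F", "CHILD", "B"),
    ("M", "ADULT", "A"), ("M", "CHILD", "B"), ("M", "CHILD", "B"),
]

# Three-way table of frequencies, keyed by the combination of categories.
three_way = Counter(cases)

AGES = ["ADULT", "CHILD"]
COOKIES = ["A", "B"]


def two_way_slice(counts, layer):
    """Return the AGE x COOKIE table for one category of the layer variable."""
    # Counter returns 0 for combinations that never occur.
    return [[counts[(layer, age, cookie)] for cookie in COOKIES] for age in AGES]


slice_f = two_way_slice(three_way, "F")
slice_m = two_way_slice(three_way, "M")

print("F:", slice_f)
print("M:", slice_m)
```

Each slice is an ordinary two-way table, so any two-way measure of relationship can be computed on it separately.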

See also:

Pearson Chi-square

Generalized Linear/Nonlinear Models (GLZ). An alternative way to analyze crosstabulation tables is provided in the Generalized Linear/Nonlinear Models (GLZ) module. This module implements the generalized linear model and allows you to compute standard, stepwise, or best-subset multiple regression analyses with categorical (as well as continuous) predictors and with binomial or multinomial dependent (response) variables (see Link function).