Input Formats in Correspondence Analysis - Frequencies without Grouping Variables

If the Frequencies w/out grouping vars option button is selected [from the Input group box on either the Correspondence Analysis (CA): Table Specifications Startup Panel - Correspondence Analysis (CA) tab or the Multiple Correspondence Analysis (MCA): Table Specifications Startup Panel - Multiple Correspondence Analysis (MCA) tab], STATISTICA expects that the selected variables (and cases) contain frequency values only (or some other measure of correspondence).

Selection of variables for simple correspondence analysis. If a simple correspondence analysis is specified, STATISTICA will treat each selected variable as a category or level of a categorical (column) variable, and each case as a category or level of a second categorical (row) variable. For example, the data in the example file Smoking.sta are organized in this manner:

CASE NAME        NONE  LIGHT  MEDIUM  HEAVY
SR. MANAGERS        4      2       3      2
JR. MANAGERS        4      3       7      4
SR. EMPLOYEES      25     10      12      4
JR. EMPLOYEES      18     24      33     13
SECRETARIES        10      6       7      2

Note that the column variables denote different categories for the (categorical) variable "Smoking frequency."
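As a minimal sketch of how a table in this layout can be read, the following Python fragment stores the Smoking.sta frequencies with cases as rows and variables as columns, then computes the marginal totals and row profiles (each row divided by its total). The variable names are illustrative assumptions, and this is not how STATISTICA stores or processes the data internally.

```python
# Smoking.sta frequencies as listed in the table above.
# Rows are cases (staff groups); columns are variables (smoking categories).
columns = ["NONE", "LIGHT", "MEDIUM", "HEAVY"]
table = {
    "SR. MANAGERS":  [4, 2, 3, 2],
    "JR. MANAGERS":  [4, 3, 7, 4],
    "SR. EMPLOYEES": [25, 10, 12, 4],
    "JR. EMPLOYEES": [18, 24, 33, 13],
    "SECRETARIES":   [10, 6, 7, 2],
}

# Marginal totals of the two-way frequency table.
row_totals = {name: sum(freqs) for name, freqs in table.items()}
column_totals = [sum(col) for col in zip(*table.values())]
grand_total = sum(row_totals.values())

# Row profiles: each row's frequencies divided by that row's total.
row_profiles = {
    name: [f / row_totals[name] for f in freqs]
    for name, freqs in table.items()
}

print(dict(zip(columns, column_totals)), grand_total)
```

Because every cell is a frequency, the row totals and the column totals both add up to the same grand total, which is a quick sanity check that the file was entered in the expected layout.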

Selection of variables and codes for multiple correspondence analysis. If a multiple correspondence analysis is selected, then the data in the selected variables (and cases) are expected to define a valid Burt table (see MCA Introductory Overview). For example, the following data specify a valid Burt table:

 

                     Survival    Age                Location
                     NO   YES    <50  50-69  69+    TOKYO  BOSTON  GLAMORGN
SURVIVAL:NO         210     0     68     93   49       60      82        68
SURVIVAL:YES          0   554    212    258   84      230     171       153

AGE:UNDER_50         68   212    280      0    0      151      58        71
AGE:A_50TO69         93   258      0    351    0      120     122       109
AGE:OVER_69          49    84      0      0  133       19      73        41

LOCATION:TOKYO       60   230    151    120   19      290       0         0
LOCATION:BOSTON      82   171     58    122   73        0     253         0
LOCATION:GLAMORGN    68   153     71    109   41        0       0       221

The Burt table has a clearly defined structure. Overall, the data matrix is symmetric. In the case of 3 categorical variables, the data matrix consists of 3 x 3 = 9 partitions, created by tabulating each variable against itself and against the categories of all other variables. Note that the sum of the diagonal elements in each diagonal partition (i.e., where the respective variable is tabulated against itself) is constant (equal to 764 in this case: 210 + 554 = 280 + 351 + 133 = 290 + 253 + 221).

Technically, the Burt table is the inner product of an indicator (design) matrix with itself. If the cases in that indicator matrix are assigned to categories via fuzzy coding (i.e., if probabilities are used to indicate likelihood of membership in a category, rather than 0/1 coding to indicate actual membership), then the off-diagonal elements of the diagonal partitions are not necessarily equal to 0. Note that complex coding schemes can easily be implemented, and the respective Burt table computed, via STATISTICA Visual Basic. Refer also to MCA Introductory Overview for additional details.
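The relationship between an indicator matrix and its Burt table can be sketched in a few lines. The example below uses Python rather than STATISTICA Visual Basic, and the five cases, the function names indicator_rows and burt_table, and the two-factor layout are all hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def indicator_rows(cases, levels):
    """0/1-code each case: one column per category of each factor."""
    rows = []
    for case in cases:
        row = []
        for value, factor_levels in zip(case, levels):
            row.extend(1 if value == lv else 0 for lv in factor_levels)
        rows.append(row)
    return rows


def burt_table(indicator):
    """Inner product of the indicator matrix with itself (Z'Z)."""
    n_cols = len(indicator[0])
    burt = [[0] * n_cols for _ in range(n_cols)]
    for row in indicator:
        for i in range(n_cols):
            if row[i]:  # skip zero entries; they contribute nothing
                for j in range(n_cols):
                    burt[i][j] += row[i] * row[j]
    return burt


# Hypothetical data: 5 cases classified on two factors.
levels = [["NO", "YES"], ["<50", "50-69", "69+"]]
cases = [
    ("NO", "<50"),
    ("YES", "<50"),
    ("YES", "50-69"),
    ("YES", "69+"),
    ("NO", "69+"),
]

indicator = indicator_rows(cases, levels)
burt = burt_table(indicator)
for row in burt:
    print(row)
```

With 0/1 coding, each diagonal partition is itself diagonal and its diagonal sums to the number of cases (5 here). Because burt_table only multiplies and adds, rows holding fractional memberships could be passed to it directly, in which case the off-diagonal cells of the diagonal partitions generally become nonzero, as described above.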

In addition to selecting the variables for the analysis, you also need to specify the structure of the Burt table. Click the Specify structure of table button (on the Multiple Correspondence Analysis (MCA): Table Specifications Startup Panel - Multiple Correspondence Analysis (MCA) tab) to display the Specify the dimensions of the table dialog, in which you specify the factor names (e.g., survival, age, location) and the number of levels for each factor (e.g., 2, 3, and 3). When processing the data, STATISTICA will automatically check whether the respective data specify a valid Burt table. Note that the Specify structure of table button is only available after Variables have been selected.
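The exact checks STATISTICA applies are not described here. As a rough sketch only, the following Python function tests the two structural properties named above (overall symmetry and a constant diagonal sum across the diagonal partitions) for the example table, given the number of levels for each factor (2, 3, and 3). The function name check_burt and its return convention are assumptions made for illustration.

```python
def check_burt(table, levels_per_factor):
    """Return a list of problems found; an empty list means all checks passed."""
    size = sum(levels_per_factor)
    if len(table) != size or any(len(row) != size for row in table):
        return [f"table must be {size} x {size}"]

    problems = []
    # Check 1: the matrix must be symmetric overall.
    for i in range(size):
        for j in range(i + 1, size):
            if table[i][j] != table[j][i]:
                problems.append(f"not symmetric at ({i}, {j})")

    # Check 2: the diagonal of every diagonal partition sums to the same total.
    totals = []
    start = 0
    for n_levels in levels_per_factor:
        totals.append(sum(table[k][k] for k in range(start, start + n_levels)))
        start += n_levels
    if len(set(totals)) > 1:
        problems.append(f"diagonal partition totals differ: {totals}")
    return problems


# The example Burt table: survival (2 levels), age (3), location (3).
burt = [
    [210,   0,  68,  93,  49,  60,  82,  68],
    [  0, 554, 212, 258,  84, 230, 171, 153],
    [ 68, 212, 280,   0,   0, 151,  58,  71],
    [ 93, 258,   0, 351,   0, 120, 122, 109],
    [ 49,  84,   0,   0, 133,  19,  73,  41],
    [ 60, 230, 151, 120,  19, 290,   0,   0],
    [ 82, 171,  58, 122,  73,   0, 253,   0],
    [ 68, 153,  71, 109,  41,   0,   0, 221],
]

print(check_burt(burt, [2, 3, 3]))
```

A level specification that does not match the table size, such as [2, 3] for this 8 x 8 table, is reported immediately. This is why the factor structure has to be supplied in addition to the variables themselves.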

In addition to the variables defining the table for the analysis, you can designate some variables as Supplementary columns (variables). Note that unlike in simple correspondence analysis, where supplementary columns and rows can be added from the Correspondence Analysis Results - Supplementary points tab, in multiple correspondence analysis the supplementary columns must also define a valid Burt table. Therefore, in this case, click the Variables with frequencies button [on the Multiple Correspondence Analysis (MCA): Table Specifications Startup Panel - Multiple Correspondence Analysis (MCA) tab] to specify all variables for the analysis, and then click the Supplementary columns (variables) button to display the Select the Variables that are Supplementary Column dialog, in which you select a subset of those variables as supplementary columns. When processing the data, STATISTICA will automatically check whether the selected subset of variables defines a valid subset (Burt table) of the overall Burt table. The variables selected as supplementary columns will not be used for the computation of eigenvalues and eigenvectors (see Computational Details), but coordinate values will be computed for those columns and reported in the spreadsheet and plots of coordinates.