Input Formats in Correspondence Analysis - Frequencies with Grouping Variables

If the Frequencies with grouping variables option button is selected [from the Input group box on either the Correspondence Analysis (CA): Table Specifications Startup Panel - Correspondence Analysis (CA) tab or the Multiple Correspondence Analysis (MCA): Table Specifications Startup Panel - Multiple Correspondence Analysis (MCA) tab], STATISTICA expects two kinds of input: grouping variables whose code values uniquely identify each category, and a variable containing the frequencies (or some other measure of correspondence) for the categories indicated by those grouping variables. For example, the file may look like this:

STAFFGRP    SMOKING    FREQUENCY
Sr.Manag    None       4
Sr.Manag    Light      2
Sr.Manag    Medium     3
Sr.Manag    Heavy      2
Jr.Manag    None       4
Jr.Manag    Light      3
Jr.Manag    Medium     7
Jr.Manag    Heavy      4
Sr.Empl     None       25
Sr.Empl     Light      10
Sr.Empl     Medium     12
.......     .......    .......
.......     .......    .......
.......     .......    .......

If you selected variables StaffGrp and Smoking for the analysis, and variable Frequency as the Variable with frequencies/counts (see below), then STATISTICA would assign the respective value for variable Frequency to each cell in the table identified by the grouping variables.
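The assignment of Frequency values to table cells can be sketched in Python. This is only an illustration of the idea, not STATISTICA's own implementation; the function name build_table and the tuple layout are assumptions made for this example. Only the rows visible in the listing are used, and the elided rows are left out rather than guessed.

```python
# Rows visible in the listing above; the elided rows are omitted, not guessed.
records = [
    ("Sr.Manag", "None", 4),
    ("Sr.Manag", "Light", 2),
    ("Sr.Manag", "Medium", 3),
    ("Sr.Manag", "Heavy", 2),
    ("Jr.Manag", "None", 4),
    ("Jr.Manag", "Light", 3),
    ("Jr.Manag", "Medium", 7),
    ("Jr.Manag", "Heavy", 4),
    ("Sr.Empl", "None", 25),
    ("Sr.Empl", "Light", 10),
    ("Sr.Empl", "Medium", 12),
]


def build_table(records):
    """Map each (StaffGrp, Smoking) code pair to its Frequency value.

    Hypothetical helper: it assumes every cell is listed at most once,
    which holds for the rows shown here.
    """
    table = {}
    for staff_group, smoking, frequency in records:
        table[(staff_group, smoking)] = frequency
    return table


table = build_table(records)
print(table[("Jr.Manag", "Medium")])  # 7
```

Each pair of grouping codes acts as the address of one cell, so the order of the cases in the file does not matter.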

Selection of variables and codes. The required selection of variables and codes is the same as that described under the option Raw data (requires tabulation), except that in addition, you will be prompted to select the Variable with frequencies/counts (i.e., the variable containing the measure of correspondence, similarity, confusion, association, etc.). Note that only positive values or zero are allowed in that variable (e.g., STATISTICA does not permit negative frequencies).
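The restriction on the frequency variable can be checked before the data are submitted for analysis. The following Python sketch shows one way to do that; the function name validate_frequencies is hypothetical, and how STATISTICA itself reports an invalid value is not described here.

```python
def validate_frequencies(values):
    """Raise ValueError if any value is negative; zero and positive values pass.

    Hypothetical pre-check, not part of STATISTICA.
    """
    for position, value in enumerate(values, start=1):
        if value < 0:
            raise ValueError(
                f"case {position}: negative value {value} is not allowed"
            )
    return True


validate_frequencies([4, 0, 2.5])   # passes: zero and non-integer measures are fine
# validate_frequencies([3, -1])     # would raise ValueError for case 2
```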

Multiple references to the same cell. If there are multiple references to the same cell in the table, then the values of the "frequency" variable for those references are summed, and the sum is assigned to the respective cell in the table. For example, consider the following data specifying a 2 by 2 table:

GENDER    INCOME    FREQUENCY
MALE      HIGH      4
MALE      HIGH      6
MALE      LOW       3
FEMALE    HIGH      2
FEMALE    LOW       4

There are two references to the cell Male-High (i.e., the first two cases in the listing above). Thus, the frequency assigned to that cell will be 4+6=10. This way of handling multiple references will allow you to analyze subsets of tables that are coded in this manner. For example, suppose you had three grouping variables Gender, Income, and Occupation, and a fourth variable containing the frequencies for each cell in the three-way table. If you now only selected Gender and Income for the analysis, then STATISTICA would sum up all the frequencies in the two-way table defined by those two variables, and, in effect, compute the Gender by Income marginal frequency table, collapsing across the levels of the variable Occupation.
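The summing rule and the resulting collapse across unselected variables can be sketched in Python using the 2 by 2 example data. This is a minimal illustration under the assumption that each case is stored as a dictionary; the function name tabulate is hypothetical and is not a STATISTICA function.

```python
from collections import defaultdict

# The 2 by 2 example from the text, one dictionary per case.
cases = [
    {"GENDER": "MALE", "INCOME": "HIGH", "FREQUENCY": 4},
    {"GENDER": "MALE", "INCOME": "HIGH", "FREQUENCY": 6},
    {"GENDER": "MALE", "INCOME": "LOW", "FREQUENCY": 3},
    {"GENDER": "FEMALE", "INCOME": "HIGH", "FREQUENCY": 2},
    {"GENDER": "FEMALE", "INCOME": "LOW", "FREQUENCY": 4},
]


def tabulate(cases, grouping_vars, freq_var="FREQUENCY"):
    """Sum freq_var over all cases that share the same grouping codes."""
    table = defaultdict(int)
    for case in cases:
        cell = tuple(case[var] for var in grouping_vars)
        table[cell] += case[freq_var]
    return dict(table)


two_way = tabulate(cases, ["GENDER", "INCOME"])
print(two_way[("MALE", "HIGH")])  # 10, i.e. 4 + 6

# Leaving a grouping variable out collapses the table across its levels.
by_gender = tabulate(cases, ["GENDER"])
print(by_gender)  # {('MALE',): 13, ('FEMALE',): 6}
```

Because a cell is identified only by the selected grouping variables, any variable that is not selected (such as Occupation in the three-way example) simply stops distinguishing cases, and their frequencies accumulate in the same cell.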