Merge Options - Variables Tab

Select the Variables tab in the Merge Options dialog box to display options to align (match) the cases of the second file with those of the first.

Mode. If the number and order of cases in the two data files are not precisely the same (i.e., if at least one case is missing in the second data file or there is one case that has no equivalent in the first file), all subsequent data will be merged with the wrong cases. To prevent this, specify a mode for merging from the options in the Mode group box.

Concatenate. Select the Concatenate option button to simply add the variables of the second file to those of the first (this is the default mode). If there are more cases in either File 1 or File 2, the option selected in the Unmatched Cases group box will determine what Statistica will do. By default, Statistica fills any extra cases with missing data.

Cartesian. Select the Cartesian option button to create a cross product of the variables. This is to say, every combination of every case will be created. For example:

File 1:

SEAT

DYNAMO

DOWN

OFF

UP

ON

File 2:

GEAR

TIRES

MEDIUM

HARD

LOW

SOFT

Result:

SEAT

DYNAMO

GEAR

TIRES

DOWN

OFF

MEDIUM

HARD

DOWN

OFF

LOW

SOFT

UP

ON

MEDIUM

HARD

UP

ON

LOW

SOFT

A possible application of a Cartesian merge is to generate an experimental design spreadsheet.

Match casenames. Select the Match casenames option button to match the cases from the second file with those of the first file based on the case names. For each case in both files, Statistica checks the case names and merges them if their respective case names match.

Match variables/Select. Select the Match variables option button to match the cases from the second file with those of the first file based on the values of specified variable pairs. When merging variables from two files, specify the matching variables in each data file via the Select button; for each case, Statistica checks the values of these variables in both data files and merges the cases only if their respective values match. Note that missing data is ignored.

Merge matches variable data types as follows:

  • Continuous (Byte, Integer, or Double)

  • Categorical (Text or Double with Text Labels); data will be sorted as Text, case insensitive (e.g., "ABC" and "abc" are treated the same)

Note that if more than one variable is selected for each of the files, the variables are paired up by the order as they appear in the spreadsheet. For example, if you select variables 1-3 from the File 1 and variables 2-4 from the File 2, the values of variable 1 from File 1 will be compared against the values of variable 2 from the File 2, etc. For a pair of cases between the File 1 and the File 2 to match, all of the values between their matching variables must match.

For example, select the variables ID and NAME as your matching variables for the files below:

File 1:

ID

NAME

HEIGHT

612

Chester

2.7

File 2:

ID

NAME

WEIGHT

612

Sylvester

8.7

Although these cases have matching values for the variable ID, the values in the NAME variables differ and Statistica will not consider these matching cases. When the variable WEIGHT is added to File 1, the case Sylvester will be added below Chester:

ID

NAME

HEIGHT

WEIGHT

612

Chester

2.7

 

612

Sylvester

 

8.7

Conversely, suppose that the name in File 2 is Chester:

File 2:

ID

NAME

WEIGHT

612

Chester

8.7

STATISTICA would recognize these cases as matching because the values in both matching variables are the same. The result would appear as such:

ID

NAME

HEIGHT

WEIGHT

612

Chester

2.7

8.7

Match Criteria. Select an option in the Match Criteria drop-down list to specify how to compare values when using Match casenames or Match variables. The default option is By Auto, which will auto-detect the most appropriate comparison method. By text compares values by textual content, and By numeric compares by numeric value. Note that if you select By numeric when comparing variables that contain double with text labels, the underlying values will be compared.

Unmatched cases. Select an option to specify the manner of dealing with unmatched cases when the two files are merged. Unmatched cases may result from unequal numbers of cases in the merged files, or because some of the cases do not meet the relational merge criteria.

Fill with MD. Select Fill with MD to pad unmatched cases with missing data.

Delete cases. If this option button is selected, cases from either file that cannot be matched will be removed from the merged file.

Generate Cartesian. Select Generate Cartesian to create a cross product between every unmatched case against every other case, i.e., if a unique case is found in only File 1 or File 2, then every combination of that case against every other case will be created.

Abort merge. When this option button is selected, the presence of unmatched cases in either file will cause an error message to be displayed and the merge procedure to be abandoned.

Multiple Cases. When either the Match casenames merge or Match variables merge options are selected in the Mode group box, the options in the Multiple Cases group box become available to specify what to do if duplicate matching cases are encountered.

Fill with MD. When the Fill with MD option button is selected, the file will be padded with missing data in the duplicate cases that were matched. This option (in conjunction with Match variables) is similar to Relational in Statistica 6.

Copy down. When the Copy down option button is selected, a Cartesian product will be generated for duplicate matches of the same value. This option (in conjunction with Match variables) is similar to Relational hierarchical in Statistica 6.

Drop File 1 multiples. Select the Drop File 1 multiples check box so that duplicate cases from the File 1 data set will not be included when using Match casenames or match variables.

Drop File 2 multiples. Select the Drop File 2 multiples check box so that  duplicate cases from the File 2 data file will not be included when using Match casenames or Match variables.

See also: Merge Variables Example.