How can I verify and "clean" data?

Use the options in the Verify Data dialog box, an interactive data-verification and cleaning facility, to enter the conditions to be met by the data. To access this dialog box:

Ribbon bar. Select the Data tab. In the Manage group, click Verify and from the menu, select Verify Data.

Classic menus. On the Data menu, select Verify.

Enter the conditions using the standard syntax that STATISTICA applies in every procedure that selects cases based on their values (see What syntax can be used to create case selection/verification/recode conditions?). You can also save the current verification conditions to a text file or open a file with previously saved conditions.

The verification can be as simple as checking whether values in a variable are "legal" (e.g., only 1 and 2 might be allowed for Gender) or whether they fall within allowed ranges of values (e.g., Age must be more than 0 and less than 200). It can also be as complex as checking multiple logical conditions that some values must meet in relation to other values.
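As a rough illustration of the two simple checks just described, here is a minimal Python sketch. It is not STATISTICA syntax and does not use any STATISTICA API. The sample records, the LEGAL_GENDER set, and the is_invalid helper are hypothetical names made up for this example.

```python
# Hypothetical sample cases; values are invented for illustration only.
cases = [
    {"Gender": 1, "Age": 34},
    {"Gender": 3, "Age": 27},   # Gender code outside the legal set
    {"Gender": 2, "Age": 250},  # Age outside the allowed range
]

# Assumed legal codes for Gender (only 1 and 2 are allowed).
LEGAL_GENDER = {1, 2}


def is_invalid(case):
    """Return True if the case fails either simple check."""
    illegal_gender = case["Gender"] not in LEGAL_GENDER
    age_out_of_range = not (0 < case["Age"] < 200)
    return illegal_gender or age_out_of_range


# Case numbers are 1-based, as in a spreadsheet.
invalid_rows = [n for n, case in enumerate(cases, start=1) if is_invalid(case)]
print(invalid_rows)
```

Each check is kept as a separate named condition so that it is easy to see which rule a case violates.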

Consider the following example of conditional verification:

If a person is a male or less than 10 years old, then the number of pregnancies for that person cannot be more than zero.

To apply this condition, you would specify (for example):

Invalid if: (v1='MALE' or AGE<10) and PREGN>0
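To make the logic of this condition concrete, the following Python sketch evaluates the same rule over a few made-up cases. It is only an analogy for how the condition is applied case by case, not how STATISTICA implements verification. The sample data and the invalid_if helper are hypothetical.

```python
# Hypothetical sample cases; v1 holds the gender label, as in the condition above.
cases = [
    {"v1": "MALE", "AGE": 40, "PREGN": 0},
    {"v1": "FEMALE", "AGE": 8, "PREGN": 1},   # under 10 with pregnancies
    {"v1": "FEMALE", "AGE": 35, "PREGN": 2},
    {"v1": "MALE", "AGE": 52, "PREGN": 1},    # male with pregnancies
]


def invalid_if(case):
    """Mirror of: (v1='MALE' or AGE<10) and PREGN>0."""
    return (case["v1"] == "MALE" or case["AGE"] < 10) and case["PREGN"] > 0


# All invalid case numbers (1-based), and the first one if any exist.
marked = [n for n, case in enumerate(cases, start=1) if invalid_if(case)]
first_invalid = marked[0] if marked else None
print(marked, first_invalid)
```

Note that the parentheses matter: without them, "and" would normally bind more tightly than "or", and every male would be flagged regardless of the number of pregnancies.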

After you have entered your verification condition(s), click the Find First button to select the first invalid case in your data file; once that case is selected, you can find the next one by selecting Find Next Invalid Case from the Data - Verify Data menu. Alternatively, click the Mark All button to mark all of the invalid cases in the data file according to the Marked Cells spreadsheet layout.