Logical Operators Involving Missing Data

STATISTICA provides logical operators such as NOT, AND, and OR for use in case selection conditions and spreadsheet formulas. With complete data, the behavior of these operators is well defined. The statement “A and B” is true only if both A and B are true. In contrast, “A or B” is true if A is true, if B is true, or if both A and B are true.

When missing data is present, how should these functions work? The approach that STATISTICA takes is described in this Help topic. With partial information, such as when one part of the expression contains missing data, we may still be able to evaluate the expression. In other cases, we cannot. The following spreadsheet shows how the AND and OR functions evaluate in the presence of missing data. When an expression evaluates to true, 1 is returned; when it evaluates to false, 0 is returned.

The first 4 cases are easy to follow. Cases 5 and 6 have missing data.

Evaluating AND in the presence of missing data. When one part of the expression is true and the other is missing, “A and B” evaluates to missing data. For example, if “B” is true but “A” is missing, we do not know whether “A” is true or false, so we cannot determine whether “A and B” is true. The expression evaluates to missing.

When one part of the expression is false and the other is missing, we know that “A and B” is false. If either portion is false, the whole expression is false. So regardless of what “A” is, “A and B” is false because “B” is false.

When both portions of the expression are missing, “A and B” evaluates to missing data.
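The AND rules can be sketched in a few lines of Python. This is an illustration only, not STATISTICA formula syntax: the function name and3 is hypothetical, and None is an assumed stand-in for a missing value. A false operand decides the result on its own; otherwise any missing operand (including both operands missing) makes the result missing.

```python
from typing import Optional


def and3(a: Optional[bool], b: Optional[bool]) -> Optional[int]:
    """AND with missing data; None is an assumed stand-in for missing."""
    # A false operand settles the result, whatever the other operand is.
    if a is False or b is False:
        return 0
    # No operand is false, so a missing operand leaves the result unknown.
    if a is None or b is None:
        return None
    # Both operands are known and true.
    return 1


print(and3(True, None))   # None (missing)
print(and3(False, None))  # 0
print(and3(None, None))   # None (missing)
```

Checking for a false operand first is what lets “false and missing” return 0 instead of missing.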

Evaluating OR in the presence of missing data. When one part of the expression is true and the other is missing, “A or B” evaluates to true. If either piece is true, the expression is true, so regardless of what “A” is in this expression, when “B” is true, “A or B” is true.

When one part of the expression is false and the other is missing, “A or B” evaluates to missing data. Since “B” is false, we must know “A” to evaluate “A or B”. Since “A” is missing, the expression evaluates to missing.  

When both portions of the expression are missing, “A or B” evaluates to missing data.
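The OR rules mirror the AND rules and can be sketched the same way in Python. As before, this is an illustration and not STATISTICA formula syntax: the function name or3 is hypothetical, and None is an assumed stand-in for a missing value. A true operand decides the result on its own; otherwise any missing operand (including both operands missing) makes the result missing.

```python
from typing import Optional


def or3(a: Optional[bool], b: Optional[bool]) -> Optional[int]:
    """OR with missing data; None is an assumed stand-in for missing."""
    # A true operand settles the result, whatever the other operand is.
    if a is True or b is True:
        return 1
    # No operand is true, so a missing operand leaves the result unknown.
    if a is None or b is None:
        return None
    # Both operands are known and false.
    return 0


print(or3(True, None))   # 1
print(or3(False, None))  # None (missing)
print(or3(None, None))   # None (missing)
```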