Spreadsheet Formulas
 Syntax Summary
Recoding variable values, performing case selection for analyses, or
verifying quality of data involve a comparison of conditions or values
encountered in the spreadsheet with conditions specified by an expression
to determine a further action for each case (i.e., respectively assigning
a new variable value, including the case in an analysis, or verifying
that the data associated with a case are correct). Assigning variable
values via a spreadsheet formula involves casebycase calculation of
values for a variable based directly on the values of other spreadsheet
variables and/or parameters not included in the spreadsheet. For data
transformations more extensive or more complex than can be accomplished
using the variable recode and spreadsheet formula facilities provided,
use STATISTICA
Visual Basic.
Follow these guidelines to create case selection, verification, or recode
conditions.
Refer to variables
by either their numbers (e.g., v1 =
1) or their names (e.g., Gender
= 1). Note that you can type variable names in either upper case
or lower case letters (i.e., "GENDER"
is equivalent to "gender").
Note also that v0 refers to the
case number when used in expressions.
In expressions, enclose
text labels of a variable in single quotation marks (e.g., v1
= 'MALE'). Note that you can type text labels in either upper case
or lower case letters (i.e., 'YES'
is equivalent to 'yes').
In expressions or spreadsheet
formulas, enclose variable names containing special characters (e.g.,
spaces, plus or minus signs) in single quotation marks. If the single
quotation mark itself occurs in the variable name, use double quotation
marks instead (e.g., "A's Score").
Note that if double quotation marks are used in the name, then the variable
name must be placed in single quotation marks.
Syntax conversions
for spreadsheet formulas. Spreadsheet formulas (specified on the
Variable
specifications dialog accessible via the Data
 Variable Specs
menu) must start with an equal sign. When you enter a label that starts
with an equal sign, STATISTICA
will assume that it is a formula and will verify it for formal correctness.
Note that a semicolon after a formula starts a comment: e.g., = v1 + v2; this is a comment.
Missing values.
The IsMD(x) function will return a value
of true (1) if the passed expression
is a missing data value. For additional details on how missing values
are treated in the evaluation of logical and arithmetic expressions, see
Logical
and Arithmetic Expressions Involving Missing Data Values Always Evaluate
to FALSE or Missing Data.
Operators. A
number of arithmetic, relational, and logical operators are available
for creating expressions for recoding, case selection, and data verification
or for creating spreadsheet formulas.
Arithmetic: +, ,
*, /,
** or ^
(exponentiation), ( )
Relational:
=
(equal to)
<>,
>< (not equal to)
<
(less than)
>
(greater than)
<=
(less than or equal to)
>=
(greater than or equal to)
Logical:
AND
(equivalent to &)
OR
(equivalent to !)
NOT
(equivalent to ~)
IIF
(ternary operation)
Note that a common error is caused by omitting parentheses needed to
adjust for the precedence of operators; for example, the expression x > 0 and x
< 1 is incorrect and needs parentheses: (x
> 0) and (x < 1) because relational operators (>,
<) have a lower precedence
than the conjunction (and).
Math functions.
Math functions can be used in expressions for recoding, case selection
or data verification as well as in spreadsheet formulas for calculating
variable values. If the value of any variable used in the expression or
formula is missing (in the current case), then the expression evaluates
to missing data (for the current case).
Abs(x) 
absolute value of x 
Arccos(x) 
arc cosine of x 
Arcsin(x) 
arc sine of x 
Arctan(x) 
arc tangent of x 
Cos(x) 
cosine of x 
CosH(x) 
hyperbolic cosine of x 
Exp(x) 
e to the power of x 
Hypot(x,y) 
returns hypotenuse of x and y (square root(x2 + y2)) 
Log(x) 
natural logarithm of x 
Log2(x) 
binary logarithm of x 
Log10(x) 
common logarithm of x 
Max(x,y) 
returns the greater of x and y 
Min(x,y) 
returns the lesser of x and y 
Rnd(x) 
random number in the range of 0 to x 
RndNormal(x) 
random number from a normal distribution with
mu = 0 and sigma = x 
RndPoisson(x) 
random number from a Poisson distribution with
parameter x 
Sign(x) 
sign of x: if x>0
then +1, if x<0 then 1, if x = 0 remains 0 
Sin(x) 
sine of x 
SinH(x) 
hyperbolic sine of x 
Sqrt(x) 
square root of x 
Tan(x) 
tangent of x 
TanH(x) 
hyperbolic tangent of x 
Trunc(x) 
truncate x to an integer "towards zero" 
Uniform(x) 
random value in range 0 to x (same as Rnd(x)) 
Function names are not case sensitive, i.e., Log(x)
is the same as log(x) or LOG(x). As indicated in the function
list above, math functions accept one to two arguments depending on the
function. The position of the function in the expression or spreadsheet
formula will be replaced by the return value of the function. Numeric
values (e.g, Sqrt(155)), variable
names (e.g., Max(SCORE1,SCORE9)),
or variable numbers (e.g., Log(v8))
are acceptable arguments. Additional arguments acceptable to math functions
are expressions that evaluate to a number (e.g., Max(v7,(v5+v8BASELINEVALUE)/3))
or functions that return a numeric result (e.g., Sin(Sqrt(v5))).
Some commonly used constants can also be specified in expressions and
formulas by reference: e.g., Pi
= 3.14... Euler (e) = 2.71...
Statistics functions.
Nine statistics functions that accept lists of values and/or ranges and
arguments, and adjust to missing data, are also available for use in recoding,
case selection, and verification expressions as well as in spreadsheet
formulas:
Mean(x1, x2,..xn) 
mean of n arguments 
Median(x1, x2,..xn) 
50th percentile of n arguments 
Perc25(x1, x2,..xn) 
25th percentile of n arguments 
Perc75(x1, x2,..xn) 
75th percentile of n arguments 
Statmax(x1, x2,..xn) 
maximum of n arguments 
Statmin(x1, x2,..xn) 
minimum of n arguments 
Stdev(x1, x2,..xn) 
standard deviation of n arguments 
Sum(x1, x2,..xn) 
sum of n arguments 
Validn(x1, x2,..xn) 
number nonmissing of n arguments 
Note that the Statmin and Statmax functions have the stat prefix
to distinguish them from the arithmetical Max
and Min (math) functions discussed
above. Like math functions, the names of statistics functions are not
case sensitive when used in expressions and formulas.
All statistics functions ignore arguments that contain missing values,
basing their results on nonmissing data only. Unlike math functions,
statistics functions will not evaluate to a missing result unless all
arguments are missing values or an argument is encountered that evaluates
to an undefined value (e.g., square root of a negative number or division
by zero). Also unlike math functions, each statistics function accepts
any number of arguments placed in parentheses and separated by commas.
The following are acceptable statistics function arguments:
Numbers 
e.g., Mean(18,80,120,68,40) 
Variable names 
e.g., Mean(SALARY,1050,BONUS,500) 
Variable numbers 
e.g., Stdev(120,v3,v2,v8,255) 
Ranges of variables designated by name or number
(use colons to define ranges) 
e.g., Sum(v54,COST1:COST5,2550,v23:v35,1575,OVERHEAD) 
Expressions that evaluate to a number 
e.g., Mean(500,(v6+456)/v3),TRAVELEXPENSE,v8) 
Functions that return a numeric result 
e.g., Mean(220,Sqrt(v8+v7),v12) 
Distribution functions
and their integrals. STATISTICA
provides a predefined broad selection of distribution functions, their
integrals and inverse distribution functions that can be used in spreadsheet
formulas and in recoding, case selection, and verification expressions
like all other functions.
Below is a listing of all available distributions
(parameters are given in parentheses):
Distribution 
Density/ Probability Function 
Distribution Function 
Inverse Distribution Function 
Beta 
beta(x,n,w) 
ibeta(x,n,w) 
vbeta(x,n,w) 
Binomial 
binom(x,p,n) 
ibinom(x,p,n) 

Cauchy 
cauchy(x,h,q) 
icauchy(x,h,q) 
vcauchy(x,h,q) 
Chisquare 
chi2(x,n) 
ichi2(x,n) 
vchi2(x,n) 
Exponential 
expon(x,l) 
iexpon(x,l) 
vexpon(x,l) 
Extreme 
extreme(x,a,b) 
iextreme(x,a,b) 
vextreme(x,a,b) 
F 
F(x,n,w) 
iF(x,n,w) 
vF(x,n,w) 
Gamma 
gamma(x,c) 
igamma(x,c) 
vgamma(x,c) 
Geometric 
geom(x,p) 
igeom(x,p) 

Laplace 
laplace(x,a,b) 
ilaplace(x,a,b) 
vlaplace(x,a,b) 
Logistic 
logis(x,a,b) 
ilogis(x,a,b) 
vlogis(x,a,b) 
Lognormal 
lognorm(x,m,s) 
ilognorm(x,m,s) 
vlognorm(x,m,s) 
Normal 
normal(x,m,s) 
inormal(x,m,s) 
vnormal(x,m,s) 
Pareto 
pareto(x,c) 
ipareto(x,c) 
vpareto(x,c) 
Poisson 
poisson(x,l) 
ipoisson(x,l) 

Rayleigh 
rayleigh(x,b) 
irayleigh(x,b) 
vrayleigh(x,b) 
Student's t 
student(x,df) 
istudent(x,df) 
vstudent(x,df) 
Weibull 
weibull(x,b,c,q) 
iweibull(x,b,c,q) 
vweibull(x,b,c,q) 
For more extensive data recoding, use the Recode Values of Variable dialog
accessible from the spreadsheet. See also: Spreadsheet
Formulas  Examples, Spreadsheet Formulas
 Overview, Predefined Functions,
Syntax Operators, and Batch
Transformation Formulas.
For transformations of text variables (variables of type text), see
also, Transformation
of Text Variables (Variables of Type Text). Note that STATISTICA
Spreadsheets also support text labels for numeric values (these are
labels "attached" to numeric values, which are used for display
purposes only); when transforming the values with attached text labels,
the respective transformations are performed on the numeric representations,
and not on the text labels. In the event that a conflict arises between
a variable name and another variable's text values (e.g., a data set has
one variable named "Cats" and another variable that has a text
value of "Cats"), the variable name will take precedence over
the text value. To specify the text value in selection conditions, append
a $ to the end of the text value (e.g., "Cats"$).