Variable Data Types

You can specify each variable's data type in the Variable specifications dialog box (available from the Data menu or by double-clicking the column header). Statistica spreadsheet data files support the four basic data types listed below. Note that spreadsheets can also contain links to other data sources, embedded multimedia objects of various types, macros, user interfaces, etc.; however, those items are not used as direct input for analyses.

Double. Double (short for Double Precision) is the default format for storing numeric values in Statistica. Technically, the values are stored as 64-bit floating point real numbers (1 bit for the sign, 11 for the exponent, and 52 for the mantissa), which gives 15-digit precision. The range of values supported by this data type is approximately ±1.7*10^308. Each numeric value can have a unique text label attached (see Text Labels Editor) of practically unlimited length when the Display format is General. This is the only data type that allows numbers containing decimals. When your data type is Double, each cell takes up 8 bytes of storage (plus the optional text label). Note that for the Double data type, the missing data code is -999999998.
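The bit layout described above (1 sign bit, 11 exponent bits, 52 mantissa bits) matches the standard IEEE 754 double format. The following Python sketch assumes that correspondence; it inspects Python's own float type, not Statistica's files, and the helper name double_fields is hypothetical.

```python
import struct
import sys

def double_fields(x):
    """Split a 64-bit double into its (sign, exponent, mantissa) bit fields."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                  # 1 bit
    exponent = (bits >> 52) & 0x7FF    # 11 bits (stored with a bias of 1023)
    mantissa = bits & ((1 << 52) - 1)  # 52 bits
    return sign, exponent, mantissa

print(len(struct.pack("<d", 0.0)))  # 8 bytes per value
print(sys.float_info.max)           # 1.7976931348623157e+308
print(sys.float_info.dig)           # 15 decimal digits are always preserved
print(double_fields(1.0))           # (0, 1023, 0)
```

The printed maximum and digit count agree with the approximate range of ±1.7*10^308 and the 15-digit precision quoted for the Double data type.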

Integer. If Integer is the data type, you can enter integers from -2,147,483,648 through 2,147,483,647 (inclusive). You cannot enter numeric values containing decimals into a variable of this type. Each numeric value can have a unique text label attached (see Text Labels Editor) of practically unlimited length when the Display format is General. When your data type is Integer, each cell takes up 4 bytes of storage; hence, this data type stores numbers more economically than Double and is recommended for integer data, especially in large data files. Note that for the Integer data type, the missing data code is the same as for Double: -999999998.

Byte. If Byte is the data type, you can enter integers from 0 through 255 (inclusive). You cannot enter numeric values containing decimals into a variable of this type. Each byte value can have a unique text label attached (see Text Labels Editor) of practically unlimited length when the Display format is General. The advantage of specifying Byte as your data type is that it offers the most economical storage for small integer values, as each cell takes up only 1 byte of storage. Note that for the Byte data type, the missing data code is 255.
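The Integer and Byte limits quoted above are the ranges of a 32-bit signed integer and an 8-bit unsigned integer, respectively. That correspondence is inferred here from the stated ranges and cell sizes. The Python sketch below checks whether a value fits each type; the helper names fits_integer and fits_byte are hypothetical and are not part of Statistica.

```python
import struct

# Limits implied by 4-byte signed and 1-byte unsigned storage.
INT_MIN, INT_MAX = -2**31, 2**31 - 1
BYTE_MIN, BYTE_MAX = 0, 2**8 - 1

def fits_integer(value):
    """True if value is a whole number within the 4-byte Integer range."""
    return isinstance(value, int) and INT_MIN <= value <= INT_MAX

def fits_byte(value):
    """True if value is a whole number within the 1-byte Byte range."""
    return isinstance(value, int) and BYTE_MIN <= value <= BYTE_MAX

print(INT_MIN, INT_MAX)                   # -2147483648 2147483647
print(len(struct.pack("<i", INT_MAX)))    # 4 bytes
print(len(struct.pack("<B", BYTE_MAX)))   # 1 byte
print(fits_byte(300), fits_integer(300))  # False True
print(fits_integer(2.5))                  # False: decimals need Double
```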

Text. The Text data type is optimized for storing character sequences of practically unlimited length. Note that in Statistica, you can perform numerical analyses on text values; in those circumstances, Statistica assigns unique numeric equivalents to all text values being processed. Unlike the relation between the numeric data types listed above and their permanent text labels, the relations between text values and numbers are created ad hoc and are not stored by Statistica; hence, different numbers will most likely be created the next time a text variable is included in numerical analyses. The length of the field reserved for a text variable is not constant and can be adjusted. Note that for the Text data type, the missing data code is always an empty string.
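To illustrate why ad hoc numeric equivalents need not be the same from one analysis to the next, here is a minimal Python sketch. It assumes codes are handed out in order of first appearance, starting at 1. That scheme, the starting value, and the function name assign_codes are assumptions made for illustration only; the actual rule Statistica uses is not described here.

```python
def assign_codes(values, start=1):
    """Map each distinct text value to a number, in order of first appearance."""
    codes = {}
    for value in values:
        if value == "":
            # An empty string is the missing data code for Text variables.
            continue
        if value not in codes:
            codes[value] = start + len(codes)
    return codes

# The same text values receive different numbers when the data arrive
# in a different order, because nothing is stored between runs.
print(assign_codes(["low", "high", "low", ""]))  # {'low': 1, 'high': 2}
print(assign_codes(["high", "low"]))             # {'high': 1, 'low': 2}
```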

Transforming variables of type text. Statistica also supports various logical and other (e.g., concatenation) operations on variables of type text. For the available transformations of these variables, see Transformation of Text Variables (Variables of Type Text). Note that Statistica Spreadsheets also support text labels for numeric values (these are labels "attached" to numeric values, which are used for display purposes only); when you transform values that have attached text labels, the transformations are performed on the numeric representations and not on the text labels.

Why do you need different variable types? The difference between the text and the numeric types is straightforward; the main reason for having three numeric types is storage efficiency. For most data files, this is not important and, thus, using the default (Double) data type is recommended. However, for very large data files, being able to switch to storage that is 2 times (Integer) or even 8 times (Byte) more compact than Double could make the difference between being able to perform the necessary analysis on a specific computer system or not.
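The storage arithmetic can be sketched in a few lines of Python using the per-cell sizes given above (8, 4, and 1 bytes). The file dimensions are made-up illustration values, the function name storage_bytes is hypothetical, and the estimate ignores text labels and any file overhead.

```python
# Bytes per cell for each numeric data type, as stated in this topic.
BYTES_PER_CELL = {"Double": 8, "Integer": 4, "Byte": 1}

def storage_bytes(cases, variables, data_type):
    """Raw cell storage for a cases-by-variables block of one data type."""
    return cases * variables * BYTES_PER_CELL[data_type]

# Hypothetical file: 10 million cases by 100 variables.
cases, variables = 10_000_000, 100
for data_type in BYTES_PER_CELL:
    size_gb = storage_bytes(cases, variables, data_type) / 1e9
    print(f"{data_type}: {size_gb:.0f} GB")
```

For this hypothetical file, the cells need 8 GB as Double, 4 GB as Integer, and 1 GB as Byte, which is the 2-fold and 8-fold saving mentioned above.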

See also: Display Formats, Spreadsheet Overview, and Range of Numeric Values That Can Be Entered or Stored in Cells.