Data Set Terminology

Cases and variables. STATISTICA data are organized by cases and variables. If you are unfamiliar with this terminology, you can think of cases as the equivalent of records in a database management program (or rows of a spreadsheet), and variables as the equivalent of fields (columns of a spreadsheet). Each case consists of a set of values, one for each variable. For example, suppose 4 persons (cases) completed 3 tests. The data file then contains a total of 5 variables: Gender (Male = male subject, Female = female subject), Education (C = college, H = high school), and 3 test scores (Test 1 through Test 3). Shown below is such a file.

Case names. The first column in the file can optionally contain the names of cases.
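
For readers who think in code, the cases-by-variables layout can be sketched in a few lines of Python. This is only an illustration of the concept, not STATISTICA code: the case names, the test scores, and the helper functions value and column are all made up for the example.

```python
# Illustrative sketch only: every value below is hypothetical and is not
# taken from the STATISTICA example file.
variables = ["Gender", "Education", "Test 1", "Test 2", "Test 3"]

# Each case is one row: an optional case name (the dictionary key)
# followed by exactly one value per variable.
cases = {
    "Case A": ["Male", "C", 71, 64, 80],
    "Case B": ["Female", "H", 88, 79, 92],
    "Case C": ["Female", "C", 65, 70, 58],
    "Case D": ["Male", "H", 90, 85, 77],
}


def value(case_name, variable_name):
    """Return the single cell at the intersection of a case and a variable."""
    return cases[case_name][variables.index(variable_name)]


def column(variable_name):
    """Return the values of one variable across all cases (one column)."""
    idx = variables.index(variable_name)
    return [row[idx] for row in cases.values()]


print(value("Case B", "Education"))
print(column("Test 1"))
```

Reading across a row gives everything recorded for one case; reading down a column gives one variable for all cases.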

Text values. STATISTICA offers comprehensive support for true text values (see Variable Types), which can be of practically unlimited length and include extensive within-cell formatting. However, for many statistical data analysis applications, it is useful to use text labels that can aid in the interpretation of their respective numeric values, as illustrated in the next paragraph.

Text labels. The two variables Gender and Education contain text labels, each of which stands for an underlying numeric value. For example, suppose STATISTICA (or you) made the following assignments:

1 = Male, 2 = Female (for Gender); and

1 = C, 2 = H (for Education).

You can switch between the two views of data (numeric or text) in the spreadsheet by clicking the Text Labels button on the spreadsheet toolbar. After switching to numeric representation of these values, the file will look as follows:

Using text labels with numeric values (summary). As shown in the example above, in STATISTICA each numeric value of a specific variable (e.g., 1) can have a text label (e.g., Male) assigned to it. For more information, see "How do I enter/edit the assignments between numeric values and text labels?"
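
The relationship between numeric values and text labels can be sketched as a per-variable lookup table. The following Python fragment is a conceptual illustration, not the STATISTICA API: the assignments come from the example above, while the names TEXT_LABELS, to_text, to_numeric, and show, and the test score 85, are hypothetical.

```python
# Label assignments from the example above, stored per variable.
TEXT_LABELS = {
    "Gender": {1: "Male", 2: "Female"},
    "Education": {1: "C", 2: "H"},
}


def to_text(variable, numeric_value):
    """Return the text label for a numeric value, or the value itself
    when the variable has no label assigned to that value."""
    return TEXT_LABELS.get(variable, {}).get(numeric_value, numeric_value)


def to_numeric(variable, value):
    """Return the numeric value behind a text label, or the value itself
    when it is not a known label for that variable."""
    reverse = {label: num for num, label in TEXT_LABELS.get(variable, {}).items()}
    return reverse.get(value, value)


def show(row, variables, as_text):
    """Render one case in either the text view or the numeric view."""
    convert = to_text if as_text else to_numeric
    return [convert(var, val) for var, val in zip(variables, row)]


variables = ["Gender", "Education", "Test 1"]
print(show([1, 2, 85], variables, as_text=True))
print(show(["Male", "H", 85], variables, as_text=False))
```

Variables without assignments (here, Test 1) pass through unchanged in both views, which mirrors how only labeled values change appearance when the view is switched.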

Variable types. You can specify each variable's data type via the Variable Specs dialog (available on the Tools menu or by double-clicking the column header). STATISTICA Spreadsheet data files support four basic data types: Double, Integer, Byte, and Text. See Variable Types for further details. Note that spreadsheets can also contain links to other data sources, embedded multimedia objects of various types, macros, user interface elements, etc.; however, those items are not used as direct input for analyses.
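
As a rough intuition for what the three numeric types imply, the Python sketch below checks whether a value fits a given storage format. The storage widths are assumptions based on the conventional meaning of these type names (64-bit floating point, 32-bit signed integer, 8-bit unsigned integer); this page does not state them, so consult Variable Types for the authoritative definitions. The names ASSUMED_FORMATS and fits are hypothetical.

```python
import struct

# ASSUMPTION: conventional storage formats for these type names.
# They are not confirmed by this page; see Variable Types.
ASSUMED_FORMATS = {
    "Double": "<d",   # 64-bit floating point
    "Integer": "<i",  # 32-bit signed integer
    "Byte": "<B",     # 8-bit unsigned integer (0 through 255)
}


def fits(type_name, value):
    """Return True if value can be stored in the assumed format."""
    try:
        struct.pack(ASSUMED_FORMATS[type_name], value)
        return True
    except struct.error:
        # Raised for out-of-range integers and for values of the wrong kind.
        return False


for type_name, candidate in [("Byte", 255), ("Byte", 256),
                             ("Integer", 2.5), ("Double", 2.5)]:
    print(type_name, candidate, fits(type_name, candidate))
```

Under these assumptions, a narrower type saves storage but restricts the values a variable can hold, which is the usual trade-off when choosing a type other than the most general one.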