How are dates represented in Statistica data files?

Date values of variables are stored internally as a single integer value that represents the number of days that have passed since January 1, 1900. The specific conversion of integer values to dates and vice versa is also affected by the setting of the Use Excel-style Julian date format option in the Options dialog box - Spreadsheets tab. For example, a date entered and displayed as 1/21/1968 is stored as 24858 (or 24856, depending on the setting of the Use Excel-style Julian date format option; see note below); the (optional) decimals are interpreted as time (see How is time represented in STATISTICA data files?). Date values stored in this manner can be used in subsequent analyses (e.g., in Survival Analysis in order to calculate survival times, see below) and transformed using arithmetic operations; at the same time, they can be displayed as dates in reports or graphs (e.g., used to label scale values). See below for Handling of Dates Prior to 1900.

Note: The Use Excel-style Julian date format option, located on the Spreadsheets tab of the Options dialog box, affects the way in which dates are interpreted and displayed when a date format is selected as the display format for variables. Specifically, the algorithm used in Microsoft Excel™ to convert Julian dates for display in conventional date formats (e.g., April 24, 1957 = 20934) starts "counting" at day 1, and interprets the year 1900 as a leap year. As a result, when copying and pasting between different applications (that do not employ the algorithm used in Microsoft Excel, such as Microsoft's SQL Server, which starts "counting" from day 0, and does not interpret 1900 as a leap year), you may see the displayed date changed by two days.

Date values can be displayed in the spreadsheet in numeric format or in one of several predefined date display formats (e.g., 1/6/64, 6-Jan-64, Jan-1964, 01/06/64, 01/06/1964, 6-Jan).

To change the date display format, select Date in the Display format group box of the Variable specifications dialog box (accessible by double-clicking on a variable name in the spreadsheet or: Ribbon bar - select the data tab; in the Variables group, click Specs. Classic menus - on the Data menu, select Variable Specs), and choose one of the predefined display formats.

When you enter data into a new variable using a format that is recognized as valid in the Display format group of the Variable specifications dialog (e.g., Time, Date, Currency, etc.), STATISTICA displays the Auto Format Cells dialog box. In this dialog box, you can either apply the format to the individual cell or to the whole column.

You can create a single date variable from two (month, year or day) or three (day, month, year) variables as well as split a single date variable into two or three variables via the Date Operations dialog box. To display this dialog box: Ribbon bar - Select the Data tab; in the Transformations group, click Date/Time. Classic menus -  On the Data menu, select Date Operations.

Handling of Dates Prior to 1900

STATISTICA represents dates as a double precision number with the integer part the number of days since a base date and the fractional part the time as a fraction of a day (same as Excel).

When using Excel-style Julian dates (the default setting in the Options dialog box - Spreadsheets tab), 0 was undefined and 1 is 1/1/1900; otherwise, 0 is 1/1/1900. (Interestingly enough, when excel style is in use, day 60 is 2/29/1900, which did not exist. The original Excel erroneously specified 1900 as a leap year, and it has continued this in every version since; STATISTICA deliberately replicates this.)

Visual Basic Variant dates use the same basic scheme, but use negative values for dates prior to 1900. STATISTICA allows negative Julian dates as far back as 1/1/101. This assumes that the Gregorian calendar extends back that far, which it does not (same as Visual Basic Variant dates).

Just as with Visual Basic Variant dates, the time portion is the absolute value of the fractional part of the number, and a time part of 0 is time 00:00:00, midnight.

So therefore:

9.25 = 01/09/1900 06:00 AM

10.00 = 01/10/1900 12:00 AM, midnight

10.25 = 01/10/1900 06:00 AM

Note that 0 is unusual; for example, 0.25 and -0.25 are the exact same date-time because both have a datepart of 0 and a timepart of 0.25.

Other negative date-times do not have this overlap:

-10.25 = 12/21/1899 06:00 AM

-10.00 = 12/21/1899 12:00 AM, midnight

-9.25 = 12/22/1899 06:00 AM