Nonparametric Statistics Notes - Ordinal Descriptive Statistics

Select Ordinal descriptive statistics (median, mode, ...) from the Nonparametric Statistics Startup Panel Quick tab to display the Descriptive Statistics dialog box, in which you can calculate various ordinal descriptive statistics (median, percentiles, quartiles, range, quartile range) and other descriptive statistics (mean, harmonic mean, geometric mean, standard deviation, skewness, kurtosis, variance, average deviation, sum) for selected variables. You can also request particular percentile values to be computed and displayed in the spreadsheet; by default STATISTICA will compute the quartile values, that is, the 25th and 75th percentiles. In addition to the standard descriptive statistics (minimum value, maximum value, mean, valid n), the statistics discussed below are computed for each variable.

Median. The median value is the value that "splits the sample in half," given the respective variable. Fifty percent of the cases will fall below the median, and fifty percent will fall above the median. If the median value is very different from the mean, then the distribution of data is skewed.
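As a small illustration of how the median and the mean can diverge for skewed data, the following Python sketch uses the standard library; the sample values are hypothetical and chosen only to show the effect of a single large score.

```python
import statistics

# Hypothetical sample with one very large score (skewed to the right).
scores = [1, 2, 3, 4, 100]

median = statistics.median(scores)  # middle value of the sorted scores
mean = statistics.mean(scores)      # pulled upward by the large score

print(median, mean)
```

Here the median stays near the bulk of the scores while the mean is pulled toward the extreme value.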

Mode. The mode is the value that occurs with the greatest frequency. The frequency with which the mode occurs is also displayed; if there is a tie (i.e., more than one value occurs with equal frequency) then the respective frequency column will contain the label "multiple" to indicate that more than one mode was found.
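The tie handling described above can be sketched in Python as follows. This is only an illustration, not STATISTICA's implementation; the function name `mode_with_frequency` is hypothetical.

```python
from collections import Counter

def mode_with_frequency(values):
    """Return (modes, frequency): every value that attains the highest count."""
    if not values:
        raise ValueError("at least one valid case is required")
    counts = Counter(values)
    top = max(counts.values())
    # More than one entry in `modes` corresponds to the "multiple" case.
    modes = sorted(v for v, c in counts.items() if c == top)
    return modes, top

print(mode_with_frequency([1, 2, 2, 3]))     # one mode
print(mode_with_frequency([1, 1, 2, 2, 3]))  # tie: two modes
```

A report could print the label "multiple" whenever the returned list holds more than one value.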

Geometric mean. The geometric mean is the product of all scores to the power of 1/N (one over the valid number of cases). The geometric mean is useful in instances when we know that the measurement scale is not linear. For example, in the area of psychometrics it is well known that the rated intensity of a stimulus (e.g., brightness of a light) is often a logarithmic function of the actual intensity of the stimulus (brightness measured in units of Lux). In this instance, the geometric mean is a better "summary" of ratings than the simple mean. STATISTICA calculates the geometric mean via the logarithm (log):

$$\log(\text{geometric mean}) = \frac{1}{n}\sum_{i=1}^{n}\log(x_i)$$

where $x_i$ is the i'th score and $n$ is the number of valid cases. Note that if a variable contains negative values or a zero (0), then the geometric mean cannot be calculated.
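The log-based calculation can be sketched in Python as below. This is a minimal illustration under the stated restriction to positive scores; the function name `geometric_mean` is an assumption made for the example.

```python
import math

def geometric_mean(values):
    """Geometric mean computed as exp of the mean of the logs."""
    if not values:
        raise ValueError("at least one valid case is required")
    if any(v <= 0 for v in values):
        # Zero or negative scores have no real logarithm.
        raise ValueError("geometric mean requires strictly positive scores")
    log_mean = sum(math.log(v) for v in values) / len(values)
    return math.exp(log_mean)

print(geometric_mean([1, 10, 100]))
```

Working with logarithms avoids overflow in the product when there are many cases.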

Harmonic mean. The harmonic mean is sometimes used to average frequencies (sample sizes). The harmonic mean is calculated as:

$$HM = \frac{n}{\sum_{i=1}^{n} (1/x_i)}$$

where: HM is the harmonic mean, $n$ is the number of valid cases, and $x_i$ is the score for the i'th valid case. If a variable contains a zero (0) as a valid score, then the harmonic mean cannot be calculated (since it implies division by zero).
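A minimal Python sketch of the harmonic mean formula follows; the function name `harmonic_mean` is hypothetical and the zero check mirrors the restriction noted above.

```python
def harmonic_mean(values):
    """n divided by the sum of reciprocals of the scores."""
    if not values:
        raise ValueError("at least one valid case is required")
    if any(v == 0 for v in values):
        # A zero score would require dividing by zero.
        raise ValueError("harmonic mean is undefined when a score is zero")
    return len(values) / sum(1.0 / v for v in values)

# Averaging two sample sizes, 40 and 60.
print(harmonic_mean([40, 60]))
```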

Variance and standard deviation. The variance and standard deviation are standard measures of variability (see Basic Statistics). STATISTICA will calculate the variance as the sum of squared deviations about the mean divided by n-1 (not n). The standard deviation is calculated as the square root of this value. The n-1 vs. n issue is usually of little practical importance. Technically, we most often want to estimate the variability of the population from which the current sample was drawn (for example, we would like to generalize our results to all males, given our random sample of males). In this case we should always use n-1 as the divisor in the computations; using n as the divisor results in purely descriptive statistics for the current sample.
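To make the n-1 versus n distinction concrete, here is a short Python sketch; the function name and its `divisor_offset` parameter are assumptions introduced for illustration only.

```python
import math

def variance_and_sd(values, divisor_offset=1):
    """Sum of squared deviations about the mean divided by (n - divisor_offset).

    divisor_offset=1 gives the n-1 estimate described above;
    divisor_offset=0 gives the purely descriptive n version.
    """
    n = len(values)
    if n - divisor_offset <= 0:
        raise ValueError("not enough valid cases")
    mean = sum(values) / n
    ss = sum((v - mean) ** 2 for v in values)
    var = ss / (n - divisor_offset)
    return var, math.sqrt(var)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
print(variance_and_sd(data))                     # n-1 divisor
print(variance_and_sd(data, divisor_offset=0))   # n divisor
```

The two results move closer together as n grows, which is why the choice rarely matters in practice for large samples.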

Average deviation. The average deviation is another measure of variability. It is calculated as the sum of absolute deviations (mean for respective variable minus raw score) divided by n (number of valid cases).
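The average deviation as described can be sketched in a few lines of Python; the function name `average_deviation` is hypothetical.

```python
def average_deviation(values):
    """Sum of absolute deviations from the mean, divided by n."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one valid case is required")
    mean = sum(values) / n
    return sum(abs(mean - v) for v in values) / n

print(average_deviation([2, 4, 4, 4, 5, 5, 7, 9]))
```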

Range. The range of a variable is also an indicator of variability. It is calculated as the largest valid score minus the smallest valid score.

Quartile range. The quartile range of a variable is calculated as the value of the 75th percentile minus the value of the 25th percentile. Thus, it is the width of the range about the median that includes 50% of the cases.
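The range and quartile range can be sketched in Python as follows. Percentiles can be computed under several conventions, and this text does not say which one STATISTICA applies; the sketch assumes the "inclusive" interpolation method of Python's `statistics.quantiles`, so results may differ slightly from the program's output. The function name is hypothetical.

```python
import statistics

def range_and_quartile_range(values):
    """Return (range, quartile range) for a list of valid scores."""
    if len(values) < 2:
        raise ValueError("at least two valid cases are required")
    value_range = max(values) - min(values)
    # quantiles(..., n=4) returns the 25th, 50th, and 75th percentiles.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return value_range, q3 - q1

print(range_and_quartile_range([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```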

Skewness. As implied by the term, the skewness is a measure of the extent to which the distribution of the respective variable is skewed to the left (negative value) or right (positive value), relative to the standard normal distribution (for which the skewness is 0). The skewness measure is related to the third moment of the distribution. The skewness is defined as:

$$\text{Skewness} = \frac{n \cdot M_3}{(n-1)(n-2)\, s^3}$$

where: $M_3$ is equal to $\sum_{i=1}^{n}(x_i - \text{Mean}_x)^3$, $n$ is the valid number of cases, and $s^3$ is the standard deviation (sigma) raised to the third power.
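The skewness formula can be sketched in Python as below. The sketch assumes that the standard deviation in the formula is the n-1 version described under Variance and standard deviation; the function name `skewness` is hypothetical.

```python
import math

def skewness(values):
    """n*M3 / [(n-1)*(n-2)*s^3], with s computed using the n-1 divisor."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three valid cases are required")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    m3 = sum((v - mean) ** 3 for v in values)
    s = math.sqrt(m2 / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined when all scores are equal")
    return n * m3 / ((n - 1) * (n - 2) * s ** 3)

print(skewness([1, 2, 3, 4, 5]))   # symmetric sample
print(skewness([1, 2, 3, 4, 10]))  # long right tail, positive value
```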

Kurtosis. The kurtosis is a measure of how "wide" or "skinny" ("flat" or "peaked") the distribution is for the respective variable, relative to the standard normal distribution (for which the kurtosis is equal to 0). The kurtosis measure is related to the fourth moment of the distribution. The kurtosis is defined as:

$$\text{Kurtosis} = \frac{n(n+1)\,M_4 - 3\,M_2^2\,(n-1)}{(n-1)(n-2)(n-3)\, s^4}$$

where: $M_j$ is equal to $\sum_{i=1}^{n}(x_i - \text{Mean}_x)^j$, $n$ is the valid number of cases, and $s^4$ is the standard deviation (sigma) raised to the fourth power.
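A Python sketch of the kurtosis formula follows. As with the skewness sketch, it assumes the standard deviation uses the n-1 divisor; the function name `kurtosis` is hypothetical.

```python
def kurtosis(values):
    """[n(n+1)M4 - 3*M2^2*(n-1)] / [(n-1)(n-2)(n-3)s^4], s using n-1."""
    n = len(values)
    if n < 4:
        raise ValueError("at least four valid cases are required")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    m4 = sum((v - mean) ** 4 for v in values)
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all scores are equal")
    s4 = (m2 / (n - 1)) ** 2  # variance squared equals s to the fourth power
    numerator = n * (n + 1) * m4 - 3 * m2 * m2 * (n - 1)
    return numerator / ((n - 1) * (n - 2) * (n - 3) * s4)

print(kurtosis([1, 2, 3, 4, 10]))
```

Because the value is expressed relative to the normal distribution, positive results indicate a more peaked or heavier-tailed shape and negative results a flatter one.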

See Descriptive Statistics - Quick tab for further details.