# Basic Statistics - Introductory Overview

The statistics included in this module are conventionally called basic statistics and are often discussed as a group because they are usually used as a group in the initial, exploratory phase of data analysis. In fact, they include tests that serve different purposes. In this introduction, we will briefly review each of the basic statistics available in this module. Further information can be found in the Examples and in statistical textbooks. Recommended introductory textbooks are: Kachigan (1986), and Runyon and Haber (1976); for a more advanced discussion of elementary theory of basic statistics, see the classic books by Hays (1988), and Kendall and Stuart (1979).

Specific types of basic statistics and tables available in this module, and their uses, are described in some detail in the subsequent help topics. Note that the Statistica Data Miner Interactive Drill-Down Explorer also provides highly interactive options for computing various statistical and graphical summaries for selected variables, based only on interactively chosen groups and sub-groups.

### Descriptive Statistics Introductory Overview

Select Descriptive Statistics from the Basic Statistics and Tables Startup Panel to compute summary statistics such as means, medians, standard deviations, etc. Note that additional ordinal descriptive statistics (including user-specified percentages) can also be computed in the Nonparametric Statistics module.
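Statistica computes these summaries through its dialogs; purely as an illustration, the same descriptive statistics can be sketched in Python with the standard-library statistics module (the data below are invented):

```python
import statistics

data = [4.2, 5.1, 3.8, 6.0, 5.5, 4.9, 5.3]

mean = statistics.mean(data)      # arithmetic mean
median = statistics.median(data)  # middle value of the sorted data
stdev = statistics.stdev(data)    # sample standard deviation (n - 1 denominator)

print(f"mean={mean:.3f} median={median:.3f} sd={stdev:.3f}")
```

Note that statistics.stdev uses the sample (n - 1) denominator, which matches the usual convention for descriptive statistics computed from a sample.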

### Correlations Introductory Overview

Select Correlation matrices to compute Pearson product-moment correlations (the term correlation was first used by Galton, 1888; the correlation coefficient was first introduced by Pearson, 1896). Use this basic statistics module to compute square and rectangular correlation matrices, as well as expanded format correlation matrices with pairwise n and significance levels. Note that nonparametric correlation statistics (e.g., Spearman R, Kendall tau, Gamma, etc.) can also be computed in the Nonparametric Statistics module and various distance measures can be computed in the Cluster Analysis module.
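As a sketch of what such a correlation analysis computes (not the Statistica implementation), the Pearson correlation coefficient together with its significance level and pairwise n can be obtained with SciPy; the data below are simulated:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 0.6 * x + rng.normal(scale=0.8, size=50)  # correlated with x by construction

# Pearson product-moment correlation and its two-sided significance level.
r, p = stats.pearsonr(x, y)
print(f"r={r:.3f}, p={p:.4f}, n={len(x)}")
```

For a full square or rectangular correlation matrix, the same computation is repeated for each pair of variables, with the pairwise n reflecting how missing data were handled.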

### t-test for Independent Samples Introductory Overview

Select t-test, independent... to compare means for two groups (within a variable). Note that various nonparametric tests for comparing groups are also available in the Nonparametric Statistics module; methods for comparing groups of (partially) censored observations are also available in the Survival Analysis module.
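As an illustration of the test this option performs, a t-test for two independent groups can be sketched with SciPy (the group data here are made up):

```python
from scipy import stats

group_a = [23.1, 25.4, 22.8, 26.0, 24.7, 23.9]
group_b = [27.2, 28.1, 26.5, 29.0, 27.8, 28.4]

# Two-sided t-test for independent samples. The classical test assumes
# equal variances; pass equal_var=False for Welch's variant.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t={t_stat:.3f}, p={p_value:.4f}")
```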

### t-test for Dependent Samples Introductory Overview

Select t-test, dependent samples to compare means among variables measured in the same sample of cases (subjects, individuals; so-called dependent samples). Note that various nonparametric tests for comparing variables are also available in the Nonparametric Statistics module.

### t-test for Single Means Introductory Overview

Select t-test, single sample to test whether a single mean (i.e., a mean for a single population) is equal to a given value.
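Both the dependent-samples and single-sample t-tests described above can be sketched with SciPy; the paired measurements below are hypothetical (e.g., a before/after design):

```python
from scipy import stats

# Paired measurements on the same cases (so-called dependent samples).
before = [112, 105, 118, 121, 109, 115, 120, 111]
after  = [108, 101, 113, 119, 104, 112, 116, 107]

# Dependent-samples t-test on the pairwise differences.
t_paired, p_paired = stats.ttest_rel(before, after)

# Single-sample t-test: is the mean of `after` equal to a given value (110)?
t_one, p_one = stats.ttest_1samp(after, popmean=110)

print(f"paired: t={t_paired:.3f}, p={p_paired:.4f}")
print(f"one-sample: t={t_one:.3f}, p={p_one:.4f}")
```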

### Breakdown: Descriptive Statistics by Groups Introductory Overview

Select Breakdown & one-way ANOVA to compute various descriptive statistics (e.g., means, standard deviations, correlations, etc.) broken down by one or more categorical variables (e.g., by Gender and Region). This option can also perform a one-way ANOVA on selected variables, and it is particularly suited for analyzing single-factor designs with a very large number of groups (e.g., more than 200 groups, such as some designs used in agricultural research). Note that the Statistica Data Miner Interactive Drill-Down Explorer also provides highly interactive options for computing various statistical and graphical summaries for selected variables, based only on interactively chosen groups and sub-groups.
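A breakdown of descriptive statistics by group, followed by a one-way ANOVA, can be sketched with pandas and SciPy (the grouping variable and yields below are invented):

```python
import pandas as pd
from scipy import stats

df = pd.DataFrame({
    "region": ["N", "N", "N", "S", "S", "S", "W", "W", "W"],
    "yield_": [5.1, 4.8, 5.3, 6.2, 6.0, 6.5, 4.2, 4.0, 4.4],
})

# Descriptive statistics broken down by the categorical grouping variable.
summary = df.groupby("region")["yield_"].agg(["mean", "std", "count"])
print(summary)

# One-way ANOVA across the groups defined by the breakdown.
groups = [g["yield_"].values for _, g in df.groupby("region")]
f_stat, p_value = stats.f_oneway(*groups)
print(f"F={f_stat:.2f}, p={p_value:.4f}")
```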

### Frequency Tables Introductory Overview

Select Frequency tables to compute frequency tables (and histograms). Statistica provides various options for determining the categories for the frequency table (e.g., integer intervals, specific codes, etc.). You can also tabulate the data according to logical categorization conditions that can be directly typed in.
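As a sketch of how such a frequency table is formed from user-specified intervals (here, integer intervals of width 10; the values are made up):

```python
import pandas as pd

values = [12, 15, 22, 7, 18, 25, 31, 14, 9, 28, 19, 23]

# Bin the values into the specified intervals and tabulate frequencies.
bins = [0, 10, 20, 30, 40]
table = pd.cut(pd.Series(values), bins=bins).value_counts().sort_index()
print(table)
```

A histogram is essentially a graphical rendering of the same table, with one bar per interval.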

### Crosstabulation & Stub-and-Banner Tables Introductory Overview

Select Tables and banners to crosstabulate data and produce various types of crosstabulation tables. A wide variety of statistics for two-way tables are also available. Note that the Log-Linear module and the Correspondence Analysis module will also tabulate multi-way crosstabulation tables, and perform analyses on such tables.
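A two-way crosstabulation with marginal totals, together with one of the usual statistics for two-way tables (the Pearson chi-square), can be sketched with pandas and SciPy; the data below are hypothetical:

```python
import pandas as pd
from scipy import stats

df = pd.DataFrame({
    "gender": ["M", "F", "F", "M", "F", "M", "F", "M", "F", "M"],
    "choice": ["A", "A", "B", "B", "A", "A", "B", "B", "A", "A"],
})

# Two-way crosstabulation with row and column totals.
table = pd.crosstab(df["gender"], df["choice"], margins=True)
print(table)

# Pearson chi-square test on the 2x2 table (computed without the margins).
chi2, p, dof, expected = stats.chi2_contingency(
    pd.crosstab(df["gender"], df["choice"])
)
print(f"chi-square={chi2:.3f}, df={dof}, p={p:.4f}")
```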

### Multiple Responses/Dichotomies Introductory Overview

You can use the Basic Statistics and Tables module to compute summary tables for so-called multiple response variables as well as multiple dichotomies. Usually, categorical variables or factors divide the sample into exclusive groups, for example, into males and females. Obviously, only a single categorical variable is necessary to code the gender of the subjects in the data file. However, in some areas of research, the categories of interest are not mutually exclusive. For example, in a marketing research survey, respondents may be asked to list their top three preferences for soft drinks. A total of, for example, 60 different soft drinks may be mentioned in the survey, which can be coded accordingly in three categorical variables (the first three preferences). In this case, the categories are not mutually exclusive: a person may mention three different soft drinks as his or her favorites. Such categorical variables are called multiple response variables (multiple dichotomies are similar in nature) and can be easily analyzed with the Basic Statistics and Tables module.

### Other Significance Tests Introductory Overview

Select Difference tests, r, %, means to display the Difference tests, r, %, means dialog box, in which you can perform various statistical tests for comparing means, correlations, percentages, etc. Refer to that dialog topic for more information.
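As one illustration of what such difference tests compute, the comparison of two independent Pearson correlations can be sketched in plain Python via Fisher's r-to-z transformation (the correlations and sample sizes below are invented; this is a standard-formula sketch, not the Statistica implementation):

```python
import math

def compare_correlations(r1, n1, r2, n2):
    """Two-sided test for the difference between two independent
    Pearson correlations via Fisher's r-to-z transformation."""
    z1 = math.atanh(r1)  # Fisher z of each correlation
    z2 = math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the normal distribution
    return z, p

z, p = compare_correlations(0.60, 100, 0.30, 100)
print(f"z={z:.3f}, p={p:.4f}")
```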

### Probability Calculator Introductory Overview

Select Probability calculator to compute significance levels for various statistics as well as critical values (e.g., the critical value of chi-square significant at the p = .05 level; the term chi-square was first used by Pearson, 1900); various hypothesis-testing facilities are also included (e.g., for comparing means, correlations, percentages, etc.). For example, you can type in the desired significance level and sample size, and compute the magnitude of the smallest correlation coefficient that would be significant at that level.
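As an illustration of the kind of computation the Probability Calculator performs, the chi-square critical value at p = .05 and the smallest correlation significant at that level for a given sample size can be sketched with SciPy (a standard-formula sketch, not the Statistica implementation):

```python
import math
from scipy import stats

# Critical value of chi-square at p = .05 with 1 degree of freedom.
chi2_crit = stats.chi2.ppf(0.95, df=1)
print(f"chi-square critical (df=1, p=.05): {chi2_crit:.4f}")  # ~3.8415

# Smallest |r| significant at p = .05 (two-sided) for sample size n,
# from the t distribution via r = t / sqrt(t**2 + n - 2).
n = 30
t_crit = stats.t.ppf(0.975, df=n - 2)
r_crit = t_crit / math.sqrt(t_crit**2 + n - 2)
print(f"critical r for n={n}: {r_crit:.3f}")
```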