Canonical Analysis - Introductory Overview

Canonical Analysis

Several STATISTICA modules compute measures of correlation to express the relationship between two or more variables. For example, the standard Pearson product moment correlation coefficient (r) measures the extent to which two variables are related; Nonparametric Statistics offers various measures of relationships that are based on the similarity of ranks in two variables; Multiple Regression and General Regression Models (GRM) allow you to assess the relationship between a dependent variable and a set of independent variables (GRM also performs multivariate analyses for multiple dependent variables); the General Linear Models (GLM) module is a full implementation of the so-called general linear model; Multiple Correspondence Analysis is useful for exploring the relationships between a set of categorical variables; the Generalized Linear/Nonlinear Models (GLZ) module allows you to analyze nonlinear relationships between a set of continuous or categorical predictor variables and a continuous or categorical dependent variable; and the General Partial Least Squares (PLS) module is particularly well suited for analyzing problems with very many predictor variables and one or more dependent variables.

Canonical correlation is an additional procedure for assessing the relationship between variables. Specifically, this module allows us to investigate the relationship between two sets of variables. For example, an educational researcher may want to assess the (simultaneous) relationship between three measures of scholastic ability and five measures of success in school. A sociologist may want to investigate the relationship between two predictors of social mobility based on interviews and actual subsequent social mobility as measured by four different indicators. A medical researcher may want to study the relationship between various risk factors and the development of a group of symptoms. In all of these cases, the researcher is interested in the relationship between two sets of variables, and canonical correlation would be the appropriate method of analysis.
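To make the idea of relating two sets of variables concrete, the following is a minimal sketch in Python with NumPy. It is not the STATISTICA implementation; the function name canonical_correlations and the simulated "ability" and "success" data are hypothetical and used only for illustration. The sketch assumes that both sets have full column rank and more cases than variables, and it uses one standard numerical approach (QR decomposition of each centered set followed by a singular value decomposition).

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between two variable sets, largest first.

    X and Y are (cases x variables) arrays measured on the same cases.
    Illustrative helper (hypothetical name); assumes full column rank.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Center each variable so that cross-products behave like covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the two centered sets.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx' Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    # Guard against rounding slightly outside the interval [0, 1].
    return np.clip(s, 0.0, 1.0)


# Simulated example: 3 "ability" measures and 5 "success" measures
# on 200 cases (made-up data, for illustration only).
rng = np.random.default_rng(0)
ability = rng.normal(size=(200, 3))
weights = rng.normal(size=(3, 5))
success = ability @ weights + rng.normal(size=(200, 5))

# There are min(3, 5) = 3 canonical correlations.
print(canonical_correlations(ability, success))
```

When each set contains a single variable, the one canonical correlation returned by this sketch equals the absolute value of the Pearson r between the two variables, which is one way to see canonical correlation as an extension of the ordinary correlation coefficient.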

Canonical correlation is a special case of the general linear model; a complete and flexible implementation can be found in the General Linear Models (GLM) module (see also the General Regression Models (GRM) module for stepwise and best subset model building techniques).

In the following topics, we will briefly introduce the major concepts and statistics in canonical correlation analysis. We will assume that you are familiar with the correlation coefficient as described in Basic Statistics, and the basic ideas of multiple regression as described in the Introductory Overview section of Multiple Regression.

General Ideas

See also Exploratory Data Analysis and Data Mining Techniques.