Correlations Introductory Overview - Significance of Correlations

The significance level calculated for each correlation is a primary source of information about the reliability of the correlation. To make it easier to identify coefficients that are significant at a desired level, the Product-Moment and Partial Correlations dialog in Basic Statistics and Tables provides an option to have STATISTICA highlight significant correlations in a different color. As explained before (see Elementary Concepts), the significance of a correlation coefficient of a particular magnitude changes with the size of the sample from which it was computed. The test of significance is based on the assumption that the residual values (i.e., the deviations from the regression line) for the dependent variable y follow the normal distribution, and that the variability of the residual values is the same for all values of the independent variable x. However, Monte Carlo studies suggest that meeting those assumptions closely is not absolutely crucial if your sample size is not very small. It is impossible to formulate precise recommendations based on those Monte Carlo results, but many researchers follow a rule of thumb: if your sample size is 50 or more, serious biases are unlikely, and if your sample size is over 100, you need not be concerned with the normality assumptions at all. There are, however, much more common and serious threats to the validity of the information that a correlation coefficient can provide; they are briefly discussed in the Correlations Overview topics.
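
To illustrate how the significance of a coefficient of a given magnitude depends on sample size, the following is a minimal Python sketch, not STATISTICA's own implementation. It assumes the normality and equal-variance conditions described above and computes the two-sided p-value for the null hypothesis of zero correlation by numerically integrating the null distribution of r, which should agree with the usual t test on n - 2 degrees of freedom. The function name correlation_p_value and the steps parameter are hypothetical, introduced only for this example.

```python
import math


def correlation_p_value(r, n, steps=2000):
    """Two-sided p-value for H0: rho = 0, given sample correlation r and size n.

    Hypothetical helper for illustration. Assumes the normality conditions
    discussed in the text. `steps` must be even (Simpson's rule).
    """
    if n < 4:
        raise ValueError("this sketch requires n >= 4")
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    if steps % 2:
        raise ValueError("steps must be even")

    # Under H0, r has density c * (1 - x^2)^((n - 4) / 2) on [-1, 1],
    # with c = Gamma((n - 1) / 2) / (sqrt(pi) * Gamma((n - 2) / 2)).
    log_c = (math.lgamma((n - 1) / 2) - math.lgamma((n - 2) / 2)
             - 0.5 * math.log(math.pi))
    c = math.exp(log_c)
    power = (n - 4) / 2

    def density(x):
        # max() guards against tiny negative values from rounding near x = 1.
        return c * max(0.0, 1.0 - x * x) ** power

    a, b = abs(r), 1.0
    if a == b:
        return 0.0

    # Composite Simpson's rule for the upper tail P(R >= |r|).
    h = (b - a) / steps
    total = density(a) + density(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(a + i * h)
    upper_tail = total * h / 3

    # The null distribution is symmetric, so double the upper tail.
    return min(1.0, 2.0 * upper_tail)


if __name__ == "__main__":
    # The same coefficient, r = 0.3, evaluated at several sample sizes.
    for n in (10, 50, 100):
        print(n, correlation_p_value(0.3, n))
```

Running the loop at the bottom shows the p-value for the same r = 0.3 shrinking as n grows: the coefficient is not significant at the 0.05 level with 10 cases but is with 100.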