SPEARMAN procedure
Calculates Spearman's Rank Correlation Coefficient (S.J. Welham, N.M. Maclaren & H.R. Simpson).
Options
Parameters
Description
SPEARMAN calculates Spearman's Rank Correlation Coefficient between pairs of samples. The samples can be stored in different variates and supplied in a list with the DATA pointer. Alternatively, they can all be placed in a single variate, with the GROUPS option set to a factor that indicates the sample to which each unit belongs. If more than two variates are specified, the full correlation matrix between all pairs of variates is formed.
If the sample size is 8 or more (i.e. large enough for the approximation to be valid), the Student's t approximation is calculated. Otherwise SPEARMAN obtains significance levels from stored tables.
The results can be displayed using the test setting of option PRINT, and saved using the options CORRELATION, T and DF. The PRINT setting ranks prints the vector of ranks for each sample, and the setting correlations displays only the correlations. The ranks from each sample can be saved using the RANKS parameter.
Options: PRINT, GROUPS, CORRELATION, T, DF. Parameters: DATA, RANKS.
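The two data layouts described above, and the matrix formed for more than two samples, can be illustrated with a short Python sketch. This is not Genstat code and does not reproduce SPEARMAN itself: the function names are hypothetical, and the sketch assumes that units within each group are paired by their order of appearance, which the text above does not state.

```python
from math import sqrt


def midranks(values):
    """Rank values 1..N, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_correlation(a, b):
    """Spearman's coefficient, computed as the correlation of the midranks."""
    if len(a) != len(b):
        raise ValueError("samples must have the same number of units")
    x, y = midranks(a), midranks(b)
    mean = (len(x) + 1) / 2  # the mean of N midranks is always (N+1)/2
    sxy = sum((xi - mean) * (yi - mean) for xi, yi in zip(x, y))
    sxx = sum((xi - mean) ** 2 for xi in x)
    syy = sum((yi - mean) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("coefficient undefined when a sample is constant")
    return sxy / sqrt(sxx * syy)


def split_by_group(values, groups):
    """Turn the single-variate layout into one list per group label.

    Assumption: units keep their order of appearance within each group.
    """
    samples = {}
    for value, label in zip(values, groups):
        samples.setdefault(label, []).append(value)
    return samples


def correlation_matrix(samples):
    """Correlations between all pairs of samples, keyed by (label, label)."""
    labels = list(samples)
    return {(p, q): rank_correlation(samples[p], samples[q])
            for p in labels for q in labels}


# Single-variate layout: all values in one list, with a grouping "factor".
values = [1, 2, 3, 4, 2, 4, 6, 9, 4, 3, 2, 1]
groups = ["a"] * 4 + ["b"] * 4 + ["c"] * 4
matrix = correlation_matrix(split_by_group(values, groups))
print(matrix[("a", "b")], matrix[("a", "c")])
```

Here samples a and b rise together while c falls, so the a-b entry is close to 1 and the a-c entry is close to -1.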
Method
Spearman's Rank Correlation Coefficient is a measure of association between the rankings of two variables measured on N individuals (i.e. two vectors of length N). The correlation coefficient is calculated from the two vectors of ranks for the samples. Let { X_i ; i = 1...N } and { Y_i ; i = 1...N } be the vectors of ranks for sample 1 and sample 2 respectively. The coefficient r is then based on the vector of differences between ranks, { D_i = X_i - Y_i ; i = 1...N }, and is calculated by
r = 1 - 6 × ∑_{i=1...N} D_i^2 / [ N(N^2 - 1) ].
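As a worked illustration of this formula, the following Python sketch ranks two small samples and computes r from the rank differences. It is not part of SPEARMAN: the function names are hypothetical, and the sketch assumes that there are no tied values within either sample.

```python
def ranks_no_ties(values):
    """Rank values from 1 (smallest) to N, assuming all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def spearman_no_ties(sample1, sample2):
    """r = 1 - 6 * sum(D_i^2) / (N * (N^2 - 1)), valid only without ties."""
    n = len(sample1)
    if n != len(sample2) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 units")
    x = ranks_no_ties(sample1)
    y = ranks_no_ties(sample2)
    sum_d2 = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Ranks are X = 1,2,3,4,5 and Y = 1,3,2,4,5, so D = 0,-1,1,0,0 and sum(D^2) = 2.
r = spearman_no_ties([10, 20, 30, 40, 50], [12, 25, 21, 48, 60])
print(r)  # 1 - 12/120, i.e. approximately 0.9
```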
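If ties are present, the statistic above is biased, and the coefficient must instead be calculated taking account of ties by:
r = ( ∑X_i^2 + ∑Y_i^2 - ∑D_i^2 ) / ( 2 × √( ∑X_i^2 × ∑Y_i^2 ) )
where ∑X_i^2 = (N^3 - N)/12 - T_x ;
∑Y_i^2 = (N^3 - N)/12 - T_y ;
T_k = ∑_j ( t_j^3 - t_j )/12 for k = x, y ;
the sum in T_k is over the groups of tied observations in sample k, and t_j is the number of observations in the group with rank j. In this formula ∑X_i^2 and ∑Y_i^2 denote the sums of squared deviations of the ranks from their mean (N+1)/2, not the raw sums of squared ranks.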
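The tie-corrected calculation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used by SPEARMAN: the function names are hypothetical, and tied observations are assumed to receive the mean of the ranks they span (midranks), which is the convention under which the correction terms T_x and T_y apply.

```python
from collections import Counter
from math import sqrt


def midranks(values):
    """Rank values 1..N, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def tie_term(ranks):
    """T_k = sum over groups of equal rank of (t_j^3 - t_j) / 12."""
    return sum(t ** 3 - t for t in Counter(ranks).values()) / 12


def spearman_tie_corrected(sample1, sample2):
    """Tie-corrected coefficient built from the corrected sums of squares."""
    n = len(sample1)
    if n != len(sample2) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 units")
    x = midranks(sample1)
    y = midranks(sample2)
    sum_d2 = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    sum_x2 = (n ** 3 - n) / 12 - tie_term(x)
    sum_y2 = (n ** 3 - n) / 12 - tie_term(y)
    if sum_x2 == 0 or sum_y2 == 0:
        raise ValueError("coefficient undefined when a sample is constant")
    return (sum_x2 + sum_y2 - sum_d2) / (2 * sqrt(sum_x2 * sum_y2))


# The first sample has one tied pair, so T_x = (2^3 - 2)/12 = 0.5 and T_y = 0.
r_tied = spearman_tie_corrected([1, 2, 2, 4, 5], [3, 1, 2, 5, 4])
print(r_tied)
```

With no ties both correction terms are zero, and the result agrees with the simpler formula based on ∑D_i^2 alone.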
The t-approximation for this statistic, T, is valid for samples of size 8 upwards, and is calculated by
T = r × √[ (N - 2)/(1 - r^2) ].
It has approximately a t-distribution on N-2 degrees of freedom, and can be used to test the null hypothesis of independence between the samples. (See, for example, Siegel 1956, pages 202-213.)
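The t-approximation is simple to evaluate once r is known. The Python sketch below computes T and its degrees of freedom. The function name is hypothetical, and the sketch stops short of a significance level, which would require the t-distribution function from a statistics library.

```python
from math import sqrt


def spearman_t(r, n):
    """Return (T, df) with T = r * sqrt((N - 2) / (1 - r^2)) and df = N - 2."""
    if n < 3:
        raise ValueError("need at least 3 units for N - 2 degrees of freedom")
    if abs(r) >= 1:
        raise ValueError("T is undefined when r is exactly +1 or -1")
    return r * sqrt((n - 2) / (1 - r * r)), n - 2


# With r = 0.6 from N = 10 units: T = 0.6 * sqrt(8 / 0.64), on 8 df.
t_value, df = spearman_t(0.6, 10)
print(t_value, df)
```

The sketch does not enforce the sample-size condition; for samples of fewer than 8 units the approximation is not considered valid.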
Exact critical values for sample sizes of 4-50 are given by Siegel & Castellan (1988) Table Q.
Action with RESTRICT
If any of the variates in DATA is restricted, the statistic is calculated only for the set of units not excluded by the restriction.
References
Siegel, S. (1956). Nonparametric Statistics for the Behavioural Sciences. McGraw-Hill, New York.
Siegel, S. & Castellan, N.J. (1988). Nonparametric Statistics for the Behavioural Sciences (second edition). McGraw-Hill, New York.