Form Similarity Matrix
This menu forms a similarity matrix from a set of variates. The similarity coefficient that is calculated allows variables to be qualitative, quantitative or dichotomous, or mixtures of these types; values of some of the variables may be missing for some samples. The values of a similarity coefficient vary between zero and unity: two samples have a similarity of unity only when both have identical values for all variables; a value of zero occurs when the values for the two samples differ maximally for all variables.

Data Values

This specifies the variates and the similarity type of each variate. The similarity type of a variate determines how differences in its values between units contribute to the overall similarity between those units. Variates can be added to this list by double-clicking on a variate name within the Available Data list. Alternatively, multiple selections can be transferred from the Available Data list by clicking the transfer button. When a variate name is transferred from the Available Data list, its type is set to the measure currently selected in the Default type of test list. The type for a variate can be changed within the Data Values list by double-clicking on the variate in this list and selecting a new similarity measure from the resulting dialog.

Similarity Measures

Jaccard is appropriate for dichotomous variables and simple matching for qualitative variables; the other settings give different ways of handling quantitative variables. For a pair of units i and j with values xi and xj of a variable, each type gives the following contribution to the similarity, with the following weight:

Type                  Contribution                      Weight
Jaccard               if xi = xj = 1, then 1            1
                      if xi = xj = 0, then 0            0
                      if xi /= xj, then 0               1
Simple Matching       if xi = xj, then 1                1
                      if xi /= xj, then 0               1
Dice                  if xi = xj = 1, then 1            1
                      if xi = xj = 0, then 0            0
                      if xi /= xj, then 0               0.5
Sneath and Sokal      if xi = xj, then 1                1
                      if xi /= xj, then 0               0.5
Russell and Rao       if xi = xj, then 1                1
                      if xi = 0 or xj = 0, then 0       1
Antidice              if xi = xj = 1, then 1            1
                      if xi = xj = 0, then 0            0
                      if xi /= xj, then 0               2
Rogers and Tanimoto   if xi = xj, then 1                1
                      if xi /= xj, then 0               2
Cityblock             1 - |xi - xj| / range             1
Manhattan             synonymous with cityblock
Ecological            1 - |xi - xj| / range             1
                      unless xi = xj = 0                0
Euclidean             1 - {(xi - xj) / range}^2         1
Pythagorean           synonymous with Euclidean
Divergence            1 - {(xi - xj) / (xi + xj)}^2     1
Canberra              1 - |xi - xj| / (|xi| + |xj|)     1/p
Bray and Curtis       1 - |xi - xj|                     xi + xj
Soergel               1 - |xi - xj|                     max(xi, xj)

The measure of similarity is formed by multiplying each contribution by the corresponding weight, summing all these values, and then dividing by the sum of the weights.
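
The weighted-average rule can be illustrated with a minimal Python sketch. It is not the menu's implementation: the function names, the type labels ("jaccard", "simplematching", "cityblock", "euclidean") and the choice to skip a variable when either value is missing (represented here by None) are assumptions made for illustration, and only four of the types are covered.

```python
def contribution(kind, xi, xj, value_range=None):
    """Return (contribution, weight) for one variable and one pair of units.

    Hypothetical helper covering four of the types in the table.
    """
    if kind == "jaccard":
        if xi == 1 and xj == 1:
            return 1.0, 1.0
        if xi == 0 and xj == 0:
            return 0.0, 0.0  # joint absence: weight 0, so it is ignored
        return 0.0, 1.0
    if kind == "simplematching":
        return (1.0 if xi == xj else 0.0), 1.0
    if kind == "cityblock":
        return 1.0 - abs(xi - xj) / value_range, 1.0
    if kind == "euclidean":
        return 1.0 - ((xi - xj) / value_range) ** 2, 1.0
    raise ValueError(f"unsupported similarity type: {kind}")


def similarity(unit_i, unit_j, kinds, ranges):
    """Sum of contribution * weight, divided by the sum of the weights."""
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, xj, kind, value_range in zip(unit_i, unit_j, kinds, ranges):
        # Assumption: a variable missing for either unit is left out.
        if xi is None or xj is None:
            continue
        c, w = contribution(kind, xi, xj, value_range)
        weighted_sum += c * w
        weight_total += w
    if weight_total == 0:
        return None  # no variable carries any weight for this pair
    return weighted_sum / weight_total


# Two dichotomous variables, one quantitative variable (range 8),
# and one qualitative variable that is missing for the first unit.
kinds = ["jaccard", "jaccard", "cityblock", "simplematching"]
ranges = [None, None, 8.0, None]
unit_a = [1, 0, 2.0, None]
unit_b = [1, 0, 4.0, "red"]

sim_ab = similarity(unit_a, unit_b, kinds, ranges)
sim_aa = similarity(unit_a, unit_a, kinds, ranges)
print(sim_ab, sim_aa)
```

Here the first variable contributes 1 with weight 1, the second has weight 0, and the third contributes 1 - 2/8 = 0.75 with weight 1, so the similarity of the two units is (1 + 0.75) / 2 = 0.875. Comparing a unit with itself gives a similarity of unity.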

Available Data

This lists data structures appropriate to the current input field. The contents will change as you move from one field to the next. Double-click on a name to copy it to the current input field; alternatively, you can type the name directly into the input field.

Default type of test

This specifies the default similarity type assigned to items when they are added to the Data Values list, for example when you double-click on a variate name within the Available Data list to transfer it to the Data Values list.

Name of new matrix

Specifies the identifier of the symmetric matrix in which the similarity matrix is to be saved.

Unit Labels

Allows you to specify a text or variate which is to be used to label the rows of the similarity matrix.
