HSUMMARIZE directive

Forms and prints a group-by-levels table for each test, together with appropriate summary statistics for each group.


Option

GROUPS = factor
Factor defining the groups; no default, i.e. this option must be specified


Parameters

DATA = variates
The data values

TEST = strings
Test type, defining how each variate is treated in the calculation of the similarity between each pair of units (simplematching, jaccard, russellrao, dice, antidice, sneathsokal, rogerstanimoto, cityblock, manhattan, ecological, euclidean, pythagorean, minkowski, divergence, canberra, braycurtis, soergel); default * ignores that variate

RANGE = scalars
Range of possible values of each variate; if omitted, the observed range is taken


Description

The HSUMMARIZE directive helps you to see which clusters, if any, are distinguished by each variate. It requires a factor to define the clusters, as well as the original data variates, together with their types and, optionally, their ranges. From this it prints a frequency table for each variate. Each table is classified by the grouping factor and the different values of the variate.
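   As an illustration only, the group-by-levels frequency table described above can be sketched in Python. This is not GenStat code and not GenStat's implementation; the function name `group_by_levels_table` and the use of `None` for a missing value are assumptions made for the example.

```python
from collections import Counter

def group_by_levels_table(groups, values):
    """Count how often each value occurs within each group.

    Returns (group labels, value levels, table), where table[g][v] is the
    number of units in group g that take the value v. Units whose value is
    None (treated here as missing; an assumption) are skipped.
    """
    counts = Counter((g, v) for g, v in zip(groups, values) if v is not None)
    group_labels = sorted({g for g, _ in counts})
    levels = sorted({v for _, v in counts})
    # A Counter returns 0 for combinations that never occur.
    table = {g: {v: counts[(g, v)] for v in levels} for g in group_labels}
    return group_labels, levels, table

# Hypothetical data: a grouping factor with two groups and one binary variate.
groups = [1, 1, 1, 2, 2, 2, 2]
values = [0, 0, 1, 1, 1, 1, None]
labels, levels, table = group_by_levels_table(groups, values)
for g in labels:
    print(g, [table[g][v] for v in levels])
# prints:
# 1 [2, 1]
# 2 [0, 3]
```

   A variate that separates the groups well shows rows with clearly different patterns, as in this small example where value 0 occurs only in group 1.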

   The option and parameters of the HSUMMARIZE directive are the same as those of the HLIST directive, and are described there.

   For qualitative variates (TEST settings simplematching to rogerstanimoto in the list above) the values are integral, and for each group GenStat calculates an interaction statistic labelled chi-square. This statistic does not have a significance level attached to it, but it does draw attention to groups for which the distribution is markedly different from the overall distribution.
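   The exact formula that GenStat uses for this statistic is not given here. As a rough sketch, and purely as an assumption, a Pearson-type measure can be computed in Python for each group by comparing its observed counts with the counts expected if the group followed the overall distribution. The function name `group_chi_square` and the example table are hypothetical.

```python
def group_chi_square(table):
    """Pearson-type statistic per group (an assumed form, not GenStat's own).

    table[g][v] is the number of units in group g taking value v.
    For each group, sums (observed - expected)^2 / expected over the values,
    where expected = group size * overall proportion of that value.
    """
    levels = sorted({v for row in table.values() for v in row})
    col_totals = {v: sum(row.get(v, 0) for row in table.values()) for v in levels}
    total = sum(col_totals.values())
    stats = {}
    for g, row in table.items():
        n_g = sum(row.values())
        stat = 0.0
        for v in levels:
            expected = n_g * col_totals[v] / total
            if expected > 0:  # skip values that occur in no group
                stat += (row.get(v, 0) - expected) ** 2 / expected
        stats[g] = stat
    return stats

# Hypothetical frequency table: two groups, one binary variate.
table = {1: {0: 2, 1: 1}, 2: {0: 0, 1: 3}}
stats = group_chi_square(table)
print(stats)  # prints {1: 1.5, 2: 1.5}
```

   Larger values flag groups whose distribution departs more from the overall one; as with the GenStat statistic, no significance level is attached.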

   For quantitative variates, values are rounded to the nearest point on an 11-point scale (0-10). The interaction statistic is analogous to Student's t, and it draws attention to the groups for which the mean variate value is markedly different from the overall mean (again with no significance level attached). Missing values are ignored in the computation of these statistics.
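   The rounding onto the 11-point scale can be pictured with a short Python sketch. It assumes a linear rescaling of the range onto 0-10, that halves round upwards, and that all values lie within the range used; none of these details is confirmed here, and the function name `to_eleven_point_scale` is hypothetical. The `value_range` argument plays the role of the RANGE parameter: when it is omitted, the observed range is used.

```python
import math

def to_eleven_point_scale(values, value_range=None):
    """Map values linearly onto the integer scale 0..10.

    value_range is a (low, high) pair; if omitted, the observed range of the
    non-missing values is used. Missing values (None) stay missing.
    """
    present = [v for v in values if v is not None]
    low, high = value_range if value_range is not None else (min(present), max(present))
    if high == low:
        raise ValueError("range has zero width; values cannot be rescaled")
    # floor(x + 0.5) rounds halves upwards, unlike Python's built-in round().
    return [
        None if v is None else math.floor(10 * (v - low) / (high - low) + 0.5)
        for v in values
    ]

values = [2.0, 3.5, 5.0, None, 7.0]
print(to_eleven_point_scale(values))           # prints [0, 3, 6, None, 10]
print(to_eleven_point_scale(values, (0, 10)))  # prints [2, 4, 5, None, 7]
```

   Supplying a range changes which scale point each value falls on, so tables for the same variate can differ depending on whether RANGE is set.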

 

Option: GROUPS.

Parameters: DATA, TEST, RANGE.


Action with RESTRICT

You can restrict any of the DATA variates to do the calculations for only a subset of the units. If more than one of these is restricted, then they must all be restricted to the same set of units.
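The rule about matching restrictions can be expressed as a small check. The Python sketch below is illustrative only and does not reproduce GenStat's internal logic; the function name `common_restriction` and the representation of a restriction as a set of unit numbers (or `None` for an unrestricted variate) are assumptions.

```python
def common_restriction(restrictions):
    """Return the single set of units shared by all restricted variates.

    restrictions maps a variate name to a set of unit numbers, or to None
    if that variate is unrestricted. Returns None when nothing is restricted
    and raises ValueError when the restricted variates disagree.
    """
    restricted = {
        name: frozenset(units)
        for name, units in restrictions.items()
        if units is not None
    }
    distinct = set(restricted.values())
    if len(distinct) > 1:
        raise ValueError("restricted variates must share the same set of units")
    return next(iter(distinct), None)

# Two variates restricted to the same units, one unrestricted: accepted.
subset = common_restriction({"x": {1, 2, 3}, "y": None, "z": {3, 2, 1}})
print(sorted(subset))  # prints [1, 2, 3]
```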