GROUPS directive
Forms a factor (or grouping variable) from a variate or text, together with the set of distinct values that occur.
Options
Parameters
Description
The GROUPS directive is designed to form factors from variates or texts. The variates and texts are specified by the VECTOR parameter, and the factors by the FACTOR parameter. With the simplest use of GROUPS you need specify no more than that, and each factor is defined to have a level for every distinct value of its corresponding variate or text. You need not have declared the factor already; it will be declared automatically if necessary.
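As a rough illustration of this default behaviour, the following Python sketch (not Genstat code) forms one level for every distinct value of a vector. The function name form_factor and the 1-based integer codes are assumptions made for the illustration; they are not part of GROUPS itself.

```python
def form_factor(vector):
    """Return (levels, codes): one level per distinct value, in ascending order."""
    levels = sorted(set(vector))
    # Map each distinct value to its 1-based position in the levels list.
    position = {value: i + 1 for i, value in enumerate(levels)}
    codes = [position[value] for value in vector]
    return levels, codes


levels, codes = form_factor([2.5, 1.0, 2.5, 4.0, 1.0])
```

Here the three distinct values become the levels, and each unit is recorded by the position of its value among them.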
Alternatively, you can divide the values of the variate or text into groups to be represented by the factor. You can use the LIMITS parameter to specify the range of values for each group. The limits vector is a variate or a text, depending on whether the factor is being defined from a variate or a text; its values specify boundaries for the ranges. The BOUNDARIES option controls whether these are regarded as upper or lower boundaries; by default BOUNDARIES=lower. You can also ask GROUPS itself to set limits that will partition the units into groups of nearly equal size. You should then specify the NGROUPS option and leave the LIMITS parameter unset. (If you give both LIMITS and NGROUPS, NGROUPS is ignored.)
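The effect of limits can be sketched in Python (again, not Genstat code). Two assumptions are made here and are not confirmed by the text above: with lower boundaries a value equal to a limit joins the group above it, and with upper boundaries it joins the group below it; and the automatic limits are taken at evenly spaced positions in the sorted values. The function names group_by_limits and equal_size_limits are hypothetical.

```python
from bisect import bisect_left, bisect_right


def group_by_limits(values, limits, boundaries="lower"):
    """Assign each value a 1-based group number from ascending limits.

    Assumption: with "lower", a value equal to a limit joins the group
    above it; with "upper", it joins the group below it.
    """
    locate = bisect_right if boundaries == "lower" else bisect_left
    return [locate(limits, v) + 1 for v in values]


def equal_size_limits(values, ngroups):
    """Choose lower-boundary limits that split the sorted values into
    ngroups groups of nearly equal size (tied values may unbalance them)."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // ngroups] for k in range(1, ngroups)]


values = [3, 10, 7, 12, 15, 1]
lower = group_by_limits(values, [5, 10], "lower")
upper = group_by_limits(values, [5, 10], "upper")
auto_limits = equal_size_limits(values, 3)
auto_groups = group_by_limits(values, auto_limits, "lower")
```

Two limits give three groups. The value 10, which coincides with a limit, falls in group 3 under the lower-boundary assumption and in group 2 under the upper-boundary one.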
If you are defining a factor from a variate VECTOR, the LMETHOD option controls how the levels vector is formed. The default LMETHOD=median forms the levels from the median of the units in each group. There are also settings to allow them to be formed from minima or maxima. With any of these settings (median, minimum or maximum) you can specify a variate, using the LEVELS parameter, to store the levels that are produced; this can be done even if no factor is being formed, that is, if no identifier is supplied for the factor by the FACTOR list. Alternatively, if you put LMETHOD=given, you can use the LEVELS parameter to supply your own levels. Finally, for LMETHOD=*, no levels are formed and any existing levels of the factor will be retained if they are still appropriate; otherwise the levels will be the integers 1 upwards. With any of these settings, you can use the LABELS parameter to specify labels for the factor.
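The median, minimum and maximum settings can be pictured with a short Python sketch (not Genstat code). It assumes the group membership of each unit is already known; the function name group_levels is hypothetical.

```python
from statistics import median


def group_levels(values, groups, lmethod="median"):
    """Summarise the values in each group to give that group's level."""
    summarise = {"median": median, "minimum": min, "maximum": max}[lmethod]
    members = {}
    for value, group in zip(values, groups):
        members.setdefault(group, []).append(value)
    # One level per group, taken in ascending order of group number.
    return [summarise(members[g]) for g in sorted(members)]


values = [1, 3, 7, 10, 12, 15]
groups = [1, 1, 2, 2, 3, 3]
by_median = group_levels(values, groups, "median")
by_minimum = group_levels(values, groups, "minimum")
by_maximum = group_levels(values, groups, "maximum")
```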
Similar rules apply if you have a text VECTOR except that LMETHOD then governs how the labels are defined for the factor, and LEVELS can be used to specify its levels. The CASE option controls whether the case of the letters in the text strings is important. So, for example, if you set CASE=ignored the strings 'April' and 'april' will be put into the same group. With the default, CASE=significant, they would form different groups.
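The effect of CASE on the number of groups can be sketched in Python (not Genstat code). Lower-casing is used here only as a stand-in for case-insensitive comparison; which spelling GROUPS keeps as the label is not stated above, so the sketch counts groups rather than naming them. The function name count_groups is hypothetical.

```python
def count_groups(strings, case="significant"):
    """Count the distinct strings, optionally ignoring letter case."""
    if case == "ignored":
        # Assumption: lower-casing stands in for case-insensitive matching.
        strings = [s.lower() for s in strings]
    return len(set(strings))


months = ["April", "april", "May"]
n_significant = count_groups(months, "significant")
n_ignored = count_groups(months, "ignored")
```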
The LDIRECTION option controls the ordering of the levels (for a variate VECTOR) or the labels (for a text VECTOR) when LMETHOD is set to median, minimum or maximum. By default, they are sorted into ascending order, but you can set LDIRECTION=given to take them in the order in which they occur in the VECTOR. This may be useful, for example, if a text vector contains the names of days or of months in calendar order.
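The difference between the two orderings can be sketched in Python (not Genstat code). The function name ordered_labels is hypothetical, and "ascending" is used here simply as a name for the default sorted order.

```python
def ordered_labels(strings, ldirection="ascending"):
    """Return the distinct strings, sorted or in order of first occurrence."""
    if ldirection == "given":
        # dict.fromkeys keeps the first occurrence of each string, in order.
        return list(dict.fromkeys(strings))
    return sorted(set(strings))


months = ["Jan", "Feb", "Mar", "Jan", "Feb"]
sorted_order = ordered_labels(months)
given_order = ordered_labels(months, "given")
```

Sorting puts Feb before Jan, whereas the given order preserves the calendar sequence in which the months first appear.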
You can set the DECIMALS option to request that the values of a variate VECTOR be rounded to a particular number of decimal places before the groups are formed: for example DECIMALS=0 would round each value to the nearest integer.
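A small Python sketch (not Genstat code) shows how rounding first can merge values that would otherwise form separate groups. It uses Python's built-in round; how GROUPS treats values lying exactly halfway between two rounded values is not stated above, so the example avoids such ties. The function name distinct_after_rounding is hypothetical.

```python
def distinct_after_rounding(values, decimals):
    """Round each value, then return the distinct rounded values in order."""
    return sorted({round(v, decimals) for v in values})


values = [1.24, 1.21, 1.26, 2.0]
unrounded = sorted(set(values))
one_decimal = distinct_after_rounding(values, 1)
```

Without rounding there are four distinct values; after rounding to one decimal place, 1.24 and 1.21 both become 1.2, leaving three.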
You can redefine a VECTOR structure as a factor by setting option REDEFINE=yes and omitting the corresponding identifier from the FACTOR list. This is useful when you cannot define in advance which levels will occur in a set of data.
The PRINT option can be set to summary to print a summary of the contents of the FACTOR (numbers of values, missing values and levels).
Options: PRINT, NGROUPS, LMETHOD, DECIMALS, BOUNDARIES, REDEFINE, CASE, LDIRECTION.
Parameters: VECTOR, FACTOR, LIMITS, LEVELS, LABELS.
Action with RESTRICT
GROUPS takes account of any restrictions on variates or texts in the VECTOR list, and will give missing values to the excluded units. If more than one vector is restricted, their restrictions must all be the same.
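The treatment of excluded units can be sketched in Python (not Genstat code). The restriction is represented by a list of booleans and a missing value by None. The sketch also assumes that levels are formed only from the included units, which the text above does not state. The function name form_factor_restricted is hypothetical.

```python
def form_factor_restricted(vector, included):
    """Form levels from the included units; excluded units become missing."""
    # Assumption: only included units contribute to the set of levels.
    levels = sorted({v for v, keep in zip(vector, included) if keep})
    position = {value: i + 1 for i, value in enumerate(levels)}
    codes = [position[v] if keep else None
             for v, keep in zip(vector, included)]
    return levels, codes


levels, codes = form_factor_restricted(
    [5, 3, 5, 9], [True, True, True, False]
)
```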