BCVALUES procedure

Forms values for nodes of a classification tree (R.W. Payne).


Options

GROUPS = factor
Groupings of the observations in the data set

TREE = tree
Tree for which predictions and accuracy values are to be formed

REPLACE = string
Whether to replace the values stored in the tree (yes, no); default no

PREDICTION = pointer
New predictions for the nodes of the tree

ACCURACY = pointer
New accuracy values for the nodes of the tree

REPLICATION = pointer
New replication tables for the nodes of the tree


Parameter

X = factors or variates
Values of the factors or variates used in the tree for the new data set


Description

When pruning a classification tree, it is best to use "accuracy" figures derived from one or more data sets other than the one used to construct the tree. BCVALUES allows these to be calculated, together with new predictions for the nodes of the tree.

   The TREE option specifies the tree for which the values are to be formed. The GROUPS option specifies a factor defining the groupings of the observations in the new data set, and the X parameter supplies the values, in the new data set, of the factors or variates that were used to construct the tree. You can set option REPLACE=yes to use the new values to replace those already stored in the tree. Alternatively, you can use the PREDICTION option to save the predictions in a pointer. This has an element for each node of the tree (with the same suffix as that node) pointing to a scalar storing the prediction for the node. Similarly, the ACCURACY option saves the accuracies in a pointer to a set of scalars, and the REPLICATION option saves the replications of the groups at each node in a pointer to a set of tables classified by the GROUPS factor. You can use these later to replace the prediction, accuracy and replication values in the original tree by

CALCULATE Tree[]['accuracy'] = ACCURACY[]
& Tree[]['prediction'] = PREDICTION[]
& Tree[]['replication'] = REPLICATION[]

Alternatively, you may want to combine them first with other estimates, for example to form bootstrapped estimates.
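As an illustration of the two ways of using the new values described above, a call might look like the following sketch. All identifiers here are hypothetical: Tree is assumed to be an existing classification tree, Gnew the GROUPS factor for the new data set, Length, Width and Colour the new-data values of the x-variables used in the tree, and Pred, Acc and Rep the pointers chosen to receive the results.

"Save the new values in pointers, leaving the tree itself unchanged"
BCVALUES [GROUPS=Gnew; TREE=Tree; PREDICTION=Pred; ACCURACY=Acc; REPLICATION=Rep] X=Length,Width,Colour

"Or overwrite the values stored in the tree directly"
BCVALUES [GROUPS=Gnew; TREE=Tree; REPLACE=yes] X=Length,Width,Colour

The first form suits cases where the saved values are to be inspected or combined with other estimates before anything is put back into the tree; the second is simpler when the new values are to be used as they stand.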

 

Options: GROUPS, TREE, REPLACE, PREDICTION, ACCURACY, REPLICATION.

Parameter: X.


Method

BCVALUES uses the standard GenStat tree functions to obtain the necessary information about the tree.


Action with RESTRICT

BCVALUES takes account of any restrictions on the X vectors or on GROUPS.
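For example, a hold-out subset of a larger data set could be selected with RESTRICT before calling BCVALUES. This is only a sketch: Holdout is a hypothetical variate assumed to contain 1 for the units to be used, and the other identifiers (Tree, Gnew, Length, Width, Colour, Acc) are likewise assumed rather than taken from any real data set.

"Restrict the new data to the hold-out units"
RESTRICT Length,Width,Colour,Gnew; CONDITION=Holdout.EQ.1
BCVALUES [GROUPS=Gnew; TREE=Tree; ACCURACY=Acc] X=Length,Width,Colour
"Cancel the restriction afterwards"
RESTRICT Length,Width,Colour,Gnew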