AKEEP directive
Copies information from an ANOVA analysis into GenStat data structures.
Options
Parameters
Description
AKEEP allows you to copy components of the output from an analysis of variance into standard GenStat data structures. You can save the information from the analysis in a save structure, using the SAVE parameter of ANOVA, and then specify the same structure in the SAVE option of AKEEP. Alternatively, GenStat automatically stores the save structure from the last y-variate that has been analysed, and AKEEP uses this by default if you do not specify a save structure explicitly.
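The two ways of identifying the analysis can be sketched as follows. This is an illustration only: the factors Blocks, Source and Amount, the y-variate Gain, and the identifiers Asave and Msource are hypothetical, and the sketch assumes that the save structure is supplied alongside the y-variate in the ANOVA statement.

```genstat
"Hypothetical design: Source by Amount treatments in randomized blocks"
BLOCKSTRUCTURE Blocks
TREATMENTSTRUCTURE Source*Amount
ANOVA Y=Gain; SAVE=Asave

"Explicit save structure"
AKEEP [SAVE=Asave] Source; MEANS=Msource

"No SAVE setting: AKEEP falls back on the last y-variate analysed"
AKEEP Source; MEANS=Msource
```

An explicit save structure is mainly useful when several y-variates have been analysed and the one of interest is not the most recent.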
Several options are provided to save information about the analysis as a whole. The RESIDUALS and FITTEDVALUES options allow variates to be specified to store the residuals and fitted values, respectively. The residuals, like those saved by the RESIDUALS parameter of ANOVA, are taken only from the final stratum. The RMETHOD option controls whether these are simple residuals (like those printed by ANOVA - the default) or whether they are standardized by their variance. As an alternative, the CBRESIDUALS option saves residuals that incorporate the variability from all the strata. With an orthogonal design, these are simply the sum of the residuals from every stratum. For a non-orthogonal design, they are the data values minus the combined estimates of the treatment effects. Likewise, the CBCREGRESSION option allows you to save estimates of covariate regression coefficients that combine information from all the strata. (The estimates from each individual stratum can be saved using the CREGRESSION parameter, as described below.) The AOVTABLE option saves the analysis-of-variance table, as a pointer with a variate or a text for each column of the table. The pointer elements are labelled with the column labels of the table, and the variates contain missing values where the table has blanks. These can be printed as blanks by setting option MISSING=' ' in the PRINT directive.
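A minimal sketch of these options, applied to the most recent analysis, might look like this. The identifiers Res, Fit and Aov are hypothetical, and the null subscript list Aov[] is assumed here as a way of referring to every element of the pointer.

```genstat
"Save final-stratum residuals, fitted values and the aov table"
AKEEP [RESIDUALS=Res; FITTEDVALUES=Fit; AOVTABLE=Aov]

"Print the saved table, showing missing values as blanks"
PRINT [MISSING=' '] Aov[]
```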
The TREATMENTSTRUCTURE, BLOCKSTRUCTURE and WEIGHTS options can save the treatment and block formulae, and the weights variate (if any) that were used to specify the analysis. The EXIT option can save an exit code summarizing the properties of the design; see the description of ANOVA for details.
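For instance, the model formulae and exit code of the most recent analysis could be recovered along these lines. The identifiers Tform, Bform and Code are hypothetical names chosen for the illustration.

```genstat
"Recover the formulae that defined the analysis, and the exit code"
AKEEP [TREATMENTSTRUCTURE=Tform; BLOCKSTRUCTURE=Bform; EXIT=Code]
PRINT Code
```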
The parameters of AKEEP save information about particular model terms in the analysis. With the TERMS parameter you specify a model formula, which GenStat expands to form the series of model terms about which you wish to save information. As in ANOVA, the FACTORIAL option sets a limit on the number of factors in each term. Any term containing more than that number of factors is deleted. The subsequent parameters allow you to specify identifiers of data structures to store various components of information for each of the terms that you have specified. If there are components that are not required for some of the terms, you should insert a missing identifier (*) at that point of the list. For example,
AKEEP Source + Amount + Source.Amount; MEANS=*,*,Meangain;\
SS=Ssource,Samount,Ssbya; VARIANCE=Vsource,*,*
sets up a table Meangain containing the source by amount table of means; it forms scalars Ssource, Samount and Ssbya to hold the sums of squares for Source, Amount and Source.Amount respectively, and scalar Vsource to store the unit variance for the effects of Source.
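The FACTORIAL option interacts with the expansion of the TERMS formula. As a sketch under the same assumptions as the example above (factors Source and Amount), with the limit set to one factor the interaction is deleted from the expanded formula, so only two identifiers are needed in each list:

```genstat
"Source*Amount expands to Source + Amount + Source.Amount;"
"FACTORIAL=1 deletes Source.Amount, leaving the two main effects"
AKEEP [FACTORIAL=1] Source*Amount; SS=Ssource,Samount; DF=Dsource,Damount
```

The scalars Dsource and Damount are hypothetical names for the degrees of freedom.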
The structures to hold the information are defined automatically, so you need not declare them in advance. If you have declared any of the tables already, its classification set will be redefined, if necessary, to match the factors in the table that you wish to store. Thus Meangain here would be redefined to be classified by the factors Source and Amount, if it had previously been declared with some other set of classifying factors. Sizes of variates and symmetric matrices will also be redefined if necessary.
Many of the components are stored in tables, classified by the factors in the model term. Tables of means and effects are relevant only for treatment terms. Standard errors for a table of means can be saved using the SEMEANS parameter. For some designs, such as split-plots, different standard errors are needed for the means according to which pair of means is to be compared. The EQFACTORS option allows you to specify factors within the tables of means whose levels are assumed to be equal for the two means. Alternatively, the SEDMEANS parameter allows you to save a symmetric matrix containing a standard error of difference for each pair of means, and the VCMEANS parameter allows you to save a symmetric matrix with the variances and covariances for the means. The DFMEANS parameter saves a symmetric matrix with the degrees of freedom for comparing each pair of means. The rows and columns of these matrices are labelled by the factor name and level (or label if available) of the mean concerned.
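A sketch of saving a table of means together with the matrices used for pairwise comparisons, again assuming the hypothetical factors Source and Amount and hypothetical identifiers for the saved structures:

```genstat
"Means, with s.e.d.s, variance-covariance matrix and d.f. for each pair"
AKEEP Source.Amount; MEANS=Meangain; SEDMEANS=Sedgain;\
   VCMEANS=Vcgain; DFMEANS=Dfgain
PRINT Meangain
PRINT Sedgain
```

Saving the full symmetric matrices avoids having to decide in advance which pairs of means will be compared, which is the situation that EQFACTORS otherwise addresses.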
Tables of partial effects (which can be saved only for treatment terms, using the PARTIALEFFECTS parameter) differ from the usual effects presented by GenStat only when there is non-orthogonality. The usual effects of a treatment term are estimated after eliminating the terms that precede it in the model, whereas the partial effects are those that would be estimated after eliminating the subsequent treatment terms as well. The TWOLEVEL option controls what is stored for terms whose factors all have only two levels. The settings response (the default) or Yates generate a scalar response, whereas TWOLEVEL=effects produces a table of effects. Replications are stored in tables if the values are unequal. For equal replications you can supply either a scalar or a table, but if the saving structure has not been declared AKEEP will define it as a scalar. Tables of residuals are available only for block terms, and the RMETHOD option controls whether or not they are standardized.
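The effect of the TWOLEVEL option can be illustrated with a sketch in which A and B are hypothetical factors, each with two levels, and Rab and Eab are hypothetical identifiers:

```genstat
"Default setting: the A.B interaction is saved as a scalar response"
AKEEP A.B; EFFECTS=Rab

"Request a table of effects classified by A and B instead"
AKEEP [TWOLEVEL=effects] A.B; EFFECTS=Eab
```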
Sums of squares, numbers of degrees of freedom, efficiency factors and unit variances are saved in scalars. The unit variance of a treatment term is the residual mean square of the stratum where the term is estimated, divided by its efficiency factor and covariance efficiency factor. Thus you can calculate the estimated variance of any of the effects of the term by dividing its unit variance by the replication of the effect. The RTERM parameter can be used to save a formula containing the model term corresponding to the stratum in which a treatment term has been estimated. This can then be used as the setting of the TERMS parameter of a subsequent AKEEP statement to obtain further information about the stratum, for example its number of residual degrees of freedom.
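As a sketch of how these pieces fit together, the statements below save the unit variance and replication for a hypothetical treatment factor Source, derive the variance of its effects, and then query the stratum in which it was estimated. The sketch assumes equal replication (so that Rsource is a scalar) and assumes that the saved formula can be substituted into TERMS with the # substitution symbol; all identifiers are hypothetical.

```genstat
"Unit variance, replication and estimating stratum for Source"
AKEEP Source; VARIANCE=Vsource; REPLICATIONS=Rsource; RTERM=Rstratum

"Variance of an effect = unit variance / replication"
CALCULATE Veffect = Vsource / Rsource

"Residual degrees of freedom of the stratum where Source is estimated"
AKEEP #Rstratum; DF=Resdf
PRINT Veffect,Resdf
```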
There are two parameters for saving information about the covariates. To save the regression coefficients estimated in a particular stratum, you should specify the model term of the stratum with the TERMS parameter and a variate with the CREGRESSION parameter. GenStat defines the variate to have a length equal to the number of covariates, and stores the estimated regression coefficients of the covariates in the order in which they were listed in the COVARIATE statement. The CSSP parameter allows you to obtain sums of squares and products between the covariates for the specified model term. These are arranged in a symmetric matrix. The value in row i on the diagonal is the sum of squares for the term in the analysis of variance that has as its y-variate the ith covariate listed in the COVARIATE statement. The value in row i and column j is the cross-product between the effects estimated for the term in the analysis of variance of covariate i and those estimated for the same term in the analysis of covariate j.
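A minimal sketch for an analysis of covariance follows. The block structure Blocks/Plots, the covariates X1 and X2, the y-variate Gain and the identifiers Coef and Cprods are all hypothetical; with this block formula the final stratum corresponds to the term Blocks.Plots.

```genstat
BLOCKSTRUCTURE Blocks/Plots
TREATMENTSTRUCTURE Source*Amount
COVARIATE X1,X2
ANOVA Gain

"Coef has length 2, ordered as in COVARIATE (X1 then X2);"
"Cprods is a 2 by 2 symmetric matrix of sums of squares and products"
AKEEP Blocks.Plots; CREGRESSION=Coef; CSSP=Cprods
PRINT Coef
```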
The CONTRASTS, XCONTRASTS, SECONTRASTS and DFCONTRASTS parameters save information about contrasts. For each treatment term there will generally be several contrasts, so the information is stored in pointers with one element for each contrast. The elements are labelled by the name of the contrast as it appears, for example, in the analysis-of-variance table.
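As an illustration, suppose that polynomial contrasts have been fitted to a hypothetical quantitative factor Amount. The sketch below assumes that the contrasts were requested with the POL function in the treatment formula; Camount and SEamount are hypothetical identifiers.

```genstat
"Linear and quadratic contrasts amongst the levels of Amount"
TREATMENTSTRUCTURE Source*POL(Amount; 2)
ANOVA Gain

"Each pointer has one element per contrast of the term Amount"
AKEEP Amount; CONTRASTS=Camount; SECONTRASTS=SEamount
PRINT Camount[]
```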
The CBMEANS, CBSEMEANS, CBSEDMEANS, VCCBMEANS, DFCMEANS, CBEFFECTS, CBVARIANCE, DFCEFFECTS, CBCEFFICIENCY and STRATUMVARIANCES parameters save details of estimates that combine information from all the strata of the design, and the COMPONENTS parameter saves the stratum variance components.
In designs where there is partial confounding, and treatment terms are estimated in more than one stratum, options STRATUM and SUPPRESSHIGHER allow you to specify the strata from which the information is to be taken. This is relevant to tables of effects and partial effects, sums of squares, efficiency factors, unit variances, sums of squares and products between covariates, and information about contrasts. By default, GenStat searches all the strata, and takes the information from the lowest of the strata where the term is estimated. If you set the STRATUM option, only strata down to the specified stratum are searched. By setting SUPPRESSHIGHER=yes, you can restrict the search to only that stratum. You cannot save tables of means if you have excluded any stratum from the search. Likewise, tables of residuals and residual sums of squares cannot be saved for any of the excluded strata. If a term is not estimated in any of the strata that are searched, the corresponding data structures are filled with missing values.
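The stratum search might be controlled as in the following sketch, which assumes a hypothetical partially confounded design with block formula Blocks/Plots in which the interaction A.B is estimated in both the Blocks and the Blocks.Plots strata. The identifiers are hypothetical.

```genstat
"Default: information is taken from the lowest stratum estimating A.B"
AKEEP A.B; SS=SSlow; EFFICIENCY=Eflow

"Restrict the search to the Blocks stratum alone"
AKEEP [STRATUM=Blocks; SUPPRESSHIGHER=yes] A.B; SS=SSblk; EFFICIENCY=Efblk
```

Tables of means are deliberately not requested in the second statement, since they cannot be saved once any stratum has been excluded from the search.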
The STATUS parameter saves an integer code that describes the type of term, and how it is estimated. If the term is a treatment term, the code also gives information about how its marginal terms are estimated. (For example, the interaction term A.B has the main effects A and B as margins.)
As explained in the description of the BLOCKSTRUCTURE directive, GenStat will set up an extra "factor" denoted *Units* if the block formula does not specify the final stratum explicitly. AKEEP allows you to refer to this "factor", if necessary, by putting the string '*Units*' (or '*units*' or '*UNITS*') in the TERMS formula. Thus, to save the residual sum of squares in these circumstances, you could put
AKEEP '*Units*'; SS=ResidSS
Options: FACTORIAL, STRATUM, SUPPRESSHIGHER, TWOLEVEL, RESIDUALS, FITTEDVALUES, CBRESIDUALS, CBCREGRESSION, TREATMENTSTRUCTURE, BLOCKSTRUCTURE, WEIGHTS, AOVTABLE, EQFACTORS, RMETHOD, EXIT, SAVE.
Parameters: TERMS, MEANS, SEMEANS, SEDMEANS, VCMEANS, EFFECTS, PARTIALEFFECTS, REPLICATIONS, RESIDUALS, DF, DFMEANS, SS, EFFICIENCY, VARIANCE, RTERM, CEFFICIENCY, CREGRESSION, CSSP, CONTRASTS, XCONTRASTS, SECONTRASTS, DFCONTRASTS, CBMEANS, SECBMEANS, SEDCBMEANS, VCCBMEANS, DFCMEANS, CBEFFECTS, CBVARIANCE, DFCEFFECTS, CBCEFFICIENCY, STRATUMVARIANCE, COMPONENTS, STATUS.