PRINT directive
Prints data in tabular format in an output file, unformatted file or text.
Options
Parameters
Description
The contents of GenStat data structures can be displayed, with appropriate labelling, using the PRINT directive. Output can be printed in the current output channel, or sent to other channels, or put into a text structure. PRINT has many options and parameters to allow you to control the style and format of the output but, in most cases, these can be left with their default settings.
For a quick display of the contents of a list of data structures, you need only give the name of the directive, PRINT, and then list their identifiers. For example
PRINT Source,Amount,Gain
The output is fully annotated with the identifiers, and with row and column labels or numbers, where appropriate. Factors are represented by their labels if available, and otherwise by their levels. The layout of the values is determined automatically by the size and shape of the structures to be printed, and by the space needed to print individual values. The output is arranged in columns; the structures are split if the page is not wide enough, so that one set of columns is completed before the next is printed.
With vectors, the default is to print their values in parallel. Alternatively, you can request that structures are printed in series, one below another, by setting option SERIAL=yes. Of course, if the structures to be printed have different shapes or sizes, their values can be printed only in series. The setting SERIAL=no is then ignored except that, to save space, any vectors or pointers are then printed across the page (that is, as though you had set ORIENTATION=across).
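As a minimal sketch of the difference between parallel, serial and across printing, the statements below declare two small variates (the identifiers and values are illustrative only) and print them in each style.

```genstat
" Two variates of equal length; values are illustrative "
VARIATE [VALUES=1,2,3,4] X
VARIATE [VALUES=10,20,30,40] Y
" Default: values printed in parallel columns "
PRINT X,Y
" One structure printed below the other "
PRINT [SERIAL=yes] X,Y
" Values printed across the page "
PRINT [ORIENTATION=across] X,Y
```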
GenStat annotates each set of values by the identifier of the structure (but this can be controlled by the IPRINT option or the HEADING parameter described below), and it automatically chooses a suitable format. For a numerical structure, the default is to use a field of f characters. Generally, the value of f is 12, but another value can be defined using the FIELDWIDTH option of the SET directive. If the DECIMALS parameter was set when the structure was declared, this will define the number of decimal places in the output; otherwise, the number of decimal places is determined by calculating the number that would be required to print its mean absolute value to at least d significant figures. Generally, d is four, but this can be redefined using the SIGNIFICANTFIGURES option of the SET directive. Texts (and labels of factors) are usually printed in a field of f characters but this is extended if any of the strings in the text requires a wider field. You can define your own formats using the parameters FIELDWIDTH, DECIMALS, CHARACTERS, SKIP and JUSTIFICATION.
FIELDWIDTH and DECIMALS both operate in a straightforward way. The only potential complication is that a negative FIELDWIDTH can be used to print numbers in scientific format (for example 7.3 E1 instead of 73), with DECIMALS significant places. The DECIMALS parameter is ignored for strings, like the labels of the factors Source and Amount. For a numerical data structure, you can set DECIMALS to a scalar to use the same number of decimals for all the values of the structure. Alternatively, if you want to use a different number of decimals for each value, you can supply a data structure of the same type and size as STRUCTURE. If DECIMALS is omitted or if it contains a missing value, a default is used which prints the mean absolute value to d significant figures, as above.
In the same way, the CHARACTERS parameter is ignored for numbers; for strings, it allows you to control the number of characters that are printed. By default, GenStat prints all the characters in each string of a text or factor label, unless the CHARACTERS parameter was set to a lesser number when the text or factor was declared.
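The following sketch shows how FIELDWIDTH, DECIMALS and CHARACTERS might be set for a variate and a text. The identifiers V and Tname and their values are assumptions made for illustration.

```genstat
VARIATE [VALUES=1.2345,12.345,123.45] V
TEXT [VALUES=alpha,beta,gamma] Tname
" Field of 10 characters with 2 decimal places for every value "
PRINT V; FIELDWIDTH=10; DECIMALS=2
" A different number of decimals for each value, supplied as a variate "
PRINT V; FIELDWIDTH=10; DECIMALS=!(4,3,2)
" Negative FIELDWIDTH requests scientific format "
PRINT V; FIELDWIDTH=-12; DECIMALS=4
" Print only the first three characters of each string "
PRINT Tname; CHARACTERS=3
```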
Textual strings can contain typesetting commands to represent Greek and special symbols. (These are most useful in PRINT, but can be used in any directive that generates output.) The commands are converted automatically by GenStat to match the style of output (HTML, LaTeX, plain-text or RTF). The commands are all introduced by the character tilde (~). So, to use tilde as an ordinary character, you need to specify the special symbol ~{~} as defined below.
If GenStat finds a mistake in the syntax of a command, it will not issue a failure diagnostic but will output the remainder of the string (including any commands) as plain text. So, for example, legacy code containing tilde characters should continue to generate the output in its previous form. The following commands define Greek characters and various special symbols.
The character definitions (within the curly brackets) can be abbreviated. GenStat checks through the possibilities, in the order defined above, until it finds the first match. Greek characters in capital letters can be obtained by beginning the name of the character with a capital letter, for example ~{Sigma}; subsequent capital letters are irrelevant.
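As a small sketch, the statement below prints a string that uses only the two commands named in the text, ~{Sigma} and ~{~}. The wording of the string is illustrative, and the rendered appearance depends on the style of output in use.

```genstat
" ~{Sigma} gives a capital Greek sigma; ~{~} gives an ordinary tilde "
PRINT [IPRINT=*] 'Total ~{Sigma} is roughly ~{~}50'; JUSTIFICATION=left
```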
The style of font can be changed to bold or italic.
e.g. ~bold {Please note:}
e.g. ~italic {Passer domesticus}
You can define subscripts and superscripts to use, for example, in equations.
You can use special characters in subscripts or superscripts, but fonts must be specified outside the subscript or superscript.
For additional flexibility, you can specify output information in HTML, LaTeX or RTF. This will be inserted only into output constructed by GenStat in the same style. You can also supply information to be included only in plain-text output (which may, for example, be your translation of the HTML, LaTeX or RTF information).
The DREPRESENTATION parameter is used to specify how to print numbers that represent dates and times. The DECIMALS parameter is then ignored. The setting of DREPRESENTATION is either a scalar indicating a predefined format, or a string defining a custom format. The string for a custom format contains a sequence of keys to represent the required components of the date and time. The available keys are:
You can also insert one or more separators between the keys, using any combination of space ( ), slash (/), hyphen (-) and comma (,).
To simplify the specification of the most commonly used formats, a range of standard pre-defined formats is available. These are specified by supplying a scalar containing one of the numerical codes in the left-hand column of the table below.
code  format                    example
1     dd/mm/yy                  03/08/98
2     dd/mm/yyyy                03/08/1998
3     d/m/yy                    3/8/98
4     d/m/yyyy                  3/8/1998
5     ddmmyy                    030898
6     ddmmyyyy                  03081998
7     ddmmmyy                   03Aug98
8     ddmmmyyyy                 03Aug1998
9     dd-mmm-yy                 03-Aug-98
10    dd-mmm-yyyy               03-Aug-1998
11    dmmmyy                    3Aug98
12    dmmmyyyy                  3Aug1998
13    d-mmm-yy                  3-Aug-98
14    d-mmm-yyyy                3-Aug-1998
15    d-mmmm-yy                 3-August-98
16    d-mmmm-yyyy               3-August-1998
17    yymmdd                    980803
18    yyyymmdd                  19980803
19    yy/mm/dd                  98/08/03
20    yyyy/mm/dd                1998/08/03
21    mmddyy                    080398
22    mmddyyyy                  08031998
23    mm/dd/yy                  08/03/98
24    mm/dd/yyyy                08/03/1998
25    mmm-dd-yy                 Aug-03-98
26    mmm-dd-yyyy               Aug-03-1998
27    yyyy-mm-dd                1998-08-03
28    weekday, dth mmmm yyyy    Monday, 3rd August 1998
29    weekday                   Monday
30    mmm-yy                    Aug-98
31    yy                        98
32    yyyy                      1998
33    dd-mmm-yyyy time100       03-Aug-1998 18:55:30.35
34    yyyy-mm-dd time           1998-08-03 18:55:30 (ODBC Standard format)
35    dd-mmm-yyyy time12        03-Aug-1998 6:55:30 pm
36    time24                    18:55:30
37    time12                    6:55:30 pm
38    hours                     48:55:30.35
39    seconds                   68538.35
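As a brief sketch of selecting a predefined format by its code, the statements below assume a variate Visit that already holds date values (for example, read in earlier). Visit is a hypothetical identifier and is not declared here.

```genstat
" Visit is a hypothetical variate assumed to hold date values already "
" Code 10 in the table of predefined formats: dd-mmm-yyyy "
PRINT Visit; DREPRESENTATION=10
" Code 27 in the table of predefined formats: yyyy-mm-dd "
PRINT Visit; DREPRESENTATION=27
```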
The SKIP parameter allows you to place extra spaces between the values of each structure. By default, no extra spaces are inserted unless a value fills the field completely, in which case a single space is inserted; there is also a blank line before the first printed line. SKIP can be set to either a scalar or a variate: a positive integer n requests that n spaces are left, and a missing value requests a blank line.
The values can be left-justified by setting the JUSTIFICATION parameter to left, or centred by setting it to center or centre.
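A short sketch of SKIP and JUSTIFICATION follows. The identifiers Fruit and Count and their values are illustrative assumptions.

```genstat
TEXT [VALUES=apple,kiwi,banana] Fruit
VARIATE [VALUES=3,12,7] Count
" Left-justify the strings within their field "
PRINT Fruit; JUSTIFICATION=left
" Leave 4 extra spaces before the values of each structure "
PRINT Fruit,Count; SKIP=4
```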
The FREPRESENTATION parameter controls the printing of the factor values. By default GenStat will print labels if there are any; if there are none, it prints the levels. The ordinals setting represents the values by the integers 1 upwards.
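The sketch below prints one factor in three representations. The factor Dose is an illustrative assumption, and the setting name levels is taken from the PRINT parameter list, not from the paragraph above, which names only ordinals.

```genstat
" A factor with labels; its levels default to 1, 2, 3 "
FACTOR [LABELS=!T(low,medium,high); VALUES=1,3,2,3,1] Dose
" Default: labels are printed because they are available "
PRINT Dose
" Print the levels instead of the labels "
PRINT Dose; FREPRESENTATION=levels
" Print the integers 1 upwards "
PRINT Dose; FREPRESENTATION=ordinals
```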
The ORIENTATION option is relevant only when you are printing vectors or pointers. By setting ORIENTATION=across, the values are printed in alternate lines, across the page. To ensure that these line up correctly, the fieldwidth is taken as the maximum of those specified for the printed structures, while the field used to print their identifiers is given by the RLWIDTH option (by default 13).
When there is too much output to fit across the page, GenStat will print the output in more than one block unless option WRAP is set to yes. Then GenStat simply wraps each line onto subsequent lines. This is likely to be useful mainly if you are printing the contents of the structures to be read by another program. You might then also wish to suppress the identifiers by setting option IPRINT=* and remove blank lines by setting option SQUASH=yes.
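A minimal sketch of the options mentioned above, using three illustrative variates. With so few columns the lines will not exceed the page width, so WRAP only takes effect on wider output. The statement is meant to show the syntax only.

```genstat
VARIATE X,Y,Z; VALUES=!(11...19),!(21...29),!(31...39)
" Wrap over-long lines, and suppress identifiers and blank lines "
PRINT [WRAP=yes; IPRINT=*; SQUASH=yes] X,Y,Z
```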
The default option setting, IPRINT=identifier, will label the output with the identifier of the structure. Putting IPRINT=identifier,extra will also include any text that has been associated with the structure by the EXTRA parameter when it was declared, while the setting associatedidentifier can be used when a table has been produced by the TABULATE and AKEEP directives, to request that the output be labelled with the identifier of the variate from which the table was formed.
If you are printing vectors in parallel columns down the page, you can use the HEADING parameter to specify a text for each vector. This will then be used as a heading for that column, instead of the information requested by IPRINT. Note, though, that setting IPRINT=* will suppress any heading texts of the vectors as well as their identifiers.
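As a sketch, the following statements print two variates in parallel with a heading text for each column. The identifiers, values and heading strings are assumptions chosen for illustration.

```genstat
VARIATE [VALUES=5,10,20] Amount
VARIATE [VALUES=1.8,3.1,5.9] Resp
" Column headings replace the identifiers Amount and Resp "
PRINT Amount,Resp; HEADING='Amount (mg)','Response'
```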
The width of each line can be controlled by the WIDTH option; the default is to take the full available width. The INDENTATION option specifies the number of spaces to leave before each line; by default there are none.
The CHANNEL option determines where the output appears. By default, the output is placed in the current output channel, but CHANNEL can be set to a scalar to send it to another output channel; the correspondence between channels and files on the computer is explained in the description of the OPEN directive. Alternatively, you can set CHANNEL to the identifier of a text to store the output. The text need not be declared in advance; any undeclared structure that is specified by CHANNEL will be defined automatically as a text. Each line of output becomes one value of the text and if the text already has values they will be replaced. You are most likely to want to do this in order to manipulate the text further. Remember, however, that if you print the text later on, its strings will be right-justified by default, so you will need to set JUSTIFICATION=left in the later PRINT statement to achieve the normal appearance of your output. The maximum (and default) line length of this text is the length of what is called the output buffer. This is likely to be 200 on most computers. If you intend to print it to an output file, you should set the WIDTH option as appropriate.
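The steps described above can be sketched as follows: the output is stored in a text and then printed again with left justification. The identifier Lines is an assumption; as noted above, it need not be declared in advance.

```genstat
VARIATE [VALUES=1,2,3] X
VARIATE [VALUES=4,5,6] Y
" Store the output in the text Lines, one line of output per value "
PRINT [CHANNEL=Lines] X,Y
" Print the stored lines, left-justified and without the identifier "
PRINT [IPRINT=*] Lines; JUSTIFICATION=left
```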
The MISSING option allows you to specify a string to be used instead of the default asterisk symbol to represent missing values. For example, you could set MISSING='unknown' or MISSING=' '.
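For instance, a variate with one missing value (W and its values are illustrative) could be printed like this:

```genstat
" The second value is missing "
VARIATE [VALUES=1.5,*,3.2] W
" Print the string unknown in place of the default asterisk "
PRINT [MISSING='unknown'] W
```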
PRINT can easily be used to print matrices and tables, by taking the default layout and labelling. For tables with more than one dimension, the usual layout has one factor across the page and the others down the page; tables with only one dimension are printed down the page. Several tables can be printed in parallel, provided they all have the same classifying factors. The tables are then printed in alternate columns, as though they formed a larger table with an extra factor (called the table-factor) representing the list of tables. This extra factor thus becomes another (in fact, the final) factor to be printed across the page.
This default layout can be changed using the ACROSS, DOWN and WAFER options. You may wish to do this simply by changing the factors which appear down and across the page. The ACROSS option can be set to a scalar to specify how many factors should be printed across the page, or to a list of factors to say which ones they should be. DOWN similarly specifies the factors to be printed down the page. However, you cannot specify a list of factors for one of these options and a scalar for any of the others. The table-factor can be represented in these lists by inserting a * in the required position; if you do not mention the table-factor in either list it remains as the last factor in the ACROSS list.
The WAFER option allows you to split the output up into subtables or "wafers". This is particularly useful if the tables have many classifying factors, or if the factors have very long labels. The setting can again be either a scalar or a list of factors (possibly including the table-factor). Each subtable has a heading indicating its position in the full table. If the table-factor is included in the wafer, the identifier of the appropriate table will be printed at the beginning of the label for that wafer; this does not mean that the table-factor itself has been moved, simply that the labelling has been rearranged to make it easier to read.
You need not specify all the options DOWN, ACROSS and WAFER. If you leave any of them out PRINT will deduce the missing information.
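A hedged sketch of controlling the table layout follows. It builds a small three-way table of means with TABULATE and then prints it with explicit DOWN, ACROSS and WAFER settings. The factors, data values and the identifier Ymean are all illustrative assumptions.

```genstat
FACTOR [LABELS=!T(North,South); VALUES=1,1,1,1,2,2,2,2] Site
FACTOR [LABELS=!T(A,B); VALUES=1,1,2,2,1,1,2,2] Variety
FACTOR [LABELS=!T(first,second); VALUES=1,2,1,2,1,2,1,2] Season
VARIATE [VALUES=4.1,4.4,5.0,5.2,3.8,4.0,4.9,5.1] Yield
" Form a table of means classified by the three factors "
TABULATE [CLASSIFICATION=Site,Variety,Season] Yield; MEANS=Ymean
" One wafer per season, sites down the page, varieties across "
PRINT [DOWN=Site; ACROSS=Variety; WAFER=Season] Ymean
" Alternatively, give only the number of factors to print across "
PRINT [ACROSS=2] Ymean
```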
When a table has margins, usually they will all be printed. However, you can control which are printed, by specifying the following settings of the PMARGIN option:
The OMITMISSINGROW option also operates only on tables; if you set it to yes, PRINT will omit any lines of output where the tables contain only missing values.
You can control the space allowed for labels of the DOWN factors by using the RLWIDTH option. By default this is set to 13, but you might want a smaller value if the labels are very short. If the width provided (by you, or implicitly) is inadequate, PRINT automatically resets it to accommodate the longest row label. You can suppress the labelling by the down factors by setting option RLPRINT=*, and the labelling of the across factors by setting CLPRINT=*.
When tables are produced by TABULATE, GenStat sets an internal indicator for use by PRINT to indicate the appropriate label for any margins. When a single table is printed this name will be used by default. When printing tables in parallel, if they all have the same setting of the margin name indicator, the appropriate name is used. If they have different settings, or none at all (tables from sources other than TABULATE), the margins will be labelled Margin by default. You can change the label by setting the MNAME parameter. Tables printed in parallel must have the same label throughout, and GenStat will take the one specified for the first table in the list. But in serial printing, you can use a different margin name for each table.
The TABULATE and AKEEP directives also record the identifier of the variate from which the table was formed, and you can request that this be used to label the output, instead of the identifier of the table itself, by setting the IPRINT option to associatedidentifier.
The PUNKNOWN option controls the printing of the "unknown" cell of a table. The default action is to print this cell, labelled with the table identifier, but only if it contains a value other than missing value or zero. You can select one of five settings:
Options ACROSS, DOWN, WAFER, RLPRINT and CLPRINT also apply to matrices. By default, though, if you have several matrices they will be printed one after another on the page.
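As a small illustration with a matrix, the statements below print a labelled matrix first with its default labelling and then with row and column labels suppressed. The matrix Mlab and its labels are assumptions.

```genstat
MATRIX [ROWS=!T(r1,r2); COLUMNS=!T(c1,c2,c3); VALUES=11,12,13,21,22,23] Mlab
" Default labelling of rows and columns "
PRINT Mlab
" Suppress both the row and the column labels "
PRINT [RLPRINT=*; CLPRINT=*] Mlab
```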
With symmetric matrices, the only options of these that are relevant are RLPRINT and CLPRINT. A further setting, integers, is available for these to request that the rows or columns be labelled by the integers 1 onwards, as well as, or instead of, the labels provided with the symmetric matrix. For example, setting RLPRINT=integers and CLPRINT=integers,labels would identify the rows by integers and the columns by integers and labels.
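A sketch of integer labelling for a symmetric matrix, assuming an illustrative 3 by 3 matrix S whose rows are labelled a, b and c:

```genstat
" Lower triangle supplied row by row: 1 / 2 3 / 4 5 6 "
SYMMETRICMATRIX [ROWS=!T(a,b,c); VALUES=1,2,3,4,5,6] S
" Rows identified by integers; columns by integers and labels "
PRINT [RLPRINT=integers; CLPRINT=integers,labels] S
```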
The UNFORMATTED option can be used to send output to unformatted files. These can store values of data structures, so that they can later be input again using READ. This provides a convenient way to free some space temporarily. It can also save computing time if you have a large data set that may need to be read several times: input from character files is slow, so after vetting a large data set, it will be read more efficiently on future occasions if you transfer its contents to an unformatted file. As an alternative you could use backing store, but this stores the attributes of the structures as well as their values, and so access will take longer. You can also use these facilities to transfer data between GenStat and other programs.
The only other options that are relevant to unformatted files are CHANNEL, REWIND and SERIAL. GenStat automatically creates an unformatted workfile, on channel 0, to which unformatted output is sent by default (by PRINT), and from which unformatted input is taken by default (by READ). This file is deleted automatically at the end of a GenStat run.
It is usually quicker to read and write structures in series. Also, the values of structures transferred in parallel must all be of the same mode: neither texts nor factors can be stored in parallel with values of the other, numerical, structures (scalars, variates, matrices or tables). As an example, we first open a file, and declare some variates, matrices and factors.
OPEN 'BDAT'; CHANNEL=3; FILETYPE=unformatted
VARIATE X,Y,Z; VALUES=!(11...19),!(21...29),!(31...39)
MATRIX [ROWS=2; COLUMNS=3; VALUES=11,12,13,21,22,23] M
FACTOR [LEVELS=3; VALUES=1,3,2,3,1,2,2,2,1,3] F
The next three statements store data for M and F on the file named BDAT and data for