A Microsoft Excel file or Quattro Windows (WQ1, WB1/2/3) file can contain several worksheets.
This menu can be used to open one or more of these within GenStat. Worksheets and named ranges are
listed, prefixed by S: and R: respectively. If a single worksheet or named range is
to be opened you can specify a subrange of the data to be loaded.
If named ranges
are specified within the file then these have the advantage that both the sheet and
cell range are specified permanently in the Excel file. So, these are automatically
updated if further rows or columns are inserted into the cell range.
Columns whose names are suffixed with a ! character will be automatically
converted into factors. Similarly, if column names end with a $ or #
character, then these will be converted into a text and variate respectively. As well
as supplying a conversion character, column format information can be appended to
a column name. For example, a suffix of :0, :1, :2 ... :9 specifies that the
column should be displayed with the given fixed number of decimal places. A suffix
of :D specifies the column is to be displayed in date format. This will use
GenStat's default date format. Similarly, a suffix of :T specifies that the
column is to be displayed in time format.
The order of factor levels and labels can be supplied within an Excel file by adding
a comment to the cell containing the column name. Within the comment the factor levels
must start with a ! and labels must start with !t on a new line. The levels
and labels should then be entered within brackets, for example, for levels this could be
!(100,90,50,10) and for labels this could be !T(Control,A,B,C) or !t('Control','A','B','C').
The order of the items in the comment will define the order of the levels or labels
in the factor. If a column just contains ordinal values (i.e. 1...n), then the
comment can still be used to assign labels or levels to these groups. The first
item in the comment will define the level or label for group 1 in the factor and so on.
A column description can also be supplied in the comment as a separate line, but this must
not start with an exclamation mark.
The following example shows two comments that have been entered into columns within Excel (using the
Excel menu item Insert | Comment). Each column will be opened in GenStat as a factor
using the defined levels/labels and will include a column description.
Select All
This can be used to select all the items within the list of worksheets and named ranges.
Restricted Cell Range
You can load a subset of the data from a worksheet by specifying the position of
diagonally opposite corners of the rectangle that spans the required range.
For example B12:E18 will load seven rows and five columns starting at row 12
of the second column.
Partial column or row information can be provided also:
- Just the starting cell can be given (e.g. B12) and all cells
to the lower right will be used.
- Just column letters provided (e.g. B:E), and all rows in these
columns will be used.
- Just row numbers provided (e.g. 12:18), and all columns with
data in these rows will be used.
Read Title from cell range
When selected, the cell range (e.g. A2:B3) or named range
(which must be located on the same sheet as the data) specified in
the space provided will be used to read text for the
spreadsheet title. Note that
text for the different cells within this range will be concatenated.
Select columns for Inclusion
Allows you to select a subset of columns from the worksheet.
This opens the Select Columns From Worksheet dialog
once the data has been read in.
Sort Factor Levels
When selected, columns loaded as factors will have their levels
(or labels for a text column) sorted into ascending order.
Suggest Columns to be factors
When selected, if columns contain few unique repeated values then you will be prompted
with a dialog where these columns can be converted to factors. The
Suggest converting columns with <= N unique items option within the
Column tab of the spreadsheet options can be used to control the limit for the number
of unique items.
Drop Empty rows
When selected, any rows that do not include any data are removed from the resulting spreadsheet.
Keep Empty columns
By default, any columns that do not contain data are removed from the GenStat spreadsheet. However,
these columns can be contained in the spreadsheet by selecting this option.
Data contains Variates & factors only
When selected, any column containing text will be converted to either a variate
or factor. If the number of labels within the spreadsheet column
is less than 20% of the total number of rows, then the column will be
converted into a variate, otherwise the column will be converted into a factor.
Skip first X non-empty rows
When selected, the first specified number of rows will be excluded from the spreadsheet. This is
useful to exclude any comments that made have been made in the spreadsheets. Note that text in these
rows can still be used for column names or descriptions.
Read Column names from File
Controls whether column names are specified within the file.
| Yes if all labels | Read the column names from the row specified in the
Column Names Row option if all the cells in this row contain
labels only or are missing. A default column name will be generated for a missing cell. |
| Yes | Use the cells specified in the Column Names Row option
. If a cell contains a number, then it will be prefixed with "_" to ensure it is a valid GenStat identifier name. |
| No | Do not read the column names from the file and generate default
names for the columns. |
Column Names in Row
Specifies the row number to use for column names.
Column Descriptions in Row
When selected, the cells in the specified row number
will be used for column descriptions
(the EXTRA keyword for GenStat structures).
Row Numbers
Controls whether the row numbers are relative to the data range or relate to a whole worksheet.
| Relative | Row numbers start from the top of the restricted cell range. |
| Absolute | Row numbers start from the top of the worksheet and thus correspond to
the absolute row numbering in Excel/Quattro. |
Text to Number
This controls how text labels are interpreted as numbers when a label is
changed to a value when converting to a variate.
| Strict | Only labels that contain just numeric data are converted, otherwise
they are set to a missing value (e.g. '10' -> 10, '1O' -> *.) |
| Single | Single characters and labels containing just numeric data are
interpreted as numeric values, otherwise they are set to a missing value (e.g. o,O -> 0, i,I,l,L -> 1, s,S-> 2, z,Z -> 5, comma -> decimal point, '10' -> 10, 'Io' -> *). |
| Common | Multiple characters and labels containing just numeric data are
interpreted as numeric values, otherwise they are set to a missing value
(e.g. 'Io' -> 10, '23X' -> *). |
| Standard | Multiple characters and labels containing just numeric data are
interpreted as numeric values, but text is truncated at the end of a number
(e.g. '23X' -> 23, 'A2X3' -> *). |
| Lax | Any digits are read from the text (e.g. 'A2X3' -> 23). |
Missing Value Text
By default, empty cells or cells with a text consisting of a single asterisk * are read in as missing
values (unless the column is a text column, in which case the * is read in as a text value '*').
If an alternative string to * has been used as a missing value place holder in the worksheet
then entering this text here will cause this to be interpreted as a missing value in numerical
columns. For example, a user may have entered 'M' for a missing value or may have added a note
'Died' in a column of live weights.
Add to Book
This provides a list of books that the new
spreadsheet can be added to. The default book that spreadsheets are added to
can be specified using the
Spreadsheet Options - Books menu.
Action Buttons
| OK | Open the Excel file and close the dialog.
If any factor columns contain a high proportion of missing values you will be
prompted with a dialog that can be used to automatically fill the
empty factor cells. |
| Cancel | Close the dialog without opening the file. |
See Also