Select Worksheet
See Also
A Microsoft Excel file or Quattro Windows (WQ1, WB1/2/3) file can contain several worksheets. This menu can be used to open one or more of these within GenStat. Worksheets and named ranges are listed, prefixed by S: and R: respectively. If a single worksheet or named range is to be opened you can specify a subrange of the data to be loaded.

If named ranges are specified within the file then these have the advantage that both the sheet and cell range are specified permanently in the Excel file. So, these are automatically updated if further rows or columns are inserted into the cell range.

Columns whose names are suffixed with a ! character will be automatically converted into factors. Similarly, if column names end with a $ or # character, then these will be converted into a text and variate respectively. As well as supplying a conversion character, column format information can be appended to a column name. For example, a suffix of :0, :1, :2 ... :9 specifies that the column should be displayed with the given fixed number of decimal places. A suffix of :D specifies the column is to be displayed in date format. This will use GenStat's default date format. Similarly, a suffix of :T specifies that the column is to be displayed in time format.

The order of factor levels and labels can be supplied within an Excel file by adding a comment to the cell containing the column name. Within the comment the factor levels must start with a ! and labels must start with !t on a new line. The levels and labels should then be entered within brackets, for example, for levels this could be !(100,90,50,10) and for labels this could be !T(Control,A,B,C) or !t('Control','A','B','C'). The order of the items in the comment will define the order of the levels or labels in the factor. If a column just contains ordinal values (i.e. 1...n), then the comment can still be used to assign labels or levels to these groups. The first item in the comment will define the level or label for group 1 in the factor and so on.

A column description can also be supplied in the comment as a separate line, but this must not start with an exclamation mark.

The following example shows two comments that have been entered into columns within Excel (using the Excel menu item Insert | Comment). Each column will be opened in GenStat as a factor using the defined levels/labels and will include a column description.

Select All

This can be used to select all the items within the list of worksheets and named ranges.

Restricted Cell Range

You can load a subset of the data from a worksheet by specifying the position of diagonally opposite corners of the rectangle that spans the required range. For example B12:E18 will load seven rows and five columns starting at row 12 of the second column.

Partial column or row information can be provided also:


Read Title from cell range

When selected, the cell range (e.g. A2:B3) or named range (which must be located on the same sheet as the data) specified in the space provided will be used to read text for the spreadsheet title. Note that text for the different cells within this range will be concatenated.

Select columns for Inclusion

Allows you to select a subset of columns from the worksheet. This opens the Select Columns From Worksheet dialog once the data has been read in.

Sort Factor Levels

When selected, columns loaded as factors will have their levels (or labels for a text column) sorted into ascending order.

Suggest Columns to be factors

When selected, if columns contain few unique repeated values then you will be prompted with a dialog where these columns can be converted to factors. The Suggest converting columns with <= N unique items option within the Column tab of the spreadsheet options can be used to control the limit for the number of unique items.

Drop Empty rows

When selected, any rows that do not include any data are removed from the resulting spreadsheet.

Keep Empty columns

By default, any columns that do not contain data are removed from the GenStat spreadsheet. However, these columns can be contained in the spreadsheet by selecting this option.

Data contains Variates & factors only

When selected, any column containing text will be converted to either a variate or factor. If the number of labels within the spreadsheet column is less than 20% of the total number of rows, then the column will be converted into a variate, otherwise the column will be converted into a factor.

Skip first X non-empty rows

When selected, the first specified number of rows will be excluded from the spreadsheet. This is useful to exclude any comments that made have been made in the spreadsheets. Note that text in these rows can still be used for column names or descriptions.

Read Column names from File

Controls whether column names are specified within the file.
Yes if all labels Read the column names from the row specified in the Column Names Row option if all the cells in this row contain labels only or are missing. A default column name will be generated for a missing cell.
YesUse the cells specified in the Column Names Row option . If a cell contains a number, then it will be prefixed with "_" to ensure it is a valid GenStat identifier name.
NoDo not read the column names from the file and generate default names for the columns.

Column Names in Row

Specifies the row number to use for column names.

Column Descriptions in Row

When selected, the cells in the specified row number will be used for column descriptions (the EXTRA keyword for GenStat structures).

Row Numbers

Controls whether the row numbers are relative to the data range or relate to a whole worksheet.
RelativeRow numbers start from the top of the restricted cell range.
AbsoluteRow numbers start from the top of the worksheet and thus correspond to the absolute row numbering in Excel/Quattro.

Text to Number

This controls how text labels are interpreted as numbers when a label is changed to a value when converting to a variate.
StrictOnly labels that contain just numeric data are converted, otherwise they are set to a missing value (e.g. '10' -> 10, '1O' -> *.)
SingleSingle characters and labels containing just numeric data are interpreted as numeric values, otherwise they are set to a missing value (e.g. o,O -> 0, i,I,l,L -> 1, s,S-> 2, z,Z -> 5, comma -> decimal point, '10' -> 10, 'Io' -> *).
CommonMultiple characters and labels containing just numeric data are interpreted as numeric values, otherwise they are set to a missing value (e.g. 'Io' -> 10, '23X' -> *).
StandardMultiple characters and labels containing just numeric data are interpreted as numeric values, but text is truncated at the end of a number (e.g. '23X' -> 23, 'A2X3' -> *).
LaxAny digits are read from the text (e.g. 'A2X3' -> 23).

Missing Value Text

By default, empty cells or cells with a text consisting of a single asterisk * are read in as missing values (unless the column is a text column, in which case the * is read in as a text value '*'). If an alternative string to * has been used as a missing value place holder in the worksheet then entering this text here will cause this to be interpreted as a missing value in numerical columns. For example, a user may have entered 'M' for a missing value or may have added a note 'Died' in a column of live weights.

Add to Book

This provides a list of books that the new spreadsheet can be added to. The default book that spreadsheets are added to can be specified using the Spreadsheet Options - Books menu.

Action Buttons

OKOpen the Excel file and close the dialog. If any factor columns contain a high proportion of missing values you will be prompted with a dialog that can be used to automatically fill the empty factor cells.
CancelClose the dialog without opening the file.

See Also