This dialog can be used to append data from multiple sheets within a single Excel
file into a single spreadsheet. The new spreadsheet created from the appended data includes
a factor, with the column name Sheet, that indexes the worksheet source from where the
data were opened from. The name of each appended worksheet is used for the labels for each group of this
|Append Multiple Worksheets from an Excel file|
To append data from multiple Excel files into one spreadsheet, the
Append Multiple Files menu can be used.
Select one or more sheet names to read the data from. By default only one block
of data can be opened from a worksheet. However, you can open multiple blocks
of data from a single worksheet by creating named ranges. Named ranges have the advantage that both the sheet and cell range are specified
permanently in the Excel file, and these are automatically updated if further
rows or columns are inserted into the cell range. See
Setting up Excel Named Ranges for details on creating a named range in Excel.
Worksheets and named ranges are identified in the list by the prefix S: and R: respectively.
By default, if the first row loaded contains text values
these will be used as column names in each spreadsheet, otherwise
default column names will be generated (this behaviour can be modified using
the options below).
Column names ending in an exclamation mark (!) will have this
character removed and the column will be loaded as a factor.
Likewise names ending with a dollar ($) or a hash (#) will be
loaded as text or variates respectively.
The initial format of data can be specified by appending characters to the
end of the column name.
Characters in column names that are invalid for GenStat identifier names are
converted to underscore (_).
- A suffix of :0, :1, :2 ... :9 specifies the column is displayed with the
given fixed number of decimal places.
- A suffix of :E specifies the column is to be displayed in scientific format.
- A suffix of :D specifies the column is to be displayed in date format.
The actual format to display the date will be the default date format for GenStat.
- A suffix of :T specifies the column is to be displayed in time format.
Factor levels, labels and their order can be supplied in Excel
by adding a comment to the cell containing the column name. The factor
labels or levels must start with a ! on a new line in the comment.
To provide levels use the format !(100,90,50,10). Alternatively,
to provide labels use the format !T(Control,A,B,C) or !t('Control','A','B','C').
The order of the items in the comment will define the order of the levels or labels
in the factor. If a column in Excel just contains ordinal values (i.e. 1...n), the
comment can still be used to assign labels or levels to these groups. The first
item in the comment will define the level or label for group 1 in the factor etc.
Column description information can also be supplied in the comment, and is specified by
any line of text not starting with an exclamation mark.
The following image shows two comments entered into Excel (use the Excel menu item
Insert | Comment to do this, making sure you
do not put any line breaks in the middle of the list).
More information on this can be found in Using GenStat with Excel.
Restricted Cell Range
If specified, the same cell range is used on each sheet as within the Excel file.
However, you can load a subset of the data from a worksheet by specifying the
position of diagonally opposite corners of the rectangle that spans the required
range. For example, the range B12:E18 will load seven rows and five columns starting
at row 12 of the second column (B).
The range can also be specified by using partial column or row information as follows:
- Just the starting cell can be given (e.g. B12) and all cells
to the lower right will be used.
- Just column letters provided (e.g. B:E), and all rows in these
columns will be used.
- Just row numbers provided (e.g. 12:18), and all columns with
data in these rows will be used.
Select columns for Inclusion
Allows you to select a subset of columns from the worksheet.
This opens the Select Columns From Worksheet dialog
once the data has been read in.
Sort Factor Levels
When selected, columns loaded as factors will have their levels
(or labels for a text column) sorted into ascending order.
Suggest Columns to be factors
Prompts to convert columns to factors where the columns have repeated values,
and fewer unique values than the specified number given in the
Suggest converting columns with <= N unique items option within
the spreadsheet options.
Drop Empty rows
Remove any rows from the spreadsheet that only contain missing data.
Keep Empty columns
Keep any empty columns from the spreadsheet. The default behaviour is to
only include columns that contain data.
Data contains Variates & factors only
Any column containing text will be converted to either a variate
or factor. If the number of rows containing labels within the spreadsheet
column is less than 20%, the column will be converted into a variate, otherwise
the column will be converted into a factor. When reading the column as
a variate, common typing errors such as entering a letter O for the number 0
(or I for 1) will be fixed.
Skip first X non empty rows
Skip the first specified number of rows before reading the data.
Often at the start of a spreadsheet there are some comments
and this option can be used to skip over these. Labels in these
rows can still be used in either the names or extra components
of a column.
Read Column names from File
Specifies whether to read the column names from the file.
|Yes if all labels||The names will be read from the row specified
in the names row (below) if all the cells in this row contain only labels or
are missing. A default column name will be generated for a missing cell.
||Yes||Use the cells in the name row for column names. If the cell contains
a number prefix this with a "_" to make it a valid GenStat name.
||No||Do not read the names from the file and generate all default
For more details on column names see the Select Worksheet section
Column Names in Row
The row number (within the selected range) to use for column names.
A row number within the skipped rows may be used if required.
Column Descriptions in Row
If this option is checked, the cells in the specified row number
(within the selected range) will be used for column descriptions
(the EXTRA keyword for GenStat structures).
A row number within the skipped rows may be used if required.
|Relative||- Taken from the top of the restricted cell range.|
|Absolute||- Taken from the top of the spreadsheet and thus correspond to
the absolute row numbering in Excel/Quattro.|
Text to Number Conversions
This controls how labels are interpreted as numbers when a label is forced
to be an entry in a variate:
|Strict||- only labels that contain numeric data only are converted
(e.g. '10' -> 10, '1O' -> *.)
||Single||- a single character substitution is read as a number
(o,O -> 0, i,I,l,L -> 1, s,S-> 2, z,Z -> 5, comma -> decimal point)
(e.g. '1O' -> 10, 'Io' -> *).
||Common|| - multiple substitutions as in single are made
(e.g. 'Io' -> 10, '23X' -> *).
||Standard|| - as in common but extra text is ignored at the end
of the number (e.g. '23X' -> 23, 'A2X3' -> *).
||Lax|| - any digits are read from the text (e.g. 'A2X3' -> 23).
Missing Value Text
By default, empty cells or cells with a text consisting of a single asterisk * are read in as missing
values (unless the column is a text column, in which case the * is read in as a text value '*').
If an alternative string to * has been used as a missing value place holder in the worksheet
then entering this text here will cause this to be interpreted as a missing value in numerical
columns. For example, a user may have entered 'M' for a missing value or may have added a note
'Died' in a column of live weights.
Match columns by
|Position||- Data from column 1 will be appended to column 1, 2 to 2 etc.|
|Name||- Data from columns with the same name will be appended to each other.|
Add to Book
This lists all open books within GenStat. Select the book that the new sheet will
be added to. If the data are to appear in a new book then select the
New Book setting.
Check for Dates
When selected, GenStat checks all text columns to see if they contain data in date format.
If columns appear to contain data in date format you are prompted to convert these to dates.
Set as Active Sheet
This sets the new spreadsheet that is created as the Active spreadsheet.
See Setting an Active Spreadsheet for more details.
|OK||Append the sheets and close the dialog.|
|Cancel||Close the menu without further changes.|
|Select All||Select all the sheets in the Select Worksheet list.|
When the Spread | New | Append Multiple Excel Worksheets menu item is selected
you will be prompted for the name of an Excel file as shown below.
After clicking Open another dialog will appear listing all the sheets and named
ranges found within the Excel file. Within this dialog select the sheets and
named ranges you want to append and choose any additional options. The dialog below shows
that the Excel file contains 11 sheets and these have been all been selected to be
appended within a single spreadsheet. Clicking OK will open the 11 sheets and append
them into a GenStat spreadsheet.
The APPEND procedure can be used in the command language to
append data sets.