| Excel Import Wizard |
| See Also |
Select one or more of these and click Next or click Back to select a different Excel file. Multiple selections can be made by clicking using the Control and Shift keys with the mouse or by clicking on the Select All button to select all the worksheets and named ranges. If two or more worksheets or named ranges are selected then the wizard goes straight to step 5 and all the data is opened into a multi-paged book.
If multiple data ranges are to be loaded from the same worksheet, then the import wizard will to be used multiple times on the same file specifying the range of cells each time.
This step includes an option to add the data to a new or existing book book. The default book that spreadsheets are added to can be specified using the Spreadsheet Options - Books tab menu.
You can specify a range of cells by selecting the Specified Range option and entering the range in the space provided. When a range is entered the preview will change to display from the top left cell of the range.
Click Next to continue or click Back to select a different worksheet within the file.
By default, GenStat will not include any columns in the worksheet or cell range that do not contain any data. However, if empty columns are to be imported then the Keep Empty columns can be selected.
Click Next to continue or click Back to select a different range of cells within the worksheet.
These options specify the layout of the column names.
Read Column names from File
Controls whether column names are specified within the file.
| Yes if all labels | Read the column names from the row specified in the Column Names in Row option if all the cells in this row contain labels only or are missing. A default column name will be generated for a missing cell. |
| Yes | Use the cells specified in the Column Names in Row option . If a cell contains a number, then it will be prefixed with "_" to ensure it is a valid GenStat identifier name. |
| No | Do not read the column names from the file and generate default names for the columns. |
Column Descriptions in Row
When selected, the cells in the specified row number
will be used for column descriptions
(the EXTRA keyword for GenStat structures).
Row Numbers
Controls whether the row numbers are relative to the data range or relate to a whole worksheet.
| Relative | Row numbers start from the top of the restricted cell range. |
| Absolute | Row numbers start from the top of the worksheet and thus correspond to the absolute row numbering in Excel/Quattro. |
These options control the conversion of data to factors on loading.
Suggest Columns with only a few unique values to be factors
When selected, if columns contain few unique repeated values then you will be prompted
with a dialog where these columns can be converted to factors. The
Suggest converting columns with <= N unique items option within the
Column tab of the spreadsheet options can be used to control the limit for the number
of unique items.
Sort factor levels into alphabetical or numeric order
When selected, columns loaded as factors will have their levels
(or labels for a text column) sorted into ascending order.
Data contains Variates & factors only
When selected, any column containing text will be converted to either a variate
or factor. If the number of labels within the spreadsheet column
is less than 20% of the total number of rows, then the column will be
converted into a variate, otherwise the column will be converted into a factor.
Text to Number Conversions
This controls how text labels are interpreted as numbers when a label is
changed to a value when converting to a variate.
| Strict | Only labels that contain just numeric data are converted, otherwise they are set to a missing value (e.g. '10' -> 10, '1O' -> *.) |
| Single | Single characters and labels containing just numeric data are interpreted as numeric values, otherwise they are set to a missing value (e.g. o,O -> 0, i,I,l,L -> 1, s,S-> 2, z,Z -> 5, comma -> decimal point, '10' -> 10, 'Io' -> *). |
| Common | Multiple characters and labels containing just numeric data are interpreted as numeric values, otherwise they are set to a missing value (e.g. 'Io' -> 10, '23X' -> *). |
| Standard | Multiple characters and labels containing just numeric data are interpreted as numeric values, but text is truncated at the end of a number (e.g. '23X' -> 23, 'A2X3' -> *). |
| Lax | Any digits are read from the text (e.g. 'A2X3' -> 23). |
Additional information about conversions
Columns whose names are suffixed with a ! character will be automatically converted into factors. Similarly, if column names end with a $ or # character, then these will be converted into a text and variate respectively. As well as supplying a conversion character, column format information can be appended to a column name. For example, a suffix of :0, :1, :2 ... :9 specifies that the column should be displayed with the given fixed number of decimal places. A suffix of :D specifies the column is to be displayed in date format. This will use GenStat's default date format. Similarly, a suffix of :T specifies that the column is to be displayed in time format.
The order of factor levels and labels can be supplied within an Excel file by adding a comment to the cell containing the column name. Within the comment the factor levels must start with a ! and labels must start with !t on a new line. The levels and labels should then be entered within brackets, for example, for levels this could be !(100,90,50,10) and for labels this could be !T(Control,A,B,C) or !t('Control','A','B','C'). The order of the items in the comment will define the order of the levels or labels in the factor. If a column just contains ordinal values (i.e. 1...n), then the comment can still be used to assign labels or levels to these groups. The first item in the comment will define the level or label for group 1 in the factor and so on.
A column description can also be supplied in the comment as a separate line, but this must not start with an exclamation mark !, has # or dollar $ which are used for marking column types.
The following example shows two comments that have been entered into columns within Excel (using the Excel menu item Insert | Comment). Each column will be opened in GenStat as a factor using the defined levels/labels and will include a column description.
See Using GenStat with Excel for more information on setting up column types and comments.
This controls some general features available for the spreadsheet. Read Title from cell range
When selected, the cell range (e.g. A2:B3) or named range (which must be located on the same sheet as the data) specified in the space provided will be used to read text for the spreadsheet title. Note that text for the different cells within this range will be concatenated.
Set as Active Sheet
This sets the new spreadsheet that is created as the active spreadsheet.
These options control which rows of data are to be read into the new spreadsheet.
Drop empty rows from the spreadsheet
When selected, any rows that do not include any data are removed from the resulting spreadsheet.
Skip first X non-empty rows
When selected, the first specified number of rows will be excluded from the spreadsheet. This is useful to exclude any comments that made have been made in the spreadsheets. Note that text in these rows can still be used for column names or descriptions (see Column Names Tab).
| Back | Got back to the previous step to change the file, cells, columns selected, and any options set. |
| Next | Accept the options given and move on to the next step in reading in the data. |
| Finish | Use the specified information to read the data from the Excel file and close the dialog. If any factor columns contain a high proportion of missing values you will be prompted with a dialog that can be used to autonmatically fill the empty factor cells. |
| Cancel | Close the dialog without opening the file. |