Import Data from Excel
See Also

Data can be imported from Microsoft Excel in several different ways.

Using the Clipboard

In Excel a block selection of data can be transferred to the clipboard using the mouse or SHIFT and arrow cursor keys followed by selecting Copy from the Edit menu (or use Ctrl+C). Within GenStat (Alt+Tab) you can paste the block of data directly into a spreadsheet by selecting the from Clipboard item on the Spread menu. If the first row in the selected block of data within Excel contains the column names, then these will be used as column names within the GenStat spreadsheet. If you want to ensure that a column is copied as a factor then you can append an exclamation mark to the end of the column name.

One disadvantage of importing the data using the clipboard, is that the precision of the any numbers may be lost if they have been displayed to a reduced number of decimal places within Excel. For example, if the number 3.14159 is displayed to 1 decimal place within Excel (3.1) then this will be copied into GenStat as 3.1. Also any dates will be copied as text columns if the date format contains letters or separators (e.g. 21-Mar-2005 or 21/3/2005).

Using the Excel Import Wizard

The Excel Import Wizard provides a step by step guide through importing data from Excel into a GenStat. The advantage of the wizard is that you can go back at any point to select a new file or change some of the options. The first menu within the wizard is a standard File open dialog from which you can select the file. After selecting the file a second menu appears where you can select the name of the worksheet or named range that contains the data. Clicking next will take you to the next step where you can see a preview of the data. This menu can also be used to extract a subset of the data by specifying cell range. Clicking next takes you to the next step where you can choose which columns to include from your block of data. The final step allows to you set specific options for column names, factors, titles and rows.

Using File Open

By default an Excel file is opened using the Excel import wizard. However, you can use an alternative Select Worksheet menu containing a subset of the options provided in the wizard. To use this menu you can turn off the Use Excel Import Wizard on File Open option within the Spreadsheet Options - File menu.

Opening Excel Files

An advantage of importing data by opening the file directly is that Excel does not have to be open or even installed on your PC. The full precision of the data is always imported and there are more options for controlling how data are handled as compared to copying from the clipboard.

When data are opened directly from an Excel file you can include comments and append codes to column names within Excel to specify how the data is to be imported. The following codes can be appended to column names to format columns:

!Imports column as a factor
$Imports column as a text
#Imports column as a variate
:0, :1, :2 ... :9The fixed number of decimal places to display values within column
:EThe column will be displayed using scientific format.
:DThe column will be displayed using the default GenStat date format.
:TThe column will be displayed using a time format.

You can supply factor levels or labels and their order by supplying them within a comment to the column name cell in Excel. Within the comment the factor labels or levels must be contained within the characters !( ) or !T( ) on a new line in the comment. For example, to supply the levels 100, 90, 50 and 10, enter the following format !(100,90,50,10). To supply labels Control, A and B you can use either !T(Control,A,B) or !t('Control','A','B'). The order of the items in the comment will define the order of the levels or labels in the factor. If a column in Excel just contains ordinal values (i.e. 1...n), the comment can still be used to assign labels or levels to these groups. In this case the first item in the comment will define the level or label for group 1 in the factor, and so on.

The ! marker can be applied in conjunction with the Date :D or Time :T markers, so that !:D would indicate a factor with levels displayed as dates.

Cell Comments

Comments on the cells containing the column names can be used to transfer extra information to GenStat. The comments can contain the following information:
  1. The factor labels or levels and their ordering
  2. The column type (!,#,$ etc as above)
  3. The column description
The column type (! for factor, # for variate or $ for text) and display format (:D etc) can be given at the start of the comment as described above.

Factor labels should be supplied in the usual GenStat form, starting with !T( with the entries separated by commas, and terminated with a closing bracket ')'. If any of the labels contain one or more of the characters '!', ',' or ')' - exclamation mark, comma or left bracket, then the label must be contained in single quotes. Note that if a label contains a single quote this must be doubled up. For example, the string (Pierre David's) should be supplied as '(Pierre David''s)'. If labels do not include any of the reserved characters then you do not need to use single quote's. For example, the following could be used !T(A,B,C,'(D-E)'). Any numbers within the list will be interpreted as a string provided the !T is given at the start of the label list.

Factor levels should start with !( where the numbers are separated by commas and terminated with a closing bracket ). If any of the entries in the list is not a valid number, then the list will be interpreted as the labels of the factor, equivalent to starting the list with !T(.

Note the type marker ! is redundant if the labels or levels are given.

The column extra description can be given as a plain string of text. This should not start with one of the reserved characters '!', '$', '#' or ':'.

You can combine the factor label/level specification and an extra description into a single comment. These must be specified in the following order:r: factor levels or labels, followed by a new line (use Shift+Enter to enter this into the comment in Excel), followed by the column type/display symbols, followed by the column description. The following gives an example of a factor being given labels, but with the levels in date format with a description:
!T('Spring','Summer','Autumn','Winter')
:D Seasonal Summary

If factor values are supplied as ordinals (1,2,3...), then levels or labels can be used to label these within GenStat. For example, a column called Sex could have the levels (1,2) and labels !T(Male,Female) applied to it. Also, if the data for the factor are specified as levels e.g. (10,20,40,100), then the comment can be used to apply labels to the levels, so a comment of !T('Half','Standard','Double','High') could give these labels to the 4 groups.

If the factor has levels or labels in the column data, the cell comment can be used to specify the order that these given in the factor, so that a comment of !T(Nil,Wild,AR1,AR37) can be used to change the label ordering from the default alphabetical ordering of AR1,AR37,Nil,Wild. Similarly if levels are provided with the column, the a redorering such as!(10,5,2,1,0.5) could be used to specify the levels should be stored in decreasing order rather than ascending order as by default.

An example of an Excel cell comment is shown below.

and another below shows two comments that have been entered into Excel (using the Comment item from the Insert menu within Excel).

More information on this can be found in Using GenStat with Excel.

Using Dynamic Data Exchange

You can create a Dynamic Data Exchange (DDE) link to an open file within Excel using the GenStat spreadsheet. A link can be set up by selecting the DDE Link item from the New section on the Spread menu. A DDE link can be specified in one or two ways, so that any changes in Excel are automatically sent to GenStat or vice-versa.

The options for controlling how the data are imported in are much more limited using a DDE link as opposed to opening the file directly or copying using the clipboard. However, if a DDE link is saved to a GenStat DDE link file (*.GDE), then the latest version of the data will always be read, avoiding problems with maintaining two versions of the data.

Using ODBC

You can access data within an Excel file by specifying an ODBC Data Query connection to an Excel file using the Microsoft Excel (*.xls) ODBC driver. This facility can be used to form SQL queries to import from a single worksheet or merge data from multiple worksheets from the Excel file. The ODBC driver treats each worksheet as a table in a database and uses the labels in the first row as the column names.

Using GenStat Commands

Data can be imported directly into the GenStat server using the IMPORT procedure and written to an Excel file using the EXPORT procedure. The DDEIMPORT and DDEEXPORT procedures allow data to be dynamically read and written to open Excel files. The DBIMPORT and DBEXPORT procedures can be used to import data from Excel using ODBC.

Saving to an Excel File

Data can be saved back to an Excel file using the Save or Save As items on the File menu. Using the Save item from the File menu will overwrite an Excel file, however, you can add a new worksheet to an existing Excel file using the Add to Excel file item on the Sheet section of the Spread menu.

See Also

Using GenStat with Excel
Load Data From Files
Add Data to a Spreadsheet
Excel Import Wizard
Setting up Excel Named Ranges
ODBC Databases
Create Spreadsheet Using a DDE Link
Create a New Spreadsheet
File New - Spreadsheet
New Spreadsheet from Clipboard
Options - Date Format
Setting an Active Spreadsheet

The IMPORT procedure reads data from Excel within the command language. The EXPORT procedure saves data to Excel files within the command language. The DDEIMPORT procedure reads data from open Excel files within the command language. The DDEEXPORT procedure writes data to open Excel files within the command language.