
Vision No 21

Free software in education

VSNi is proud to announce the launch of GenStat for Teaching and Learning – a new free product based on the current GenStat 13, but amended to make it suitable for school and undergraduate students. The easy-to-use GenStat Menu system provides access to a wide range of standard statistical analyses, including: basic statistics, statistical tests (t, chi-square, nonparametric tests etc.), regression (general models, not just simple linear regression), generalized linear models, nonlinear models (standard curves and user-defined), analysis of variance, design of experiments and sample size, REML analysis of linear mixed models, multivariate analysis, six-sigma, survival analysis, time series and repeated measurements.

For full details of the key differences between GenStat for Teaching and Learning and the full commercial edition GenStat 13 please visit the website.

GenStat for Teaching and Learning is licensed at an institutional level, but individual trials are available by emailing support.

For researchers working with complex models and larger data sets, the Discovery Programme has been expanded to include ASReml Discovery. The recently launched ASReml Discovery is available to everyone involved in education, and also to non-commercial research organisations based in eligible countries. More details on ASReml Discovery and how to register for a copy can be found on the VSNi website.

Technical Tip – bootstrapping in GenStat

Bootstrapping is now a popular technique for estimating the variability of estimates from a given set of data while making no assumptions about the distribution of the data. The main assumption in bootstrapping is that the observations are exchangeable. In simple circumstances this means exactly what it says: under the “null hypothesis” of no differences between the treatments, the observed data values could have been allocated to any of the observed units. To bootstrap a data set, we draw a new set of observations from the exchangeable units by sampling with replacement from the original units. In each bootstrap sample, any original observation may be omitted, or sampled one or more times.
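The resampling step described above can be sketched outside GenStat. The Python below is only an illustration, not the code used by the GenStat menu: the data values and the function names bootstrap_sample and bootstrap_statistic are hypothetical. It draws samples with replacement and looks at the spread of the resulting means.

```python
import random

def bootstrap_sample(units, rng):
    """Draw len(units) values with replacement from the original units."""
    return [rng.choice(units) for _ in units]

def bootstrap_statistic(units, statistic, n_boot=1000, seed=1):
    """Evaluate `statistic` on n_boot bootstrap samples of `units`."""
    rng = random.Random(seed)
    return [statistic(bootstrap_sample(units, rng)) for _ in range(n_boot)]

def mean(values):
    return sum(values) / len(values)

# Hypothetical observations, for illustration only.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.9]

# A single bootstrap sample: some values may be missing, others repeated.
sample = bootstrap_sample(data, random.Random(0))

# The variability of the mean, estimated from 500 bootstrap samples.
means = bootstrap_statistic(data, mean, n_boot=500)

print("one bootstrap sample:", sample)
print("spread of bootstrap means:", min(means), "to", max(means))
```

Each bootstrap sample has the same size as the original data set, so the spread of the bootstrap means reflects the sampling variability of a mean based on that many observations.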

With the launch of the GenStat Teaching and Learning Edition comes a new menu for bootstrapping, which will also be available in the 14th Edition later this year.

We can illustrate the use of the menu with data in the GenStat example spreadsheet Cane.gsh (in the GenStat Data folder). This contains yields of sugar cane under a range of levels of nitrogen fertilizer. The menu is shown in Figure 1.

Figure 1: The Bootstrap menu

We need to select the statistics to bootstrap from the choices in the drop-down list at the top of the menu; further boxes then appear to specify the data needed for that analysis. Here we have chosen to look at a linear regression between sugar yield and nitrogen. Running the analysis with the default options produces graphs showing the distributions of slopes and intercepts from the bootstrap resampling.

Figure 2: Distribution of the slope

Figure 2 shows the distribution of the slope, and you can see that a slope of zero is outside the 95% confidence interval. So we can be confident that there is a relationship between sugar yield and the amount of nitrogen fertilizer. The menu produces the output below.

output
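To show the idea behind the slope interval, here is a minimal Python sketch. It assumes a simple percentile interval and resampling of whole (nitrogen, yield) pairs; the article does not say which interval method the menu uses, so treat both as assumptions. The data values are hypothetical and are not taken from Cane.gsh.

```python
import random

def slope_and_intercept(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def percentile_interval(values, level=0.95):
    """Percentile interval: cut equal tails off the sorted bootstrap values."""
    ordered = sorted(values)
    n = len(ordered)
    tail = (1 - level) / 2
    lower = max(0, round(tail * n))
    upper = min(n - 1, round((1 - tail) * n) - 1)
    return ordered[lower], ordered[upper]

# Hypothetical nitrogen levels and yields, for illustration only.
nitrogen = [0, 0, 0, 50, 50, 50, 100, 100, 100, 150, 150, 150]
yields = [60, 72, 55, 110, 98, 121, 140, 152, 133, 170, 158, 181]

rng = random.Random(21)
n = len(nitrogen)
slopes = []
while len(slopes) < 1000:
    rows = [rng.randrange(n) for _ in range(n)]   # resample whole units
    xs = [nitrogen[i] for i in rows]
    ys = [yields[i] for i in rows]
    if len(set(xs)) < 2:                          # slope undefined: redraw
        continue
    slopes.append(slope_and_intercept(xs, ys)[0])

low, high = percentile_interval(slopes)
print("95% bootstrap interval for the slope:", low, "to", high)
```

If zero lies outside the interval, as in Figure 2, the bootstrap gives the same message as the menu output: the data support a relationship between the two variables.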

Other statistics available in the menu include means, medians, differences between means and medians, t-statistics and correlations.

The menu uses the BOOTSTRAP procedure, and you can look at the Input log to see how it does this if you want to use the procedure for more complicated analyses.

Input log

BOOTSTRAP needs a procedure RESAMPLE to do the analysis and calculate the statistics to be studied. This has a DATA option which passes a pointer containing the resampled data into the procedure. The STATISTICS parameter lists the estimates of the various statistics for this sample, and the EXIT parameter returns a value of 1 if the estimation has failed and 0 otherwise.

The DATA pointer for RESAMPLE is created from the data structures that are listed for the DATA option of BOOTSTRAP. So DATA[1] contains the y-values (Yield), and DATA[2] contains the x-values (Nitrogen). The parameter of BOOTSTRAP lists names to be used to label the statistics in the output.
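The division of labour between BOOTSTRAP and RESAMPLE can be mimicked in Python. This is an analogy only, not GenStat syntax: the function names are hypothetical, the lists are indexed from 0 rather than 1, and discarding failed samples is an assumption, since the article does not say how BOOTSTRAP treats them. The statistic used here is a correlation, one of the choices the menu offers.

```python
import random

def resample_correlation(data):
    """Play the role of RESAMPLE for one bootstrap sample.

    data[0] holds the resampled y-values and data[1] the x-values.
    Returns (statistics, exit_code): exit_code is 1 if the estimation
    failed and 0 otherwise.
    """
    ys, xs = data
    if len(set(xs)) < 2 or len(set(ys)) < 2:
        return [], 1          # correlation undefined for a constant column
    n = len(ys)
    my = sum(ys) / n
    mx = sum(xs) / n
    syy = sum((y - my) ** 2 for y in ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return [sxy / (sxx * syy) ** 0.5], 0

def bootstrap(data, resample, n_times=100, seed=1):
    """Play the role of BOOTSTRAP: build each sample and call `resample`."""
    rng = random.Random(seed)
    n = len(data[0])
    kept, failures = [], 0
    for _ in range(n_times):
        rows = [rng.randrange(n) for _ in range(n)]
        # The same rows are taken from every column, so units stay intact.
        sample = [[column[i] for i in rows] for column in data]
        statistics, exit_code = resample(sample)
        if exit_code == 1:
            failures += 1     # assumption: failed samples are discarded
        else:
            kept.append(statistics)
    return kept, failures

# Hypothetical yields (y) and nitrogen levels (x), for illustration only.
yield_values = [60, 72, 55, 110, 98, 121, 140, 152, 133, 170, 158, 181]
nitrogen = [0, 0, 0, 50, 50, 50, 100, 100, 100, 150, 150, 150]

kept, failures = bootstrap([yield_values, nitrogen], resample_correlation)
print(len(kept), "successful samples,", failures, "failures")
```

Keeping the statistic in its own function is what makes the template reusable: a different analysis only needs a different function with the same inputs and outputs.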

Figure 3: Standard Curve menu

Figure 4: Standard Curve Save dialogue

We can modify RESAMPLE if we want to use this as a template for another analysis, for example, an exponential regression. The easiest way to form the commands is to run the Standard Curves menu (under Stats | Regression Analysis) shown in Figure 3, and then set up the Save dialogue (click the Save button) to save the estimated coefficients of the curve, as shown in Figure 4. After running this, the Input log contains the following commands:

Input log

We can cut and paste this into the RESAMPLE procedure to bootstrap the analysis. We create a new text window (Ctrl+N), cut and paste the original RESAMPLE code to this window, and then paste the code above to replace the linear regression code. We need to make the following changes to the exponential regression code we pasted in: change Yield to DATA[1] and Nitrogen to DATA[2], set the PRINT option of FITCURVE to * (unless you want output and graphs from the 100 bootstrap datasets!), return the 3 estimates in STATISTICS, and finally rename the parameters in the BOOTSTRAP statement to give the following amended program (with some additional editing to tidy up the code):

output

Running this program will now give bootstrap confidence limits for the exponential regression parameters. Figure 5 shows the bootstrap distribution of the parameter R.

Figure 5: Bootstrap distribution of R
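The same idea can be sketched in Python for readers who want to see the mechanics. This is not the GenStat program: it assumes the exponential curve has the form y = A + B × R^x, which is consistent with the three estimates and the parameter R mentioned above, and it replaces FITCURVE with a crude grid search over R between 0 and 1. The data are hypothetical.

```python
import random

def fit_exponential(xs, ys):
    """Least-squares fit of y = a + b * r**x by a grid search over r.

    For a fixed r the model is linear in a and b, so each grid point
    needs only a simple linear regression of y on r**x.
    Returns (a, b, r), or None if the sample cannot support the fit.
    """
    if len(set(xs)) < 3:
        return None               # too few distinct x-values for 3 parameters
    n = len(xs)
    my = sum(ys) / n
    best = None
    for step in range(1, 100):    # r = 0.01, 0.02, ..., 0.99 (assumed range)
        r = step / 100
        zs = [r ** x for x in xs]
        mz = sum(zs) / n
        szz = sum((z - mz) ** 2 for z in zs)
        if szz == 0:
            continue
        b = sum((z - mz) * (y - my) for z, y in zip(zs, ys)) / szz
        a = my - b * mz
        rss = sum((y - a - b * z) ** 2 for z, y in zip(zs, ys))
        if best is None or rss < best[0]:
            best = (rss, a, b, r)
    return None if best is None else best[1:]

# Hypothetical fertilizer levels and yields, for illustration only.
nitrogen = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
sugar = [2.3, 1.8, 5.7, 6.4, 8.2, 7.7, 9.1, 8.8, 9.3, 9.8, 9.9, 9.6]

rng = random.Random(5)
n = len(nitrogen)
estimates = []
for _ in range(100):              # 100 bootstrap datasets, as in the article
    rows = [rng.randrange(n) for _ in range(n)]
    fit = fit_exponential([nitrogen[i] for i in rows],
                          [sugar[i] for i in rows])
    if fit is not None:           # skip samples where the fit failed
        estimates.append(fit)

r_values = sorted(r for _, _, r in estimates)
print("bootstrap range of R:", r_values[0], "to", r_values[-1])
```

A histogram of r_values would be the counterpart of Figure 5. The grid search limits R to two decimal places, so a proper nonlinear optimiser would be needed for serious work.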

So the new menu not only makes bootstrapping straightforward for many standard analyses, it also helps you to use the bootstrap for more complicated analyses. For more information, see the online help for the Bootstrap menu or the description of the BOOTSTRAP procedure in the GenStat Reference Manual 3 Procedure Library, which can be opened by selecting the Procedure Library sub-option of the Reference Manual option of the Help menu on the menu bar.

Latest training courses

An Introduction to ANOVA and Design in GenStat is scheduled for 28th and 29th July in Pretoria, South Africa. More details can be found on the VSNi website.

Our training schedule is always being updated according to user requests and requirements, so do email support with suggestions and any specific requirements you have and check our website for new courses.

Out and about with VSNi

The 2011 events we are attending or supporting include: SUSAN at the University of Botswana, Gaborone, Botswana, from 27 June – 1 July 2011; the Australian Applied Statistics Conference in Palm Cove, Tropical North Queensland, from 12-15 July 2011; and the 58th ISI World Statistics Congress in Dublin, Ireland, from 21-26 August 2011.

If you would like to meet with the VSNi staff at one of these events please email support to arrange the details.

We’re always updating the list of events we can support and sponsor – so please send us details of any events you are organising or involved in, and as we decide on more events for the future we’ll list them on our website.

Other News from VSNi

Success for the GenStat-sponsored Holy Trinity Mariners U9s football team! After a disappointing 2-0 defeat in the cup final, the Mariners returned to form with a resounding 4-0 victory in their last game of the season, making them winners of the Chiltern Church U9s League. Well done Mariners!
