BAFFYMETRIX procedure

Estimates expression values from Affymetrix CEL and CDF files; PC Windows only (D.B. Baird).


Options

METHOD = string
Method for calculating probe expression values (mas4, mas5, rma, rma2); default rma

TRANSFORMATION = string
How to transform the data (log2, none); default none when METHOD=mas4, otherwise log2


Parameters

CELFILES = texts
Affymetrix CEL files

CDFFILE = texts
Associated CDF file

GSHFILE = texts
GenStat spreadsheet file containing the estimated expression values, together with the associated slide and probe information


Description

BAFFYMETRIX estimates expression values for Affymetrix data. It operates in a "batch" mode, in which each set of CEL files and its associated CDF file is loaded into the server and processed automatically to generate a summary spreadsheet containing the estimates, together with the associated slide and probe information.

   The METHOD option selects the method used to summarize over the perfect-match (PM) and mismatch (MM) probe pairs, with settings:

    rma
Robust Multi-array Average - the probe-level model introduced by Irizarry et al. (2003), which uses only the PM information and transforms the values based on a kernel density estimate of the PM distribution;

    rma2
Robust Multi-array Average 2 - an adaptation of the RMA algorithm which fits the kernel density to a truncated distribution of the PM values, with the truncation point based on an initial kernel density estimate;

    mas4
Affymetrix Version 4 - the AvDiff algorithm introduced in the Affymetrix version 4 software; and

    mas5
Affymetrix Version 5 - the Tukey biweight algorithm introduced in the Affymetrix version 5 software.

In the Affymetrix MAS 4 and MAS 5 methods, the differences between the signals (PM - MM) are averaged using a robust averaging method. The MAS 4 method uses the AvDiff algorithm, which discards the minimum and maximum differences, and any differences more than 3 standard deviations from the mean. The MAS 5 method uses the Tukey biweight algorithm, which reweights the values according to how far they are from the median, and discards any that are more than 5 times the median absolute deviation away from it. The MAS 5 method also replaces the MM value with a value known as the Ideal Mismatch (IM), which is always less than the PM value.
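
The two robust averages described above can be sketched in a few lines of Python. This is an illustration only, not the code used by BAFFYMETRIX: the function names avdiff and tukey_biweight are hypothetical, the mean and standard deviation in avdiff are assumed to be computed after the minimum and maximum have been dropped, and the small constant eps (which guards against a zero median absolute deviation) is likewise an assumption. The Ideal Mismatch adjustment is not shown.

```python
import statistics


def avdiff(pm, mm, n_sd=3.0):
    """Sketch of an AvDiff-style average of the PM - MM differences.

    Assumption: the mean and standard deviation are taken after the
    minimum and maximum differences have been discarded.
    """
    diffs = sorted(p - m for p, m in zip(pm, mm))
    if len(diffs) < 4:
        raise ValueError("need at least four probe pairs")
    trimmed = diffs[1:-1]  # discard the minimum and the maximum
    mean = statistics.fmean(trimmed)
    sd = statistics.stdev(trimmed)
    # Discard differences more than n_sd standard deviations from the mean.
    kept = [d for d in trimmed if abs(d - mean) <= n_sd * sd]
    return statistics.fmean(kept)


def tukey_biweight(values, c=5.0, eps=1e-4):
    """Sketch of a one-step Tukey biweight average.

    Values further than c median absolute deviations from the median
    receive weight zero; eps (an assumed constant) avoids division by zero.
    """
    values = list(values)
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    weighted_sum = 0.0
    weight_total = 0.0
    for v in values:
        u = (v - med) / (c * mad + eps)
        w = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        weighted_sum += w * v
        weight_total += w
    return weighted_sum / weight_total


# Hypothetical intensities for one probe set; the fifth pair is an outlier.
pm = [120, 135, 150, 160, 900, 140]
mm = [100, 110, 120, 125, 130, 115]
diffs = [p - m for p, m in zip(pm, mm)]
print(avdiff(pm, mm))
print(tukey_biweight(diffs))
```

In both functions the outlying difference of 770 has no influence on the result: avdiff removes it as the maximum, and tukey_biweight gives it zero weight.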

   The TRANSFORMATION option controls whether the PM and MM values are transformed to logarithms base 2. By default, the transformation is applied for METHOD = mas5, rma or rma2, but not for METHOD = mas4.
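
The default rule for TRANSFORMATION can be written out as a short Python sketch. It mirrors only the documented behaviour (no transformation for mas4, log base 2 otherwise) and says nothing about how the procedure is implemented; the names default_transformation and transform are hypothetical, and the values are assumed to be strictly positive.

```python
import math

METHODS = ("mas4", "mas5", "rma", "rma2")


def default_transformation(method):
    """Return the documented default: 'none' for mas4, otherwise 'log2'."""
    if method not in METHODS:
        raise ValueError(f"unknown METHOD setting: {method!r}")
    return "none" if method == "mas4" else "log2"


def transform(values, method="rma", transformation=None):
    """Apply the chosen (or default) transformation to positive values."""
    if transformation is None:
        transformation = default_transformation(method)
    if transformation == "log2":
        return [math.log2(v) for v in values]
    if transformation == "none":
        return list(values)
    raise ValueError(f"unknown TRANSFORMATION setting: {transformation!r}")


print(transform([8, 16], method="rma"))   # log2 applied by default
print(transform([8, 16], method="mas4"))  # left untransformed by default
```

Setting transformation explicitly overrides the default, in the same way that specifying TRANSFORMATION overrides the default implied by METHOD.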

 

Options: METHOD, TRANSFORMATION.

Parameters: CELFILES, CDFFILE, GSHFILE.


Reference

Irizarry, R.A., Hobbs, B., Collin, F., Beazer-Barclay, Y.D., Antonellis, K.J., Scherf, U. & Speed, T.P. (2003). Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics, 4, Number 2, 249-264.