SPLINE procedure
Calculates a set of basis functions for M-, B- or I-splines (P.W. Goedhart).
Options
Parameters
Description
Piecewise polynomials or splines can be used for nonparametric function estimation. Splines offer a flexible way to investigate the shape of a relationship or can be used for interpolation and smoothing. There are several types of splines. Smoothing splines, implemented in GenStat by means of the regression function SSPLINE, minimize a penalized residual sum of squares in which lack of smoothness of the estimated function is penalized. Smoothing splines can be less appropriate when local effects are strong or when the estimated function should be monotone, e.g. when estimating growth curves.
An alternative to smoothing splines is to use regression splines, which offer more control over the characteristics of the estimated function. With regression splines the user first specifies an interval [L, U) on which the estimated function is non-trivial. This interval is then explicitly divided into segments by the user, and a polynomial, of order say k, is fitted in each segment. The segments are separated by a sequence of so-called knots. It is customary to force the piecewise polynomials to join smoothly at these knots. The piecewise polynomials and all their derivatives are always continuous from the right at the knots. Moreover, when there are no replicated knot values, the (k-1)th derivative is continuous at the knot values. The order of differentiability is lower when there are replicated knot values. The full knot sequence includes the endpoints L and U, which are replicated depending on the order of the piecewise polynomial. Ramsay (1988) provides a concise introduction to regression splines, while de Boor (1978) gives a full account.
The SPLINE procedure can be used to calculate a set of so-called basis functions which have all the required properties of continuity and differentiability. These basis functions can then be used to fit the regression spline. A simple basis is given by truncated polynomials, but this has the disadvantage of generating considerable rounding errors. A numerically superior basis is provided by M-splines. Their main features are that each basis function is positive in a series of consecutive segments, is zero elsewhere and is normalized to have unit area. An alternative normalization is provided by B-splines, which have the property that the sum over all basis functions is 1 for values in the interval [L, U). Basis functions of M-splines and B-splines are linearly related and are 0 outside [L, U). The resulting piecewise polynomial is discontinuous at the endpoints L and U.
Monotonicity of the estimating function can be imposed by employing a basis consisting of monotone functions. Ramsay (1988) uses integrated M-splines which, when combined with nonnegative regression coefficients, yield a monotone spline. These integrated M-splines are called I-splines. The basis functions for I-splines are not linearly related and they are 0 for values smaller than L and 1 for values greater than or equal to U. The resulting piecewise polynomial is continuous but not differentiable at the endpoints.
The choice of the polynomial order and of the knot values is crucial for successful use of regression splines. Wegman & Wright (1983) summarize practical recommendations for M-splines, while Ramsay (1988) does so for I-splines. In general the knots should be chosen in regions where the relationship changes most markedly. A useful preliminary knot placement is to position a single interior knot at the median, two interior knots at the terciles, three at the quartiles, and so on. The order of the piecewise polynomials is usually taken to be 2 or 3.
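The preliminary knot placement described above can be sketched outside GenStat. The following Python fragment is only an illustration: the function name quantile_knots is hypothetical, and the "inclusive" quantile definition from the Python standard library is an assumption, since the text does not say which quantile definition should be used.

```python
import statistics


def quantile_knots(x, n_interior):
    """Return n_interior preliminary interior knots at equally spaced quantiles of x.

    One knot gives the median, two give the terciles, three the quartiles, and so on.
    The quantile definition (method="inclusive") is an assumption of this sketch.
    """
    if n_interior < 1:
        return []
    return statistics.quantiles(x, n=n_interior + 1, method="inclusive")


# Illustrative data only.
x_values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
one_knot = quantile_knots(x_values, 1)     # knot at the median
three_knots = quantile_knots(x_values, 3)  # knots at the quartiles
```

Values obtained in this way could then be supplied as the interior knot sequence; they are a starting point and may need adjusting towards regions where the relationship changes most markedly.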
The values for which the basis functions are calculated must be specified by the X parameter. The values of the basis functions are saved with the BASIS parameter, while the first-order derivatives of the basis functions can be saved by setting the DBASIS parameter. The BASIS and DBASIS pointers are redefined in the procedure. If a value in the X parameter coincides with an interior knot and the basis function or its first-order derivative has a discontinuity at that value, note that the functions are continuous and differentiable from the right.
The interior knot sequence must be set with the KNOTS option and the ORDER option can be used to specify the order of the piecewise polynomials. The TYPE option determines which spline basis is calculated. The interval [L, U) for which the basis functions are non-trivial can be specified by the LOWER and UPPER options. If these are unset the following values are used:
CALCULATE LOWER = MINIMUM(X)
CALCULATE max = MAXIMUM(X)
CALCULATE UPPER = max + ((max.EQ.0) + ABS(max))/500000
In this case the UPPER value is such that max lies just inside the interval [L, U). The NOMESSAGE option can be used to suppress the warning messages which are printed when the KNOTS variate has replicated values or when the interval [L, U) does not overlap the range of X values.
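To see why the default UPPER value keeps the largest X value just inside the half-open interval, the three CALCULATE statements can be mirrored in Python. This is a sketch only: the function name default_bounds is hypothetical, and missing values, which GenStat would handle itself, are not considered.

```python
def default_bounds(x):
    """Mirror the default LOWER and UPPER calculation for a list of x values."""
    lower = min(x)
    largest = max(x)
    # (largest == 0) contributes 1 when the maximum is zero, so that the
    # increment is strictly positive even then; otherwise the increment is
    # a small fraction of the magnitude of the maximum.
    upper = largest + ((largest == 0) + abs(largest)) / 500000
    return lower, upper


# Illustrative values only.
lower_a, upper_a = default_bounds([1.0, 5.0, 10.0])
lower_b, upper_b = default_bounds([-3.0, 0.0])
```

In both cases the maximum is strictly smaller than the upper bound, so every X value falls inside [L, U).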
Options: KNOTS, ORDER, TYPE, LOWER, UPPER, NOMESSAGE.
Parameters: X, BASIS, DBASIS.
Method
Basis functions for M-splines are calculated by a recurrence relation from Ramsay (1988). These basis functions are rescaled by a constant that depends on the knots to give B-splines, or summed to provide I-splines. Note that, unlike Ramsay (1988), the order of the spline is defined here as the order of the piecewise polynomial.
Action with RESTRICT
The variates contained in the BASIS and DBASIS pointers are restricted in the same way as the X parameter. Values in the units excluded by the restriction are set to missing. Restrictions on the KNOTS variate are ignored.
References
de Boor, C. (1978). A Practical Guide to Splines. Springer-Verlag, New York.
Ramsay, J.O. (1988). Monotone regression splines in action (with discussion). Statistical Science, 3, 425-441.
Wegman, E.J. & Wright, I.W. (1983). Splines in statistics. Journal of the American Statistical Association, 78, 351-365.