Statistica Visual Basic Library of Matrix Functions - LOWESS (VectorX, VectorY, SmoothF, Nsteps, Delta, VectorXSorted, VectorYSmooth, VectorRWeights, VectorResiduals)

Parameter

Description

VectorX

name of input vector X (the function will smooth the data in input VectorY; see also Arrays in functions); the data in VectorX will not be changed by this function

VectorY

name of input vector Y (the function returns the smoothed datapoints in VectorYSmooth; see also Arrays in functions); the data in VectorY will not be changed by this function

SmoothF

input parameter; specifies the amount of smoothing; SmoothF is the fraction of the data points used to compute each smoothed value; thus, smaller values of SmoothF produce smoothed y values that follow the observed data more closely; SmoothF should usually be chosen in the range 0.2 to 0.8 (Cleveland, 1979)

Nsteps

input parameter; specifies the number of iterations for the robust fit; if Nsteps is equal to 0, a non-robust fit is computed; according to Cleveland (1979), 2 iterations are adequate for almost all situations

Delta

input parameter; specifies the interval width used to space the locally weighted regression computations; Delta must be non-negative and should usually be set to 0 (perform the computations at each point); for large data sets, a positive Delta makes the smoothing computations more efficient by fitting only at selected points and interpolating between them; see the section on Delta below

VectorXSorted

name of the output vector containing the x values sorted into ascending order

VectorYSmooth

name of the output vector containing the smoothed y values, sorted in the same order as the values in VectorXSorted

VectorRWeights

name of the output vector containing the robustness weights; if Nsteps=0, then VectorRWeights is not used.

VectorResiduals

name of the output vector containing the residual values (original y values minus the smoothed y values)

The LOWESS function performs robust locally weighted regression and smoothing for 2D scatterplot data (pairs of x and y values). The method is described in detail in Cleveland (1979, 1985). The fitted values are computed by using the nearest neighbor method and robust locally weighted regression.
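
The following is a minimal sketch of the procedure in Python (not Statistica Visual Basic), following the description in Cleveland (1979): tricube weights over the nearest neighbors, a weighted straight-line fit at each point, and bisquare robustness weights scaled by 6 times the median absolute residual. It is an illustration under those assumptions, not the Statistica implementation; the names lowess_sketch, local_fit, tricube, bisquare, and median are hypothetical, and Delta and missing data are not handled here.

```python
def tricube(u):
    """Tricube kernel used for the local (distance) weights."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def bisquare(u):
    """Bisquare kernel used for the robustness weights."""
    u = abs(u)
    return (1.0 - u ** 2) ** 2 if u < 1.0 else 0.0


def median(values):
    s = sorted(values)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else 0.5 * (s[mid - 1] + s[mid])


def local_fit(xs, ys, rweights, i, k):
    """Weighted straight-line fit around xs[i], evaluated at xs[i]."""
    # Bandwidth: distance from xs[i] to its k-th nearest point (itself included).
    h = sorted(abs(xj - xs[i]) for xj in xs)[k - 1]
    w = []
    for xj, rw in zip(xs, rweights):
        d = abs(xj - xs[i])
        local = tricube(d / h) if h > 0 else float(d == 0)
        w.append(rw * local)
    sw = sum(w)
    if sw == 0:
        return ys[i]  # no usable neighbors: fall back to the observed value
    xbar = sum(wj * xj for wj, xj in zip(w, xs)) / sw
    ybar = sum(wj * yj for wj, yj in zip(w, ys)) / sw
    sxx = sum(wj * (xj - xbar) ** 2 for wj, xj in zip(w, xs))
    sxy = sum(wj * (xj - xbar) * (yj - ybar) for wj, xj, yj in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return ybar + slope * (xs[i] - xbar)


def lowess_sketch(x, y, smooth_f=0.5, nsteps=2):
    """Return (x sorted, smoothed y, robustness weights, residuals)."""
    pairs = sorted(zip(x, y))  # sort the pairs by x
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    n = len(xs)
    if n == 0:
        return [], [], [], []
    # Number of neighbors per local fit (assumed rounding rule).
    k = min(n, max(2, int(smooth_f * n + 1e-7)))
    rweights = [1.0] * n
    for step in range(nsteps + 1):
        smooth = [local_fit(xs, ys, rweights, i, k) for i in range(n)]
        residuals = [yi - si for yi, si in zip(ys, smooth)]
        if step == nsteps:
            break  # nsteps = 0 gives the plain, non-robust fit
        s = median([abs(r) for r in residuals])
        if s == 0:
            break  # fit is already exact for at least half the points
        rweights = [bisquare(r / (6.0 * s)) for r in residuals]
    return xs, smooth, rweights, residuals
```

The four returned lists mirror the roles of VectorXSorted, VectorYSmooth, VectorRWeights, and VectorResiduals. With nsteps=0 the robustness weights stay at their initial value of 1 and play no part in the fit. The sketch recomputes the neighbor distances for every point, which is simple but slow for large inputs.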

Parameter Delta. Parameter Delta is useful when the data set is very large. If Delta is greater than 0, the (locally) weighted regression is not performed at each individual observation or value of x; instead, the computations are performed only at points spaced approximately Delta apart. Specifically, if the weighted regression computations were performed for a point xi, then the next point is either (a) the adjacent point (in the sorted array VectorXSorted), if that point is greater than xi+Delta, or (b) the largest point xj that falls inside the interval from xi to xi+Delta; the smoothed values for any intermediate values of x are found through linear interpolation. For very large data sets, this method can save substantial computation time.
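
The point-selection rule and the interpolation step can be sketched as follows, in Python rather than Statistica Visual Basic. This is only an illustration of the rule as stated in this section; the function names delta_fit_indices and fill_by_interpolation are hypothetical, and the fitted values passed to the second function stand in for the results of the local regressions.

```python
def delta_fit_indices(xs, delta):
    """Indices (into the sorted xs) at which a local regression is computed."""
    n = len(xs)
    if n == 0:
        return []
    idx = [0]
    i = 0
    while i < n - 1:
        j = i + 1  # case (a): the adjacent point
        # case (b): move on to the largest x that is still <= xs[i] + delta
        while j + 1 < n and xs[j + 1] <= xs[i] + delta:
            j += 1
        idx.append(j)
        i = j
    return idx


def fill_by_interpolation(xs, idx, fitted):
    """Smoothed values for all xs, given fitted values at the indices in idx."""
    smooth = [None] * len(xs)
    for a, fa in zip(idx, fitted):
        smooth[a] = fa
    for a, b, fa, fb in zip(idx, idx[1:], fitted, fitted[1:]):
        span = xs[b] - xs[a]
        for m in range(a + 1, b):  # points skipped between two fits
            t = (xs[m] - xs[a]) / span if span > 0 else 0.0
            smooth[m] = fa + t * (fb - fa)
    return smooth


xs = [0, 1, 2, 3, 10, 11, 12, 20]
idx = delta_fit_indices(xs, 3)
smooth = fill_by_interpolation(xs, idx, [2.0 * xs[i] for i in idx])
```

With Delta set to 0 and distinct x values, every index is selected, so no interpolation takes place and the regression is computed at each point.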

Missing data. If the input vectors VectorX and VectorY contain missing data values, the affected x-y pairs are ignored. Therefore, the output vectors (VectorXSorted, VectorYSmooth, VectorRWeights, VectorResiduals) may contain fewer valid data points than the input vectors.
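
As an illustration of this pairwise handling, the sketch below (Python, not Statistica Visual Basic) drops every x-y pair in which either value is missing. How Statistica encodes a missing value is not stated here, so None and NaN are used as stand-ins; the names is_missing and drop_missing_pairs are hypothetical.

```python
import math


def is_missing(v):
    """Assumed missing-value markers for this sketch: None or NaN."""
    return v is None or (isinstance(v, float) and math.isnan(v))


def drop_missing_pairs(x, y):
    """Keep only the pairs in which both x and y are present."""
    kept = [(a, b) for a, b in zip(x, y) if not (is_missing(a) or is_missing(b))]
    return [a for a, _ in kept], [b for _, b in kept]


x_clean, y_clean = drop_missing_pairs(
    [1.0, None, 3.0, 4.0],
    [2.0, 5.0, float("nan"), 8.0],
)
```

Here two of the four input pairs are discarded, so any output computed from x_clean and y_clean has two valid points rather than four.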

For more information on using arrays, see Arrays in functions.

For a complete list of matrix functions, see Statistica Visual Basic library of matrix functions.