All pastes #1255734 Raw Edit

lowess.py

public python v1 · immutable
#1255734 ·published 2008-11-14 01:21 UTC
rendered paste body
"""This module implements the Lowess function for nonparametric regression.Functions:lowess        Fit a smooth nonparametric regression curve to a scatterplot.For more information, seeWilliam S. Cleveland: "Robust locally weighted regression and smoothingscatterplots", Journal of the American Statistical Association, December 1979,volume 74, number 368, pp. 829-836.William S. Cleveland and Susan J. Devlin: "Locally weighted regression: Anapproach to regression analysis by local fitting", Journal of the AmericanStatistical Association, September 1988, volume 83, number 403, pp. 596-610."""import numpytry:    from Bio.Cluster import median    # The function median in Bio.Cluster is faster than the function median    # in NumPy, as it does not require a full sort.except ImportError, x:    # Use the median function in NumPy if Bio.Cluster is not available    from numpy import mediandef lowess(x, y, f=2./3., iter=3):    """lowess(x, y, f=2./3., iter=3) -> yestLowess smoother: Robust locally weighted regression.The lowess function fits a nonparametric regression curve to a scatterplot.The arrays x and y contain an equal number of elements; each pair(x[i], y[i]) defines a data point in the scatterplot. The function returnsthe estimated (smooth) values of y.The smoothing span is given by f. A larger value for f will result in asmoother curve. The number of robustifying iterations is given by iter. Thefunction will run faster with a smaller number of iterations."""    n = len(x)    r = int(numpy.ceil(f*n))    h = [numpy.sort(numpy.abs(x-x[i]))[r] for i in range(n)]    w = numpy.clip(numpy.abs(([x]-numpy.transpose([x]))/h),0.0,1.0)    w = 1-w*w*w    w = w*w*w    yest = numpy.zeros(n)    delta = numpy.ones(n)    for iteration in range(iter):        for i in range(n):            weights = delta * w[:,i]            theta = weights*x            b_top = sum(weights*y)            b_bot = sum(theta*y)            a = sum(weights)            b = sum(theta)            d = sum(theta*x)            yest[i] = (d*b_top-b*b_bot+(a*b_bot-b*b_top)*x[i])/(a*d-b**2)        residuals = y-yest        s = numpy.median(abs(residuals))        delta = numpy.clip(residuals/(6*s),-1,1)        delta = 1-delta*delta        delta = delta*delta    return yest