Least Squares Fitting

Author: Acton, F. S.

Source: Analysis of Straight-Line Data. New York: Dover, 1966.

Volume and page: ...

28-3-2021

A mathematical procedure for finding the best-fitting curve to a given set of points by minimizing the sum of the squares of the offsets ("the residuals") of the points from the curve. The sum of the squares of the offsets is used instead of the offset absolute values because this allows the residuals to be treated as a continuous differentiable quantity. However, because squares of the offsets are used, outlying points can have a disproportionate effect on the fit, a property which may or may not be desirable depending on the problem at hand.

[Figure: least squares offsets]

In practice, the vertical offsets from a line (polynomial, surface, hyperplane, etc.) are almost always minimized instead of the perpendicular offsets. This provides a fitting function for the independent variable $x$ that estimates $y$ for a given $x$ (most often what an experimenter wants), allows uncertainties of the data points along the $x$- and $y$-axes to be incorporated simply, and also provides a much simpler analytic form for the fitting parameters than would be obtained using a fit based on perpendicular offsets. In addition, the fitting technique can be easily generalized from a best-fit line to a best-fit polynomial when sums of vertical distances are used. In any case, for a reasonable number of noisy data points, the difference between vertical and perpendicular fits is quite small.

The linear least squares fitting technique is the simplest and most commonly applied form of linear regression and provides a solution to the problem of finding the best-fitting straight line through a set of points. In fact, if the functional relationship between the two quantities being graphed is known to within additive or multiplicative constants, it is common practice to transform the data in such a way that the resulting plot is a straight line, say by plotting $T$ vs. $\sqrt{l}$ instead of $T$ vs. $l$ when analyzing the period $T$ of a pendulum as a function of its length $l$. For this reason, standard forms for exponential, logarithmic, and power laws are often explicitly computed. The formulas for linear least squares fitting were independently derived by Gauss and Legendre.
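
As a rough illustration of such a transformation, the sketch below fits the period of an ideal pendulum against $\sqrt{l}$ rather than $l$. The data are synthetic, generated from $T = 2\pi\sqrt{l/g}$ with an assumed $g = 9.81$ m/s², and `fit_line` is a hypothetical helper name that applies the closed-form slope and intercept derived later in this article.

```python
import math

def fit_line(xs, ys):
    """Return (a, b) minimizing sum((y - (a + b*x))**2) over the data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = ss_xy / ss_xx
    return y_bar - b * x_bar, b

G = 9.81                                  # assumed value, m/s^2
lengths = [0.25, 0.5, 1.0, 1.5, 2.0]      # synthetic lengths, m
periods = [2 * math.pi * math.sqrt(l / G) for l in lengths]

# T is not linear in l, but it is linear in sqrt(l).
a, b = fit_line([math.sqrt(l) for l in lengths], periods)
g_est = (2 * math.pi / b) ** 2            # since the slope is 2*pi/sqrt(g)
print(a, b, g_est)
```

With noise-free data the intercept comes out at essentially zero and the slope recovers the assumed $g$; with real measurements both would carry uncertainty.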

For nonlinear least squares fitting to a number of unknown parameters, linear least squares fitting may be applied iteratively to a linearized form of the function until convergence is achieved. However, it is often also possible to linearize a nonlinear function at the outset and still use linear methods for determining fit parameters without resorting to iterative procedures. This approach does commonly violate the implicit assumption that the distribution of errors is normal, but often still gives acceptable results using normal equations, a pseudoinverse, etc. Depending on the type of fit and initial parameters chosen, the nonlinear fit may have good or poor convergence properties. If uncertainties (in the most general case, error ellipses) are given for the points, points can be weighted differently in order to give the high-quality points more weight.
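
One common way to realize the iterative scheme is the Gauss-Newton method, named here as an illustrative choice rather than something the source prescribes. The sketch below fits the assumed model $y = A e^{Bx}$: at each step the model is linearized about the current $(A, B)$ and a two-parameter linear least squares problem is solved for the update. The function name, the synthetic data, and the starting guess are all assumptions.

```python
import math

def gauss_newton_exp(xs, ys, A, B, iterations=20):
    """Fit y = A*exp(B*x) by repeatedly solving a linearized least squares problem."""
    for _ in range(iterations):
        e = [math.exp(B * x) for x in xs]
        r = [y - A * ei for y, ei in zip(ys, e)]        # current residuals
        jA = e                                          # d(model)/dA
        jB = [A * x * ei for x, ei in zip(xs, e)]       # d(model)/dB
        # Normal equations (J^T J) delta = J^T r for the update (dA, dB)
        saa = sum(u * u for u in jA)
        sab = sum(u * v for u, v in zip(jA, jB))
        sbb = sum(v * v for v in jB)
        ra = sum(u * ri for u, ri in zip(jA, r))
        rb = sum(v * ri for v, ri in zip(jB, r))
        det = saa * sbb - sab * sab
        A += (ra * sbb - sab * rb) / det
        B += (saa * rb - sab * ra) / det
    return A, B

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]   # synthetic, noise-free data
A_fit, B_fit = gauss_newton_exp(xs, ys, A=1.5, B=0.6)
print(A_fit, B_fit)
```

The fixed iteration count keeps the sketch short; a practical version would stop on a convergence test and guard against a poor starting guess, which is where the good or poor convergence behavior mentioned above shows up.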

Vertical least squares fitting proceeds by finding the sum of the squares of the vertical deviations $R^2$ of a set of $n$ data points

$$R^2 = \sum \left[y_i - f(x_i, a_1, a_2, \ldots, a_n)\right]^2 \tag{1}$$

from a function f. Note that this procedure does not minimize the actual deviations from the line (which would be measured perpendicular to the given function). In addition, although the unsquared sum of distances might seem a more appropriate quantity to minimize, use of the absolute value results in discontinuous derivatives which cannot be treated analytically. The square deviations from each point are therefore summed, and the resulting residual is then minimized to find the best fit line. This procedure results in outlying points being given disproportionately large weighting.
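
The disproportionate weight given to outlying points can be seen in a small experiment. The sketch below uses made-up data lying exactly on the line $y = x$, then moves a single point far off the line and refits; `fit_line` is a hypothetical helper using the closed-form solution derived later in this article.

```python
def fit_line(xs, ys):
    """Return (a, b) minimizing sum((y - (a + b*x))**2) over the data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = ss_xy / ss_xx
    return y_bar - b * x_bar, b

xs = [0, 1, 2, 3, 4]
clean = [0, 1, 2, 3, 4]            # points exactly on y = x
with_outlier = [0, 1, 2, 3, 14]    # last point displaced by +10

a_clean, b_clean = fit_line(xs, clean)
a_out, b_out = fit_line(xs, with_outlier)
print(b_clean, b_out)              # slope 1.0 without the outlier, 3.0 with it
```

One displaced point out of five is enough to triple the fitted slope in this example.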

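The condition for $R^2$ to be a minimum is that

$$\frac{\partial (R^2)}{\partial a_i} = 0 \tag{2}$$

for $i = 1, \ldots, n$. For a linear fit,

$$f(a, b) = a + b x, \tag{3}$$

so

$$R^2(a, b) = \sum_{i=1}^{n} \left[y_i - (a + b x_i)\right]^2 \tag{4}$$

$$\frac{\partial (R^2)}{\partial a} = -2 \sum_{i=1}^{n} \left[y_i - (a + b x_i)\right] = 0 \tag{5}$$

$$\frac{\partial (R^2)}{\partial b} = -2 \sum_{i=1}^{n} \left[y_i - (a + b x_i)\right] x_i = 0. \tag{6}$$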

These lead to the equations

$$n a + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i \tag{7}$$

$$a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i. \tag{8}$$
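
A minimal numerical check of (5) through (8), using a small made-up data set: the two equations are solved for $a$ and $b$ (here by Cramer's rule, one of several possible choices), and the partial derivatives of $R^2$ are then evaluated at the solution to confirm that they vanish.

```python
xs = [1, 2, 3, 4, 5]   # illustrative data, not from the source
ys = [2, 4, 5, 4, 5]
n = len(xs)

sx = sum(xs)
sy = sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))

# Solve  n*a + sx*b = sy  and  sx*a + sxx*b = sxy  by Cramer's rule.
det = n * sxx - sx * sx
a = (sy * sxx - sx * sxy) / det
b = (n * sxy - sx * sy) / det

# Partial derivatives of R^2 from (5) and (6), evaluated at (a, b).
grad_a = -2 * sum(y - (a + b * x) for x, y in zip(xs, ys))
grad_b = -2 * sum((y - (a + b * x)) * x for x, y in zip(xs, ys))
print(a, b, grad_a, grad_b)
```

For this data the solution is approximately $a = 2.2$, $b = 0.6$, and both gradients are zero up to floating-point rounding.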

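In matrix form,

$$\begin{bmatrix} n & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \end{bmatrix}, \tag{9}$$

so

$$\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} n & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \end{bmatrix}^{-1} \begin{bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \end{bmatrix}. \tag{10}$$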

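The 2×2 matrix inverse is

$$\begin{bmatrix} a \\ b \end{bmatrix} = \frac{1}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} \begin{bmatrix} \sum_{i=1}^{n} y_i \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i \\ n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i \end{bmatrix}, \tag{11}$$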

so

$$a = \frac{\sum_{i=1}^{n} y_i \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} \tag{12}$$

$$\phantom{a} = \frac{\bar{y} \sum_{i=1}^{n} x_i^2 - \bar{x} \sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} \tag{13}$$

$$b = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} \tag{14}$$

$$\phantom{b} = \frac{\left(\sum_{i=1}^{n} x_i y_i\right) - n \bar{x} \bar{y}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} \tag{15}$$

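(Kenney and Keeping 1962). These can be rewritten in a simpler form by defining the sums of squares

$$ss_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 \tag{16}$$

$$\phantom{ss_{xx}} = \left(\sum_{i=1}^{n} x_i^2\right) - n \bar{x}^2 \tag{17}$$

$$ss_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2 \tag{18}$$

$$\phantom{ss_{yy}} = \left(\sum_{i=1}^{n} y_i^2\right) - n \bar{y}^2 \tag{19}$$

$$ss_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \tag{20}$$

$$\phantom{ss_{xy}} = \left(\sum_{i=1}^{n} x_i y_i\right) - n \bar{x} \bar{y}, \tag{21}$$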
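
The two forms of each sum of squares can be compared directly. In the sketch below, `sums_of_squares` follows the centered definitions (16), (18), (20), and `sums_of_squares_shortcut` follows the expanded forms (17), (19), (21); both function names and the data are illustrative assumptions.

```python
def sums_of_squares(xs, ys):
    """Centered definitions (16), (18), (20)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return ss_xx, ss_yy, ss_xy

def sums_of_squares_shortcut(xs, ys):
    """Expanded forms (17), (19), (21)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum(x * x for x in xs) - n * x_bar ** 2
    ss_yy = sum(y * y for y in ys) - n * y_bar ** 2
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    return ss_xx, ss_yy, ss_xy

xs = [1, 2, 3, 4, 5]   # illustrative data
ys = [2, 4, 5, 4, 5]
centered = sums_of_squares(xs, ys)
shortcut = sums_of_squares_shortcut(xs, ys)
print(centered, shortcut)
```

The expanded forms need only running totals, but they subtract two potentially large numbers, so in floating point the centered definitions tend to be safer when the mean is large compared with the spread of the data.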

which are also written as

$$\sigma_x^2 = \frac{ss_{xx}}{n} \tag{22}$$

$$\sigma_y^2 = \frac{ss_{yy}}{n} \tag{23}$$

$$\operatorname{cov}(x, y) = \frac{ss_{xy}}{n}. \tag{24}$$

Here, $\operatorname{cov}(x, y)$ is the covariance and $\sigma_x^2$ and $\sigma_y^2$ are variances. Note that the quantities $\sum_{i=1}^{n} x_i y_i$ and $\sum_{i=1}^{n} x_i^2$ can also be interpreted as the dot products

$$\sum_{i=1}^{n} x_i^2 = \mathbf{x} \cdot \mathbf{x} \tag{25}$$

$$\sum_{i=1}^{n} x_i y_i = \mathbf{x} \cdot \mathbf{y}. \tag{26}$$
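
The variance, covariance, and dot-product readings can be checked on a small made-up data set. Because (22) to (24) divide by $n$, the matching standard-library function is `statistics.pvariance` (population variance); `statistics.variance` divides by $n - 1$ and would not agree. The helper `dot` is a hypothetical name.

```python
import statistics

def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(u, v))

xs = [1, 2, 3, 4, 5]   # illustrative data
ys = [2, 4, 5, 4, 5]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Sums of squares written with dot products, as in (25) and (26).
ss_xx = dot(xs, xs) - n * x_bar ** 2
ss_xy = dot(xs, ys) - n * x_bar * y_bar

var_x = ss_xx / n      # equation (22)
cov_xy = ss_xy / n     # equation (24)
print(var_x, statistics.pvariance(xs), cov_xy)
```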

In terms of the sums of squares, the regression coefficient $b$ is given by

$$b = \frac{\operatorname{cov}(x, y)}{\sigma_x^2} = \frac{ss_{xy}}{ss_{xx}}, \tag{27}$$

and $a$ is given in terms of $b$, by dividing (7) by $n$, as

$$a = \bar{y} - b \bar{x}. \tag{28}$$
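
As a consistency check, the sketch below computes $b$ and $a$ from (27) and (28) and compares them with the raw-sum expressions (14) and (12) on made-up data. The variable names are illustrative.

```python
xs = [1, 2, 3, 4, 5]   # illustrative data
ys = [2, 4, 5, 4, 5]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

ss_xx = sum((x - x_bar) ** 2 for x in xs)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

b = ss_xy / ss_xx          # equation (27)
a = y_bar - b * x_bar      # equation (28)

# The same coefficients from the raw sums, equations (12) and (14).
sx, sy = sum(xs), sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))
det = n * sxx - sx ** 2
a_raw = (sy * sxx - sx * sxy) / det
b_raw = (n * sxy - sx * sy) / det
print(a, b, a_raw, b_raw)
```

Both routes give approximately $a = 2.2$ and $b = 0.6$ for this data.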

The overall quality of the fit is then parameterized in terms of a quantity known as the correlation coefficient $r$, whose square is given by

$$r^2 = \frac{ss_{xy}^2}{ss_{xx}\, ss_{yy}}, \tag{29}$$

which gives the proportion of $ss_{yy}$ which is accounted for by the regression.
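
The "proportion accounted for" reading can be verified numerically. In the sketch below (made-up data, illustrative names), $r^2$ from (29) is compared with one minus the ratio of the residual sum of squares to $ss_{yy}$; for a least squares line with an intercept the two should agree.

```python
xs = [1, 2, 3, 4, 5]   # illustrative data
ys = [2, 4, 5, 4, 5]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

ss_xx = sum((x - x_bar) ** 2 for x in xs)
ss_yy = sum((y - y_bar) ** 2 for y in ys)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

r_squared = ss_xy ** 2 / (ss_xx * ss_yy)     # equation (29)

# Fraction of ss_yy left unexplained by the fitted line.
b = ss_xy / ss_xx
a = y_bar - b * x_bar
residual_ss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
explained_fraction = 1 - residual_ss / ss_yy
print(r_squared, explained_fraction)
```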

Let $\hat{y}_i$ be the vertical coordinate of the best-fit line with $x$-coordinate $x_i$, so

$$\hat{y}_i = a + b x_i, \tag{30}$$

then the error between the actual vertical point $y_i$ and the fitted point is given by

$$e_i = y_i - \hat{y}_i. \tag{31}$$

Now define $s^2$ as an estimator for the variance in $e_i$,

$$s^2 = \sum_{i=1}^{n} \frac{e_i^2}{n - 2}. \tag{32}$$

Then $s$ can be given by

$$s = \sqrt{\frac{ss_{yy} - b\, ss_{xy}}{n - 2}} = \sqrt{\frac{ss_{yy} - \dfrac{ss_{xy}^2}{ss_{xx}}}{n - 2}} \tag{33}$$

(Acton 1966, pp. 32-35; Gonick and Smith 1993, pp. 202-204).
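
The agreement between the residual-based definition (32) and the shortcut (33) can be checked on made-up data. The variable names below are illustrative.

```python
import math

xs = [1, 2, 3, 4, 5]   # illustrative data
ys = [2, 4, 5, 4, 5]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

ss_xx = sum((x - x_bar) ** 2 for x in xs)
ss_yy = sum((y - y_bar) ** 2 for y in ys)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b = ss_xy / ss_xx
a = y_bar - b * x_bar

# Definition (32): sum of squared residuals divided by n - 2.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
s_from_residuals = math.sqrt(sum(e * e for e in residuals) / (n - 2))

# Shortcut (33): no need to form the residuals explicitly.
s_from_sums = math.sqrt((ss_yy - b * ss_xy) / (n - 2))
print(s_from_residuals, s_from_sums)
```

The divisor $n - 2$ reflects the two fitted parameters, so at least three data points are needed for $s$ to be defined.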

The standard errors for $a$ and $b$ are

$$\mathrm{SE}(a) = s \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{ss_{xx}}} \tag{34}$$

$$\mathrm{SE}(b) = \frac{s}{\sqrt{ss_{xx}}}. \tag{35}$$
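
Putting the pieces together, a small function can return the coefficients along with their standard errors from (34) and (35). This is a sketch under the assumptions of this article (vertical offsets, unweighted points); the function name `line_fit_with_errors` and the data are hypothetical.

```python
import math

def line_fit_with_errors(xs, ys):
    """Return (a, b, se_a, se_b) for the least squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = ss_xy / ss_xx
    a = y_bar - b * x_bar
    s = math.sqrt((ss_yy - b * ss_xy) / (n - 2))          # equation (33)
    se_a = s * math.sqrt(1 / n + x_bar ** 2 / ss_xx)      # equation (34)
    se_b = s / math.sqrt(ss_xx)                           # equation (35)
    return a, b, se_a, se_b

a, b, se_a, se_b = line_fit_with_errors([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(a, b, se_a, se_b)
```

For this data $\mathrm{SE}(a)^2 = 0.88$ and $\mathrm{SE}(b)^2 = 0.08$, so the intercept is considerably less well determined than the slope.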


REFERENCES:

Acton, F. S. Analysis of Straight-Line Data. New York: Dover, 1966.

Bevington, P. R. Data Reduction and Error Analysis for the Physical Sciences. New York: McGraw-Hill, 1969.

Chatterjee, S.; Hadi, A.; and Price, B. "Simple Linear Regression." Ch. 2 in Regression Analysis by Example, 3rd ed. New York: Wiley, pp. 21-50, 2000.

Edwards, A. L. "The Regression Line Y on X." Ch. 3 in An Introduction to Linear Regression and Correlation. San Francisco, CA: W. H. Freeman, pp. 20-32, 1976.

Farebrother, R. W. Fitting Linear Relationships: A History of the Calculus of Observations 1750-1900. New York: Springer-Verlag, 1999.

Gauss, C. F. "Theoria combinationis observationum erroribus minimis obnoxiae." Werke, Vol. 4. Göttingen, Germany: p. 1, 1823.

Gonick, L. and Smith, W. The Cartoon Guide to Statistics. New York: Harper Perennial, 1993.

Kenney, J. F. and Keeping, E. S. "Linear Regression, Simple Correlation, and Contingency." Ch. 8 in Mathematics of Statistics, Pt. 2, 2nd ed. Princeton, NJ: Van Nostrand, pp. 199-237, 1951.

Kenney, J. F. and Keeping, E. S. "Linear Regression and Correlation." Ch. 15 in Mathematics of Statistics, Pt. 1, 3rd ed. Princeton, NJ: Van Nostrand, pp. 252-285, 1962.

Lancaster, P. and Šalkauskas, K. Curve and Surface Fitting: An Introduction. London: Academic Press, 1986.

Laplace, P. S. "Des méthodes analytiques du Calcul des Probabilités." Ch. 4 in Théorie analytique des probabilités, Livre 2, 3rd ed. Paris: Courcier, 1820.

Lawson, C. and Hanson, R. Solving Least Squares Problems. Englewood Cliffs, NJ: Prentice-Hall, 1974.

Ledvij, M. "Curve Fitting Made Easy." Industrial Physicist 9, 24-27, Apr./May 2003.

Nash, J. C. Compact Numerical Methods for Computers: Linear Algebra and Function Minimisation, 2nd ed. Bristol, England: Adam Hilger, pp. 21-24, 1990.

Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; and Vetterling, W. T. "Fitting Data to a Straight Line," "Straight-Line Data with Errors in Both Coordinates," and "General Linear Least Squares." §15.2, 15.3, and 15.4 in Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd ed. Cambridge, England: Cambridge University Press, pp. 655-675, 1992.

Whittaker, E. T. and Robinson, G. "The Method of Least Squares." Ch. 9 in The Calculus of Observations: A Treatise on Numerical Mathematics, 4th ed. New York: Dover, pp. 209-, 1967.

York, D. "Least-Square Fitting of a Straight Line." Canad. J. Phys. 44, 1079-1086, 1966.
