However, the square of the vertical distance is only one of many reasonable ways of measuring the distance from the data points to the line mx + b. For example, it is perfectly reasonable to consider instead the expression 1 + (m x_i + b - y_i)^2. If (x_i, y_i) is on the line, this value would be equal to 1, and its logarithm would then be zero. Thus, the function obtained by summing ln(1 + (m x_i + b - y_i)^2) over all the data points is nonnegative, and it vanishes exactly when the line passes through every data point. Another natural measure of the total error is the sum of the absolute values |m x_i + b - y_i|.
There is a priori no particular reason to prefer one error function over another, and for each such choice we could end up with a problem whose solution exists and produces a line that best fits the data. We could then compute the value of the error function for the best-fit line, a value that depends upon the choice of function made. A canonical way of choosing the best error function might be to pick the one for which this number is smallest, but finding such a function is a difficult, if not impossible, problem to solve. If we limit our attention to linear estimators, in a sense to be explained in the last section of this chapter, the solution obtained in §1 is optimal. Here we pursue further the problem of finding the best fit according to the two error functions introduced above.
Notice that if ε_i = m x_i + b - y_i, data with large errors ε_i exert a huge influence on the original error function of §1 (because we use ε_i^2). The two error functions mentioned above grow linearly (or sub-linearly) with |ε_i|, causing the tails to count much less heavily. We will see this when comparing the different results.
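To see the difference in growth concretely, here is a small Python comparison of the three penalties on a few sample residual values (the residuals are arbitrary illustrations, not the data used below):

```python
import math

# Arbitrary sample residuals, from small to large
residuals = [0.1, 1.0, 10.0, 100.0]

for e in residuals:
    sq = e**2                       # least-squares penalty of §1
    logp = math.log(1 + e**2 / 2)   # logarithmic penalty
    absp = abs(e)                   # absolute-value penalty
    print(f"eps = {e:6.1f}:  eps^2 = {sq:10.2f}   ln(1+eps^2/2) = {logp:6.3f}   |eps| = {absp:6.1f}")
```

For a residual of 100 the squared penalty is 10000 while the logarithmic one is under 9, which is exactly why outliers dominate the least-squares fit but not the robust ones.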
In order to solve the problem now at hand, we first generate some ``noisy'' data. We load the routines in lsq_data.txt:
> read(`lsq_data.txt`);
The data and its graphical visualization can then be obtained by:
> pts:=bad_line_pts():
> plot(pts,style=point);
> epsilon:= (pt,m,b) -> (m*pt[1]+b-pt[2]);
In order to compare results, let us first find the line given by least squares:
> H:=(m,b)->sum(epsilon(pts[i],m,b)^2,i=1..nops(pts));
> sol:=solve({diff(H(m,b),m)=0, diff(H(m,b),b)=0},{m,b});
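As a cross-check outside Maple, the same least-squares computation can be sketched in Python. The data here is synthetic, standing in for bad_line_pts; the "true" slope and intercept are assumptions of the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noisy data around an assumed line y = -3x - 14
x = np.linspace(-10, 10, 50)
y = -3 * x - 14 + rng.normal(scale=2.0, size=x.size)

# Setting dH/dm = dH/db = 0 gives the normal equations A^T A [m, b]^T = A^T y,
# which lstsq solves directly
A = np.column_stack([x, np.ones_like(x)])
m, b = np.linalg.lstsq(A, y, rcond=None)[0]
print(f"least-squares fit: m = {m:.3f}, b = {b:.3f}")
```

The recovered slope and intercept land close to the assumed values, since squared-error fitting is optimal for this kind of moderate Gaussian noise.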
> p := plot(pts,style=point):
l := plot(subs(sol,m*x+b),x=-10..10):
plots[display](p,l,view=-40..20);
Now, let's try minimizing the sum of ln(1 + ε_i^2/2), which behaves like ε_i^2/2 for small errors (since ln(1 + t) ≈ t near t = 0), but grows very slowly for large ones.
> R:=(m,b)->sum(ln(1+epsilon(pts[i],m,b)^2/2),i=1..nops(pts));
Here's what the functional we want to minimize looks like:
> plot3d(R(m,b),m=-5..0,b=-20..0,style=patchcontour, axes=boxed);
> plot3d(R(m,b),m=-4..-2,b=-18..-12,view=30..80, style=patchcontour, axes=boxed);
The equations we want to solve are not linear. In fact, they're messy enough that Maple needs some help to find the minimum. From the pictures above, we can choose an appropriate region.
> rs:=fsolve( {diff(R(m,b),m)=0, diff(R(m,b),b)=0},
{m,b},m=-3.1..-2.8, b=-15..-14);
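The same robust minimization can be sketched in Python on hypothetical data with a few gross outliers, using scipy's derivative-free Nelder-Mead method in place of fsolve on the gradient equations:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Hypothetical stand-in for bad_line_pts: a line y = -3x - 14 plus noise,
# with a few gross outliers that drag the least-squares fit
x = np.linspace(-10, 10, 50)
y = -3 * x - 14 + rng.normal(scale=1.0, size=x.size)
y[:5] += 40

def R(p):
    m, b = p
    eps = m * x + b - y
    return np.sum(np.log(1 + eps**2 / 2))

# Like fsolve above, the minimizer needs a reasonable starting point;
# the least-squares fit is a natural choice.
A = np.column_stack([x, np.ones_like(x)])
p0 = np.linalg.lstsq(A, y, rcond=None)[0]

res = minimize(R, x0=p0, method="Nelder-Mead")
m_rob, b_rob = res.x
print(f"least squares: m = {p0[0]:.2f}, b = {p0[1]:.2f}")
print(f"robust fit:    m = {m_rob:.2f}, b = {b_rob:.2f}")
```

The outliers visibly pull the least-squares line, while the logarithmic penalty largely ignores them and recovers something close to the assumed line.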
And here we can compare the two lines we have found.
> p := plot(pts,style=point):
l := plot(subs(sol,m*x+b),x=-10..10):
s := plot(subs(rs,m*x+b),x=-10..10,color=blue):
plots[display](p,l,s,view=-40..20);
> plots[display](p,l,s);
Let us now try to minimize the sum of |ε_i|. Since this function is not even differentiable near ε_i = 0, we have to work harder to get less:
> A:=(m,b)->sum(abs(epsilon(pts[i],m,b)),i=1..nops(pts));
Displaying the graph of this function gives us an idea about where its
minimum lies:
> plot3d(A(m,b),m=-10..5,b=-30..5,style=patchcontour, axes=boxed);
> plot3d(A(m,b),m=-4..-2,b=-15..-13,style=patchcontour, axes=boxed);
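This absolute-value minimization can also be sketched in Python; since |ε| is not differentiable wherever some ε_i = 0, a derivative-free method such as Nelder-Mead is appropriate. The data below is again a hypothetical stand-in for bad_line_pts:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)

# Hypothetical noisy data around an assumed line y = -3x - 14,
# with some gross outliers
x = np.linspace(-10, 10, 50)
y = -3 * x - 14 + rng.normal(scale=1.0, size=x.size)
y[:5] -= 40

def A_err(p):
    m, b = p
    return np.sum(np.abs(m * x + b - y))

# The sum of |eps_i| is convex but has kinks, so we avoid
# gradient-based solvers and use Nelder-Mead instead
res = minimize(A_err, x0=[0.0, 0.0], method="Nelder-Mead")
m1, b1 = res.x
print(f"least absolute deviations: m = {m1:.2f}, b = {b1:.2f}")
```

Because the objective is convex, the simplex search converges from a crude starting point, and the linearly growing penalty keeps the outliers from dominating the fit.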