
Section 3.1 Roots and Conditioning

Definition 3.1.1.
Given a function \(f(x)\text{,}\) we call a root of \(f(x)\) any number \(r\) such that \(f(r)=0\text{.}\)
Often \(f(x)\) is not known exactly. For instance, \(f(x)\) might be a polynomial whose coefficients we know only approximately. This is the case when \(f(x)\) is the interpolating polynomial of some set of data (see Chapter 7). Or perhaps we know the coefficients exactly but, as we discussed in Chapter 1, they cannot be represented exactly in the floating point system we are using (usually double precision). Hence, the following question plays an important role in determining the "worst case scenario" error in computing a root of \(f(x)\text{.}\)
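As a small illustration of the floating point issue, consider the minimal Python sketch below (an illustrative example, not taken from the original code): even a coefficient as simple as \(0.1\) is stored inexactly in double precision, so the polynomial we actually manipulate is a slightly perturbed one.

```python
from decimal import Decimal

# The coefficient 0.1 has no exact binary representation:
# the stored double is slightly larger than 0.1.
print(Decimal(0.1))
# 0.1000000000000000055511151231257827021181583404541015625

# Hence the stored version of f(x) = x - 0.1 is really
# f(x) = x - 0.1000000000000000055..., whose root is shifted
# from 0.1 by about 5.6e-18.
```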

Let \(f(x)\) be a smooth function and let \(r\) be one of its roots. How much would \(r\) change after we slightly perturb \(f(x)\text{?}\)

Set \(\tilde f(x) = f(x)+\varepsilon h(x)\) and denote by \(r+\delta r\) the corresponding root of \(\tilde f(x)\text{.}\) Then, using the Taylor expansion and the fact that \(f(r)=0\text{,}\)
\begin{equation*} 0 = f(r+\delta r)+\varepsilon h(r+\delta r) = f'(r)\delta r+\varepsilon h(r)+\text{ terms of order 2 in }\varepsilon,\delta r. \end{equation*}
Hence,
\begin{equation*} \delta r \simeq -\frac{\varepsilon h(r)}{f'(r)}. \end{equation*}
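This first-order estimate is easy to verify numerically. The sketch below compares the observed shift of the root with the prediction \(-\varepsilon h(r)/f'(r)\) on a concrete example; the choices of \(f\text{,}\) \(h\) and \(\varepsilon\) are illustrative and not taken from the text.

```python
import numpy as np
from scipy.optimize import brentq

f = lambda x: x**2 - 2.0    # root r = sqrt(2), f'(r) = 2*sqrt(2)
h = lambda x: 1.0           # direction of the perturbation
eps = 1e-6

r = np.sqrt(2.0)
# root of the perturbed function f(x) + eps*h(x), bracketed in [1, 2]
r_tilde = brentq(lambda x: f(x) + eps * h(x), 1.0, 2.0)

print("observed  delta_r:", r_tilde - r)
print("predicted delta_r:", -eps * h(r) / (2.0 * r))   # -eps*h(r)/f'(r)
# the two values agree to several significant digits
```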
Comparing with Section 2.3, this means that the absolute conditioning of the root-finding problem at the root \(r\) is
\begin{equation*} K(r,f) = \frac{1}{|f'(r)|}. \end{equation*}
Notice that this is the conditioning at 0 of the (local) inverse function \(f^{-1}\) -- not surprising, since finding \(r\) means precisely evaluating \(f^{-1}(0)\text{!}\)

In this case it makes no sense to talk about relative conditioning: since \(f(r)=0\text{,}\) the relative change of \(f\) at \(r\) caused by adding a perturbation \(\delta f\text{,}\) with \(\delta f(r)\neq0\text{,}\) is always infinite.

Code examples. The two code examples below show graphically why a small first derivative leads to an amplification of the inaccuracy in the position of the root.

The first code example shows a function with small conditioning, namely a "not small" first derivative at the root. The graph of the function is drawn in red, while the range of uncertainty of the function is shaded in cyan -- in other words, the exact function lies somewhere within the cyan band. Since the derivative at the root is larger than 1, the uncertainty on the root is of the same order as the uncertainty on the function.
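A minimal Matplotlib sketch of this first picture is given below; the function \(f(x)=x^2-2\) and the uncertainty \(\delta=0.1\) are illustrative choices, as the original code cell is not reproduced here.

```python
import numpy as np
import matplotlib.pyplot as plt

f = lambda x: x**2 - 2.0    # root r = sqrt(2), f'(r) = 2*sqrt(2) > 1
delta = 0.1                 # uncertainty on the values of f

x = np.linspace(1.0, 2.0, 400)
# cyan band: the exact function lies somewhere in f(x) +/- delta
plt.fill_between(x, f(x) - delta, f(x) + delta, color='cyan')
plt.plot(x, f(x), 'r')                 # the function itself, in red
plt.axhline(0, color='k', lw=0.5)      # the x-axis
plt.show()
# the band crosses the x-axis over a short interval: the uncertainty
# on the root is comparable to delta
```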
In the second code example, the graph is close to horizontal near the root. Correspondingly, the inaccuracy in the position of the root is greatly amplified by the geometry of the graph.
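The sketch below illustrates this second picture with the function discussed in the next paragraph, \(f(x)=(x-1.5)^2(x-1.6)(x-1.7)^2\text{,}\) for which \(f'(1.6)=10^{-4}\text{;}\) the uncertainty \(\delta=10^{-3}\) follows the discussion below.

```python
import numpy as np
import matplotlib.pyplot as plt

f = lambda x: (x - 1.5)**2 * (x - 1.6) * (x - 1.7)**2
delta = 1e-3                # uncertainty on the values of f

x = np.linspace(1.4, 2.0, 800)
# cyan band: the exact function lies somewhere in f(x) +/- delta
plt.fill_between(x, f(x) - delta, f(x) + delta, color='cyan')
plt.plot(x, f(x), 'r')                 # the function itself, in red
plt.axhline(0, color='k', lw=0.5)      # the x-axis
plt.show()
# the band straddles the x-axis on a wide interval around r = 1.6:
# every point of that interval is a plausible root of the exact function
```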
Notice that, in the example above, \(r=1.6\text{,}\) \(\delta f\simeq 10^{-3}\) and \(f'(r)=10^{-4}\text{,}\) so that in principle \(\delta r\simeq10\text{.}\) The reason why actually \(\delta r\simeq0.3\) is that, at about \(\delta r\simeq0.2\text{,}\) \(f''(x)\) starts growing and the graph ceases to be horizontal -- for instance, \(|f''(1.7)/f'(1.7)|\simeq10^{-12}\) while \(|f''(1.8)/f'(1.8)|\simeq24\text{.}\) This is also why the \(\delta r\) for \(f(x)=(x-1.5)^2(x-1.6)(x-1.7)^2\) is roughly the same as for \(f(x)=(x-1.6)^5\text{,}\) even though, in principle, in this last case \(\delta r=\infty\text{.}\) Of course, in the absence of accurate data on the function, one generally has to assume a "worst-case scenario" and use the upper bound \(1/|f'(r)|\text{.}\)
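The comparison in the last remark can be checked directly; the sketch below perturbs each function by the constant \(-10^{-3}\) (an illustrative choice of \(h\)) and computes the resulting shift of the root \(r=1.6\text{.}\)

```python
from scipy.optimize import brentq

f1 = lambda x: (x - 1.5)**2 * (x - 1.6) * (x - 1.7)**2
f2 = lambda x: (x - 1.6)**5
eps = 1e-3

for f in (f1, f2):
    # root of the perturbed function f(x) - eps, i.e. of f(x) = eps
    r_tilde = brentq(lambda x: f(x) - eps, 1.6, 2.5)
    print("delta_r =", r_tilde - 1.6)
# both shifts come out around 0.25-0.27, even though the bound
# 1/|f'(1.6)| is 1e4 for f1 and infinite for f2
```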