
Section 4.6 Iterative Methods

Just as in the case of equations in one variable, \(n\times n\) linear systems can be solved by iterative methods.

The main idea of these methods can be illustrated in the elementary case of a linear equation in one variable. Rewrite
\begin{equation*} ax=b \end{equation*}
as
\begin{equation*} x=(1-a)x+b. \end{equation*}
Hence, a solution of \(ax=b\) is a fixed point of
\begin{equation*} g(x)=(1-a)x+b. \end{equation*}
When is this fixed point attractive? We already know the answer: \(g'(x)=1-a\text{,}\) so it is attractive when \(|1-a| < 1\text{.}\)

Since the derivative is constant, in this case the sequence
\begin{equation*} x_0,g(x_0),g(g(x_0)),\dots \end{equation*}
converges to the unique solution of \(ax=b\text{,}\) namely \(b/a\text{,}\) regardless of the value of \(x_0\text{.}\)
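To see the convergence concretely, here is a minimal Python sketch with illustrative values \(a=0.7\) and \(b=2\) (so that \(|1-a|=0.3<1\)):

```python
# Fixed-point iteration for a*x = b rewritten as x = (1 - a)*x + b.
# Since |1 - a| = 0.3 < 1, the iterates converge to the solution b/a.
a, b = 0.7, 2.0

x = 10.0  # arbitrary starting point
for _ in range(200):
    x = (1 - a) * x + b

print(x)  # close to b/a = 2.857...
```

The error shrinks by a factor \(|1-a|\) at every step, so after 200 iterations it is far below machine precision.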

We can repeat this for any \(n\times n\) non-degenerate linear system
\begin{equation*} Ax=b. \end{equation*}
We re-write the system as
\begin{equation*} x = ({\Bbb I}_n-A)x+b, \end{equation*}
where \({\Bbb I}_n\) is the identity matrix, so that the solution is a fixed point of the linear map
\begin{equation*} g(x) = ({\Bbb I}_n-A)x+b. \end{equation*}
When will the iterates
\begin{equation*} x_0,g(x_0),g(g(x_0)),\dots \end{equation*}
converge to the solution \(A^{-1}b\text{?}\)
Notice that
\begin{gather*} g(x) = ({\Bbb I}_n-A)x+b\\ g(g(x)) = ({\Bbb I}_n-A)\left(({\Bbb I}_n-A)x+b\right)+b = ({\Bbb I}_n-A)^2x+({\Bbb I}_n-A)b+b \end{gather*}
so that, in general,
\begin{equation*} g^n(x) = ({\Bbb I}_n-A)^nx + \left[({\Bbb I}_n-A)^{n-1} + \dots + ({\Bbb I}_n-A) + {\Bbb I}_n\right]b. \end{equation*}
If \(\|{\Bbb I}_n-A\|<1\text{,}\) then \(({\Bbb I}_n-A)^n\to0\) and, as with the geometric series,
\begin{equation*} ({\Bbb I}_n-A)^{n-1} + \dots + ({\Bbb I}_n-A) + {\Bbb I}_n \to \left[{\Bbb I}_n-({\Bbb I}_n-A)\right]^{-1}=A^{-1}. \end{equation*}
Hence
\begin{equation*} g^n(x) \to A^{-1}b, \end{equation*}
which is the unique solution of \(Ax=b\text{.}\)
For instance, let us apply this naive method to the system
\begin{equation*} \begin{pmatrix} 0.8&0.9\\ 0.3&0.6\\ \end{pmatrix}\cdot \begin{pmatrix} x\\ y\\ \end{pmatrix} = \begin{pmatrix} 3\\ 5\\ \end{pmatrix}, \end{equation*}
whose solution is \(x=-12.85...\text{,}\) \(y=14.76...\)
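A minimal Python sketch of this iteration (the step count 500 is an arbitrary choice; convergence here follows from the spectral radius of \({\Bbb I}_2-A\) being about \(0.83\text{,}\) below 1):

```python
import numpy as np

# Naive iteration x_{k+1} = (I - A) x_k + b for the 2x2 system above.
A = np.array([[0.8, 0.9],
              [0.3, 0.6]])
b = np.array([3.0, 5.0])

M = np.eye(2) - A  # iteration matrix I - A, spectral radius ~ 0.83
x = np.zeros(2)    # arbitrary starting point
for _ in range(500):
    x = M @ x + b

print(x)  # approximately [-12.857, 14.762]
```

The error shrinks roughly by a factor \(0.83\) per step, so the convergence is slow but sure.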
Jacobi and Gauss-Seidel Methods. Adding \(({\Bbb I}_n-A)x\) to both sides, as done above, is not optimal because the condition \(\|{\Bbb I}_n-A\|<1\) is quite strict.

In general, one adds \((Q-A)x\) to both sides and then "divides by \(Q\)", so that the iterating function is \(g(x) = ({\Bbb I}_n-Q^{-1}A)x+Q^{-1}b\text{.}\)

The following two alternative choices are often used:
  1. \(Q=diag(A)\text{,}\) the diagonal part of \(A\) (Jacobi method);
  2. \(Q=tril(A)\text{,}\) the lower-triangular part of \(A\) (Gauss-Seidel method).
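The general scheme can be sketched in Python as follows (the function name q_iteration and the fixed step count are illustrative; solving a system with \(Q\) at each step avoids forming \(Q^{-1}\) explicitly):

```python
import numpy as np

# Generic iteration x_{k+1} = (I - Q^{-1} A) x_k + Q^{-1} b.
def q_iteration(A, b, Q, steps=100):
    x = np.zeros(len(b))
    for _ in range(steps):
        # x + Q^{-1}(b - A x)  is the same as  (I - Q^{-1} A) x + Q^{-1} b
        x = x + np.linalg.solve(Q, b - A @ x)
    return x

# Small illustrative system: 3x + y = 4, 2x + 4y = 6, with solution (1, 1).
A = np.array([[3.0, 1.0],
              [2.0, 4.0]])
b = np.array([4.0, 6.0])

print(q_iteration(A, b, np.diag(np.diag(A))))  # Jacobi choice Q = diag(A)
print(q_iteration(A, b, np.tril(A)))           # Gauss-Seidel choice Q = tril(A)
```

Both calls converge to the exact solution \((1,1)\) of the illustrative system.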

When can we be sure that the choices above give rise to a convergent method? Recall that a matrix is strictly diagonally dominant if every diagonal entry is larger, in absolute value, than the sum of the absolute values of all the other entries in the same row. A classical theorem guarantees that both the Jacobi and the Gauss-Seidel methods converge whenever the matrix is strictly diagonally dominant.
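The row condition is easy to check programmatically; the helper below (its name is illustrative) tests it:

```python
import numpy as np

# Strict diagonal dominance: |a_ii| > sum of |a_ij| over j != i, for every row.
def is_strictly_diagonally_dominant(A):
    A = np.abs(np.asarray(A, dtype=float))
    diag = np.diag(A)
    off_row_sums = A.sum(axis=1) - diag
    return bool(np.all(diag > off_row_sums))

print(is_strictly_diagonally_dominant([[3, 1], [2, 4]]))        # True
print(is_strictly_diagonally_dominant([[0.8, 0.9], [0.3, 0.6]]))  # False
```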

A trivial example of a diagonally dominant matrix is an invertible diagonal matrix. Other important cases of diagonally dominant matrices come from the discretization of ordinary and partial differential equations (see the next section for a worked-out example).

For instance, consider the diagonally dominant system

\begin{equation*} \begin{pmatrix} 3&1\\ 2&4\\ \end{pmatrix} \begin{pmatrix} x\\ y\\ \end{pmatrix} = \begin{pmatrix} 4\\ 6\\ \end{pmatrix}. \end{equation*}

By solving the first equation for \(x\) and the second for \(y\) we get that
\begin{equation*} \begin{cases} x &= (-y+4)/3\\ y &= (-2x+6)/4\\ \end{cases} \end{equation*}
The solution of the system is \(x=1, y=1\text{.}\) We can use this as the step of our recursive algorithm: choosing any point \((x_0,y_0)\text{,}\) we set
\begin{equation*} \begin{cases} x_{k+1} &= (-y_k+4)/3\\ y_{k+1} &= (-2x_k+6)/4\\ \end{cases} \; \hbox{(Jacobi method)} \end{equation*}
or
\begin{equation*} \begin{cases} x_{k+1} &= (-y_k+4)/3\\ y_{k+1} &= (-2x_{k+1}+6)/4\\ \end{cases} \; \hbox{(Gauss-Seidel method)} \end{equation*}
The code below solves the system with an elementary implementation of the Jacobi method:
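A minimal sketch (the simultaneous tuple assignment ensures both updates use the previous step's values, as the Jacobi method requires; the fixed iteration count is an arbitrary choice):

```python
# Elementary Jacobi iteration for the system 3x + y = 4, 2x + 4y = 6.
x, y = 0.0, 0.0  # any starting point works for this diagonally dominant system
for _ in range(100):
    # both right-hand sides use (x_k, y_k) from the previous step
    x, y = (-y + 4) / 3, (-2 * x + 6) / 4

print(x, y)  # converges to the solution x = 1, y = 1
```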
The code below solves the system with an elementary implementation of the Gauss-Seidel method:
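A minimal sketch (unlike Jacobi, the second update immediately re-uses the freshly computed \(x_{k+1}\)):

```python
# Elementary Gauss-Seidel iteration for the system 3x + y = 4, 2x + 4y = 6.
x, y = 0.0, 0.0  # arbitrary starting point
for _ in range(100):
    x = (-y + 4) / 3       # uses y_k from the previous step
    y = (-2 * x + 6) / 4   # uses the updated x_{k+1}

print(x, y)  # converges to the solution x = 1, y = 1
```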
SOR method. Both methods above can be slightly modified by replacing each update with a linear combination of it and the previous iterate, with coefficients \(\omega\) and \(\alpha=1-\omega\) (so that the two coefficients sum to 1), as shown in the example below (see also the next section).

This technique is called the successive over-relaxation (SOR) method and can be much faster than both the Jacobi and Gauss-Seidel methods for an appropriate choice of \(\omega\text{.}\) In general the method converges for \(0<\omega<2\) and the optimal choice of \(\omega\) is given by
\begin{equation*} \omega_{opt} = 1+\left(\frac{\mu}{1+\sqrt{1-\mu^2}}\right)^2 \end{equation*}
where \(\mu\) is the largest absolute value of the eigenvalues of the matrix \({\Bbb I}_n-Q^{-1}A\text{.}\)

For instance, in the case above, taking \(Q=diag(A)\text{,}\)
\begin{equation*} {\Bbb I}_n-Q^{-1}A = \begin{pmatrix}0&-1/3\\ -1/2&0\\ \end{pmatrix} \end{equation*}
whose eigenvalues are about \(\pm0.41\text{,}\) so that \(\omega_{opt}\simeq1.04\text{.}\)
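An elementary Python sketch of SOR on the same system, taking \(\omega=1.05\text{,}\) close to the optimal value computed above:

```python
# SOR iteration for 3x + y = 4, 2x + 4y = 6: each Gauss-Seidel update is
# blended with the previous value, new = (1 - w)*old + w*update.
w = 1.05  # relaxation parameter, near the optimal value for this system
x, y = 0.0, 0.0
for _ in range(100):
    x = (1 - w) * x + w * (-y + 4) / 3
    y = (1 - w) * y + w * (-2 * x + 6) / 4

print(x, y)  # converges to the solution x = 1, y = 1
```

Note that \(\omega=1\) recovers the plain Gauss-Seidel method.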
Conclusions. So, when should one use iterative methods and when the LU method? As a rule of thumb, for large systems (especially large sparse ones) iterative methods can be more efficient than direct methods such as the LU decomposition.