
Section 5.2 The Power method

The most common situation in applications is the determination of the eigenvalues of a symmetric positive-definite matrix, so from now on we assume we are dealing with such a matrix.

The power method is a very simple iterative method that numerically evaluates the eigenvalue of largest absolute value of a matrix \(A\) (and hence also its smallest one, since the eigenvalue of smallest absolute value of \(A\) is the reciprocal of the eigenvalue of largest absolute value of its inverse \(A^{-1}\)). We call this eigenvalue the dominant eigenvalue of \(A\text{.}\)
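For instance, one step of the power method applied to \(A^{-1}\) can be performed without ever forming the inverse explicitly, by solving a linear system instead. Here is a minimal sketch, assuming NumPy (the function name is our own):

```python
import numpy as np

def inverse_power_step(A, v):
    # One step of the power method applied to A^{-1}:
    # instead of computing A^{-1} explicitly, solve A z = v.
    z = np.linalg.solve(A, v)
    return z / np.linalg.norm(z)
```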

Denote by \(\{e_1,\dots,e_n\}\) a basis of eigenvectors of \(A\) (such a basis exists because \(A\) is symmetric), so that \(Ae_i=\lambda_ie_i\text{,}\) and take a generic vector \(v=v_1e_1+\dots+v_ne_n\) (generic means that no component \(v_i\) is zero).

Then
\begin{equation*} \begin{aligned} A(v)=&\;A(v_1e_1+\dots+v_ne_n) \\[1em] =&\;v_1 A(e_1)+ \dots + v_n A(e_n) \\[1em] =&\;v_1\lambda_1 e_1 + \dots + v_n\lambda_n e_n. \end{aligned} \end{equation*}
Hence, iterating, for every \(k\geq 1\) we get
\begin{equation*} A^k(v) = v_1\lambda^k_1e_1+\dots+v_n\lambda^k_ne_n. \end{equation*}

The algorithm. Label the eigenvalues in decreasing order with respect to their modulus, namely assume that
\begin{equation*} |\lambda_1|>|\lambda_2|>\dots>|\lambda_n|. \end{equation*}
Then, as \(k\to\infty\text{,}\) all terms \(\lambda^k_i\) with \(i>1\) become negligible with respect to the term with \(\lambda^k_1\) and so
\begin{equation*} A^k(v)\simeq v_1\lambda^k_1e_1. \end{equation*}
Notice that this approximation gets better and better with higher and higher powers of \(A\text{.}\)
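As a quick numerical illustration of this convergence (a minimal sketch assuming NumPy; the matrix and starting vector are our own choices), the normalized vector \(A^kv/\|A^kv\|\) rapidly aligns with the dominant eigenvector:

```python
import numpy as np

# Symmetric matrix with eigenvalues 3 and 1; the dominant eigenvector
# is e_1 = (1, 1)/sqrt(2).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
v = np.array([7.0, -2.0])   # generic: its components along both
                            # eigenvectors are nonzero

for k in (1, 5, 10, 20):
    w = np.linalg.matrix_power(A, k) @ v
    print(k, w / np.linalg.norm(w))   # approaches (0.7071, 0.7071)
```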

We exploit this fact in the following iterative algorithm:
  1. Choose a random vector \(v_0\text{,}\) set \(v=v_0/\|v_0\|\) and fix some threshold \(\epsilon\text{;}\)
  2. \(\hbox{repeat until }\|v-v_{old}\|<\epsilon\text{:}\)
    • \(\displaystyle v_{old} = v\)
    • \(\displaystyle z = Av\)
    • \(\displaystyle v = z/\|z\|\)
  3. end
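A direct translation of this loop into Python might look as follows (a minimal sketch assuming NumPy; the function name, tolerance and iteration cap are our own choices):

```python
import numpy as np

def power_method(A, v0, eps=1e-10, max_iter=1000):
    """Approximate the dominant eigenvector of A by power iteration."""
    v = v0 / np.linalg.norm(v0)
    for _ in range(max_iter):
        v_old = v
        z = A @ v
        v = z / np.linalg.norm(z)
        # Stop once v stabilizes. This criterion assumes lambda_1 > 0,
        # which holds for positive-definite matrices; otherwise v may
        # keep flipping sign.
        if np.linalg.norm(v - v_old) < eps:
            break
    return v
```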

We claim that, at the end of the loop,
\begin{equation*} \frac{v^TAv}{v^Tv} \end{equation*}
approximates the eigenvalue of largest modulus \(\lambda_1\) and \(v\) approximates one of its eigenvectors.

Indeed, notice that
\begin{equation*} a^Tb=\begin{pmatrix}a_1&\dots&a_n\end{pmatrix}\begin{pmatrix}b_1\\\vdots\\b_n\end{pmatrix} =a_1b_1+\dots+a_nb_n=\langle a,b\rangle \end{equation*}
so that, for large \(k\) (assuming the eigenvectors \(e_i\) are normalized so that \(\|e_i\|=1\)),
\begin{equation*} (A^kv_0)^T A^{k+1}v_0 = \langle A^kv_0,A^{k+1}v_0\rangle \simeq \langle \lambda^k_1v_1e_1,\lambda^{k+1}_1v_1e_1\rangle=\lambda_1^{2k+1}v_1^2 \end{equation*}
and
\begin{equation*} (A^kv_0)^T A^{k}v_0 = \langle A^kv_0,A^{k}v_0\rangle \simeq \langle \lambda^k_1v_1e_1,\lambda^k_1v_1e_1\rangle=\lambda_1^{2k}v_1^2, \end{equation*}
so that
\begin{equation*} \frac{(A^kv_0)^T A^{k+1}v_0}{(A^kv_0)^T A^{k}v_0} \simeq \lambda_1. \end{equation*}
Since after \(k\) iterations the loop produces exactly \(v = A^kv_0/\|A^kv_0\|\text{,}\) this ratio is precisely the quantity \(\frac{v^TAv}{v^Tv}\) above.
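This ratio can be checked numerically on the unnormalized iterates \(A^kv_0\) (a minimal sketch assuming NumPy, reusing the symmetric matrix from the earlier snippet):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # dominant eigenvalue lambda_1 = 3
v0 = np.array([7.0, -2.0])

w = v0
for k in range(1, 8):
    w = A @ w                         # w = A^k v0
    ratio = (w @ (A @ w)) / (w @ w)   # (A^k v0)^T A^{k+1} v0 / (A^k v0)^T A^k v0
    print(k, ratio)                   # tends to lambda_1 = 3
```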
In essence, this is exactly the method used by Google's PageRank algorithm to rank web pages: the ranking vector is the dominant eigenvector of a huge matrix encoding the links between web pages, and it is computed by power iteration!

Finally, notice that the last value of \(v\) computed provides an approximation of an eigenvector of \(A\text{.}\)

An implementation in Python of the Power method. The code below evaluates the largest modulus eigenvalue of the matrix
\begin{equation*} \begin{pmatrix} 1&2\\ 3&4 \end{pmatrix}. \end{equation*}
We use the vector \(\begin{pmatrix}7\\-2\end{pmatrix}\) as the starting point of the algorithm. (Although this matrix is not symmetric, the power method still applies here, since its eigenvalues \((5\pm\sqrt{33})/2\) have distinct absolute values.)
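A minimal sketch of such an implementation, assuming NumPy (the tolerance \(10^{-10}\) is our own choice):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([7.0, -2.0])
v = v / np.linalg.norm(v)

eps = 1e-10
v_old = v + 1.0                  # dummy value to enter the loop
while np.linalg.norm(v - v_old) > eps:
    v_old = v
    z = A @ v
    v = z / np.linalg.norm(z)

lam = (v @ (A @ v)) / (v @ v)    # Rayleigh quotient v^T A v / v^T v
print("dominant eigenvalue:", lam)   # approx (5 + sqrt(33))/2 = 5.3723...
print("eigenvector:", v)
```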