Solving eigenvalue problems numerically

Numerical Analysis

Urbain Vaes

NYU Paris

Objective

Goal of this chapter

Study numerical methods to find the eigenpairs \((\mathbf{\boldsymbol{v}}_i, \lambda_i)\in \mathbf C^n \times \mathbf C\) satisfying \[ \mathsf A \mathbf{\boldsymbol{v}}_i= \lambda_i \mathbf{\boldsymbol{v}}_i, \] where \(\mathsf A \in \mathbf C^{n \times n}\).

\(~\)

  • Simplifying assumption throughout this chapter: the matrix \(\mathsf A\) is diagonalizable.
  • Notation: the eigenvalues are ordered by decreasing modulus, \(|\lambda_1| \geqslant\dotsc \geqslant|\lambda_n|\), with normalized eigenvectors \(\mathbf{\boldsymbol{v}}_1, \dotsc, \mathbf{\boldsymbol{v}}_n\) (see the numerical check after this list). \[ \mathsf A \mathsf V = \mathsf V \mathsf D, \qquad \mathsf V = \begin{pmatrix} \mathbf{\boldsymbol{v}}_1 & \dots & \mathbf{\boldsymbol{v}}_n \end{pmatrix}, \qquad \mathsf D = {\rm diag}(\lambda_1, \dotsc, \lambda_n). \]
  • Finding all the eigenvalues of a matrix is expensive, even with the best methods:

    using LinearAlgebra
    A = rand(2000, 2000)
    @time eigen(A)
    6.694395 seconds

    \({\color{green} \rightarrow}\) We first focus on methods for finding just one or a few eigenpairs.
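A sanity check of this notation with the built-in solver (a minimal sketch; note that eigen sorts the eigenvalues in ascending order, the reverse of the modulus ordering used here):

    using LinearAlgebra
    A = [2.0 1.0; 1.0 2.0]
    λ, V = eigen(A)          # eigenvalues and matrix of eigenvectors
    A * V ≈ V * Diagonal(λ)  # true: precisely the relation A V = V D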

Program of today

  • Simple vector iterations

    calculate just one eigenpair

  • Simultaneous iteration

    calculate multiple eigenpairs at the same time

  • Application: PageRank

    find the invariant distribution of a Markov chain

Remark: can we calculate the roots of the characteristic polynomial?

The cost of computing the characteristic polynomial via cofactor (Leibniz) expansion of \(\det(\mathsf A - \lambda \mathsf I)\) scales as \({\color{red}n!}\)

\({\color{green} \rightarrow}\) not a viable approach…
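To get a sense of the scale: the Leibniz formula for \(\det(\mathsf A - \lambda \mathsf I)\) sums over all \(n!\) permutations of the matrix indices, which is astronomical already for moderate \(n\):

    factorial(big(20))  # 2432902008176640000 terms: ≈ 77 years at 10^9 terms per second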

The power iteration

Power iteration

The power iteration is a method for finding \(\color{blue}(\lambda_1, v_1)\): \[ \fcolorbox{blue}{lightgreen}{$ \mathbf{\boldsymbol{x}}_{k+1} = \frac{\mathsf A \mathbf{\boldsymbol{x}}_{k}}{{\lVert {\mathsf A \mathbf{\boldsymbol{x}}_{k}} \rVert}} $} \] Given \(\mathbf{\boldsymbol{x}}_k \approx \mathbf{\boldsymbol{v}}_1\), the eigenvalue is calculated from the Rayleigh quotient: \[ \fcolorbox{blue}{lightgreen}{$ \lambda_{1} \approx \frac{\mathbf{\boldsymbol{x}}_{k}^* \mathsf A \mathbf{\boldsymbol{x}}_k}{\mathbf{\boldsymbol{x}}_k^* \mathbf{\boldsymbol{x}}_k} $} \]

  • Normalization is necessary to avoid floating-point overflow to Inf (when \(|\lambda_1| > 1\)) or underflow to zero (when \(|\lambda_1| < 1\)).
A = [2 1; 1 2]
x = rand(2)
for i in 1:20
    x = A*x
    x /= sqrt(x'x)  # normalize to unit norm
end
println("Eigenvector: $x, eigenvalue: $(x'A*x)")
Eigenvector: [0.7071067812838212, 0.7071067810892738], eigenvalue: 3.0
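For this small example we can cross-check the result against the built-in solver (with LinearAlgebra loaded as before; a sanity check only, not part of the method):

eigen(A).values  # [1.0, 3.0]: the dominant eigenvalue is indeed 3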

Convergence of the power iteration

Notation: let \(\angle\) denote the acute angle between \(\mathbf{\boldsymbol{x}}\) and \(\mathbf{\boldsymbol{y}}\): \[ \angle(\mathbf{\boldsymbol{x}}, \mathbf{\boldsymbol{y}}) = \arccos\left( \frac{{\lvert {\mathbf{\boldsymbol{x}}^* \mathbf{\boldsymbol{y}}} \rvert}}{\sqrt{\mathbf{\boldsymbol{x}}^* \mathbf{\boldsymbol{x}}} \sqrt{\mathbf{\boldsymbol{y}}^* \mathbf{\boldsymbol{y}}}}\right). \]
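In code, this angle may be computed as follows; angle_between is a helper name introduced here, and the min(·, 1) guards against rounding errors pushing the argument of arccos slightly above 1:

using LinearAlgebra
# Acute angle between x and y, as defined above
angle_between(x, y) = acos(min(abs(dot(x, y)) / (norm(x) * norm(y)), 1))
angle_between([1, 0], [1, 1])  # π/4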

Since the eigenvectors of \(\mathsf A\) span \(\mathbf C^n\), any vector \(\mathbf{\boldsymbol{x}}_0\) may be decomposed as \[ \mathbf{\boldsymbol{x}}_0 = \alpha_1 \mathbf{\boldsymbol{v}}_1 + \dotsb + \alpha_n \mathbf{\boldsymbol{v}}_n. \]

Proposition: convergence of the power iteration started from \(\mathbf{\boldsymbol{x}}_0\)

Suppose that \({\lvert {\lambda_1} \rvert} > {\lvert {\lambda_2} \rvert}\) and \(\alpha_1 \neq 0\). Then \((\mathbf{\boldsymbol{x}}_k)_{k \geqslant 0}\) satisfies \[ \lim_{k \to \infty} \angle(\mathbf{\boldsymbol{x}}_k, \mathbf{\boldsymbol{v}}_1) = 0. \] The sequence \((\mathbf{\boldsymbol{x}}_k)\) is said to converge essentially to \(\mathbf{\boldsymbol{v}}_1\). \(~\)

Proof

By construction \[ \mathbf{\boldsymbol{x}}_k = \frac{\lambda_1^k \alpha_1 \mathbf{\boldsymbol{v}}_1 + \dotsb + \lambda_n^k \alpha_n \mathbf{\boldsymbol{v}}_n}{{\lVert {\lambda_1^k \alpha_1 \mathbf{\boldsymbol{v}}_1 + \dotsb + \lambda_n^k \alpha_n \mathbf{\boldsymbol{v}}_n} \rVert}} = \frac{\lambda_1^k \alpha_1}{|\lambda_1^k \alpha_1|} \frac{\mathbf{\boldsymbol{v}}_1 + \frac{\lambda_2^k \alpha_2}{\lambda_1^k \alpha_1} \mathbf{\boldsymbol{v}}_2 + \dotsb + \frac{\lambda_n^k \alpha_n}{\lambda_1^k \alpha_1} \mathbf{\boldsymbol{v}}_n }{\left\lVert\mathbf{\boldsymbol{v}}_1 + \frac{\lambda_2^k \alpha_2}{\lambda_1^k \alpha_1} \mathbf{\boldsymbol{v}}_2 + \dotsb + \frac{\lambda_n^k \alpha_n}{\lambda_1^k \alpha_1} \mathbf{\boldsymbol{v}}_n\right\rVert}. \] Since \(|\lambda_j/\lambda_1| \leqslant |\lambda_2/\lambda_1| < 1\) for all \(j \geqslant 2\), the correction terms vanish as \(k \to \infty\). Therefore \[ \mathbf{\boldsymbol{x}}_k \left(\frac{|\lambda_1^k \alpha_1|}{\lambda_1^k \alpha_1}\right) \xrightarrow[k \to \infty]{} \mathbf{\boldsymbol{v}}_1. \] (Notice that the bracketed factor on the left-hand side is a scalar of modulus 1.)

The dominant error term scales as \(\color{red} |\lambda_2/\lambda_1|^k\).
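A quick numerical illustration of this rate, reusing the angle_between helper defined above: for A = [2 1; 1 2] we have \(|\lambda_2/\lambda_1| = 1/3\), so the angle to \(\mathbf{\boldsymbol{v}}_1\) should shrink by a factor of about 3 per iteration.

A = [2 1; 1 2]
v1 = [1, 1] / sqrt(2)  # dominant eigenvector, with eigenvalue 3
x = [1.0, 0.0]
for k in 1:5
    x = A*x / norm(A*x)
    println(angle_between(x, v1))  # decays roughly like (1/3)^k
end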

Inverse and Rayleigh quotient iteration

Suppose that \(\mu \notin \sigma(\mathsf A)\) and \(|\lambda_J - \mu| < |\lambda_K - \mu| \leqslant|\lambda_j - \mu|\) for all \(j \notin \{J, K\}\), i.e. \(\lambda_J\) is the unique eigenvalue closest to \(\mu\) and \(\lambda_K\) is the second closest.

Inverse iteration: to find \(\lambda_J\), the eigenvalue that is nearest \(\mu\)

Let \(\mathsf B = (\mathsf A - \mu \mathsf I)^{-1}\). The inverse iteration around \(\mu \in \mathbf C\) is given by \[ \fcolorbox{blue}{lightgreen}{$ \mathbf{\boldsymbol{x}}_{k+1} = \frac{\mathsf B \mathbf{\boldsymbol{x}}_{k}}{{\lVert {\mathsf B \mathbf{\boldsymbol{x}}_{k}} \rVert}}. $} \] Given \(\mathbf{\boldsymbol{x}}_k \approx \mathbf{\boldsymbol{v}}_J\), the eigenvalue is calculated from the Rayleigh quotient: \[ \fcolorbox{blue}{lightgreen}{$ \lambda_{J} \approx \frac{\mathbf{\boldsymbol{x}}_{k}^* \mathsf A \mathbf{\boldsymbol{x}}_k}{\mathbf{\boldsymbol{x}}_k^* \mathbf{\boldsymbol{x}}_k} $} \]

  • The dominant eigenvalue of \(\mathsf B\) is given by \((\lambda_J - \mu)^{-1}\).

  • The associated eigenvector is \(\mathbf{\boldsymbol{v}}_J\).

  • Use the \ operator instead of computing the inverse explicitly (see also the efficiency note after this code):

    A = [2 1; 1 2]
    x = rand(2)
    μ = 0
    B = A - μ*I
    for i in 1:10
        x = B\x
        x /= sqrt(x'x)  # normalize to unit norm
    end
    println("Eigenvector: $x, eigenvalue: $(x'A*x)")
    Eigenvector: [-0.7070650980198242, 0.7071484618962388], eigenvalue: 1.0000000069495358
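A remark on efficiency: each B\x above solves a linear system from scratch. Since \(\mu\) is fixed here, one can instead factorize \(\mathsf A - \mu \mathsf I\) once and reuse the factorization in every iteration; a minimal sketch:

    using LinearAlgebra
    A = [2 1; 1 2]
    x = rand(2)
    μ = 0
    F = lu(A - μ*I)      # one O(n³) factorization up front
    for i in 1:10
        x = F\x          # each solve is then only O(n²)
        x /= sqrt(x'x)   # normalize to unit norm
    end
    println("Eigenvector: $x, eigenvalue: $(x'A*x)")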

Can we adapt \(\mu\) during the iteration to accelerate convergence?

  • Error of the inverse iteration scales as \(\color{red} \left|\frac{\lambda_J - \mu} {\lambda_K - \mu}\right|^k\).

    \({\color{green} \rightarrow}\) suggests letting \[ \fcolorbox{blue}{lightgreen}{$ \mu_k = \frac{\mathbf{\boldsymbol{x}}_k^* \mathsf A \mathbf{\boldsymbol{x}}_k}{\mathbf{\boldsymbol{x}}_k^* \mathbf{\boldsymbol{x}}_k}. $} \]

    This leads to the Rayleigh quotient iteration.

  • If convergence occurs, the eigenvalue estimates \(\lambda^{(k)} := \mu_k\) converge cubically (for Hermitian matrices, as in the example below; quadratically in general):

    \[ |\lambda^{(k+1)} - \lambda_J| \leqslant C |\lambda^{(k)} - \lambda_J|^3. \]

    \({\color{green} \rightarrow}\) Number of significant digits is roughly tripled at each iteration.

A = [2 1; 1 2]
setprecision(500)
x = BigFloat.([1.2; 0.9])
μ = 0
for i in 1:6
    x = (A - μ*I)\x
    x /= sqrt(x'x)  # normalize to unit norm
    μ = x'A*x
    println(μ)
end
2.689655172413793214338566074331922284628319727818488081860106227371262853189736760002654726539617400119969038165484095334893321873413803868251681795864
2.9876835222760986149331492272467355588185696109799282648595173771965397075426195063225678761434065353075142377089820027363840168362403721691362285304077
2.9999995241744513496497052218657690803207923292634754813048708145726507175594474194389978335452546331521437999768268695077710762632721314300139734193988
2.9999999999999999999730670707803381925912042248826047017208189162200256216460941091027939474791840614585910098581290345505474157692691833334644060848503
2.9999999999999999999999999999999999999999999999999999999999951158299301653110614230820165226784762374074665384291286744975203198864866424226623297764997
3.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000012
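To make the tripling explicit, one can print the number of correct digits, roughly \(-\log_{10} |\mu_k - 3|\), instead of \(\mu_k\) itself; a variant of the loop above:

setprecision(500)
x = BigFloat.([1.2; 0.9])
μ = BigFloat(0)
for i in 1:6
    x = (A - μ*I)\x
    x /= sqrt(x'x)
    μ = x'A*x
    # roughly triples each step, until the 500-bit precision saturates
    println("Correct digits ≈ ", floor(Int, -log10(abs(μ - 3))))
end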