MAT 321 - PS4

Author: Marc Aurèle Gilles

Q1 Diagonalization of circulant matrices.

In these problems sets, we have used repeatedly that circulant matrices are diagonalized by the DFT to:

That is, we used the fact that:

$$C = F^{-1} \Lambda F $$

where $F$ is the DFT matrix, and $\text{diag}(\Lambda) = Fc$. The goal of is this exercise is to prove this identity.

Note that this is an eigendecomposition of $C$, where $V = F^{-1}$ is the matrix of eigenvectors. Recall that the DFT matrix is

$$ F = \frac{1}{\sqrt{n}} \begin{bmatrix} 1 & 1 & 1 & 1 & \cdots & 1 \\ 1 & \omega & \omega^2 & \omega^3 & \cdots & \omega^{n-1} \\ 1 & \omega^2 & \omega^4 & \omega^6 & \cdots & \omega^{2(n-1)} \\ 1 & \omega^3 & \omega^6 & \omega^9 & \cdots & \omega^{2(n-1)} \\ \vdots & \vdots & \vdots & \vdots & \ddots & \ddots \\ 1 & \omega^{n-1} & \omega^{2(n-1)} & \omega^{3(n-1)} & \cdots & \omega^{(n-1)(n-1)} \\ \end{bmatrix} $$

where $\omega = \exp(- \imath 2 \pi / n) $, $\imath^2 = -1$ (note that $\omega$ depends on $n$). Also recall that $F$ (with this normalization) is unitary, that is $F^{-1} = F^* = \bar{F}^T$, and $F$ is symmetric, so $F^{-1} = \bar{F}$.

We prove this identity in 3 steps:

  1. Let $D_n \in \mathbb{R}^{n \times n}$ be the circulant downward shift matrix. For $n = 5$, this matrix is:
$$ D_5 = \begin{bmatrix} & & & & 1 \\ 1 & & & & \\ & 1& & & \\ & & 1& & \\ & & & 1& \\ \end{bmatrix} $$

.

Show that $D_n {F}_{i+1} = \gamma_{i+1} {F}_{i+1} $ where ${F}_{i+1}$ is the $i+1$ column of ${F}$:

$$ F_{i+1} = \frac{1}{\sqrt{n}} \begin{bmatrix} 1 \\ {\omega}^{i}\\ {\omega}^{2i}\\ {\omega}^{3i}\\ \vdots \\ {\omega}^{(n-1)i}\\ \end{bmatrix} $$

and $\gamma_{i+1} = \bar{\omega}^i$. Conclude that $D_n F = F \bar{\Gamma} $. where $\bar{\Gamma}$ is a diagonal matrix with diagonal entries $\gamma_i$. This last equation implies $D_n = F \bar{\Gamma} F^{-1}$, and taking the complex conjugate of this equation, we get the desired: $D_n = F^{-1} \Gamma F$.

  1. Write any circulant matrix as $(n-1)$ degree polynomial in $D_n$. That is, find the coefficients $\alpha_k$ so that:
$$C = \sum_{k=0}^{n-1} \alpha_k D_n^k $$

Recall that $D_n^0$ is defined as the identity.

  1. Use the previous two parts to show $C = F^{-1} \Lambda F $ for some diagonal matrix $\Lambda$. Some inspection reveals that $\text{diag}(\Lambda) = Fc$, but you do not need to prove this.

Add your answer to Q1 here

Q2 The Bartels-Stewart algorithm

In this problem, we derive an algorithm for solving a special kind of matrix equation: Sylvester equations. A Sylvester equation is of the form: $$ AX - XB = F $$ where $A,X,B,F \in \mathbb{R}^{n \times n}$. Using the Kronecker formalism from HW3, we can solve this matrix equation in $\mathcal{O}(n^6)$, but we can do better: the Bartels-Stewart algorithm is an algorithm that solves this equation in $\mathcal{O}(n^3)$. The Bartels-Stewart algorithm is based on one clever observation: if $A$ and $B$ are diagonal matrices, then this equation is easy to solve. Say $A = \text{diag}(d)$, $B = \text{diag}(p)$, then the $(i,j)$-th entry of the Sylvester equation reads: $$ d_i x_{i,j} - x_{i,j} p_j = f_{i,j} $$ i.e., the system is uncoupled and $x_{i,j}$ can be easily found as: $$ x_{i,j} = \frac{f_{i,j}}{d_i - p_j} \ . $$ From this equation, it is clear that there is a unique solution if and only if $d_i - p_j \neq 0 \leftrightarrow d_i \neq p_j$ for all $i,j$.

To solve a general Sylvester equation, Bartels-Stewart reduces a general Sylvester equation to the diagonal case using an eigenvalue decomposition.

2.1

Assume $A$ and $B$ are diagonalizable, and their eigenvalues decompositions are

$$ A = U \Lambda U^{-1} \qquad B = V\Sigma V^{-1} $$

Show that the matrices $\hat{X} = U^{-1} X V $, $\hat{F} =U^{-1} F V $ satisfy the following transformed Sylvester equation:

$$ \Lambda \hat{X}- \hat{X} \Sigma = \hat{F} $$

2.2

Use the previous part to show that if $A$ and $(-B)$ are symmetric positive definite matrices, then the Sylvester equation has a unique solution.

2.3

Use 1.1 to design and implement an $\mathcal{O}(n^3)$ solver for a general Sylvester equation. You may assume that the matrices are diagonalizable but not SPD.

You may use np.linalg.eig() which computes an eigenvalue decomposition of $A$ in $\mathcal{O}(n^3)$.

Reminder: the number one sin of Numerical Linear Algebra is to form the inverse of a matrix. In particular, these two commands should be avoided like the plague: np.linalg.inv(A) or np.linalg.solve(A,I) (they do the same computation). Instead, you should use: np.linalg.solve(A,F) to compute $X = A^{-1} F$ and np.linalg.solve(A.T,F.T).T to compute $X = F A^{-1} = ({A^{-1}}^T {F}^T)^T$

Add your answer to parts 1 and 2 here

Q3 Fast PDE solver: Poisson's equation

In HW2 we derived a numerical scheme to solve Poisson's equation in 1D:

$$ - \frac{d^2}{dx^2} u(x) = f(x) \qquad u(0) = u(1) = 0 $$

The discretized system was of the form:

$$ T u = h^2 f $$

where:

$$ T = \begin{bmatrix} 2 & -1 & & & & \\ -1 & 2 & -1 & & & \\ & -1 & 2 & -1 & & \\ & & \ddots & \ddots & \ddots & \\ & & & -1 & 2 & -1 \\ & & & & -1 & 2 \\ \end{bmatrix}, f = \begin{bmatrix} f(h) \\ f(2h) \\ \vdots \\ f(n h) \end{bmatrix} $$

T is tridiagonal, so we now know how to solve this system in $O(n)$ operations. All variations of the 1D problem are equally easy to solve.

The 2D case is much more interesting. Poisson's equation on the square with homogeneous Dirichlet conditions is:

$$ - \frac{d^2}{dx^2} u(x,y) - \frac{d^2}{dy^2}u(x,y) = f(x,y)\\ \qquad u(x,0) = u(x,1) =0\\ \qquad u(0,y) = u(1,y) = 0 $$

Suppose we discretize the equation above on a $n \times n$ uniform grid $(x_i, y_j)$, with $x_i = i h $, $y_j = jh$, with $h = \frac{1}{n+1}$. We write $F_{i,j} = h^2 f(x_i, y_j)$, that is entry $(i,j)$ of the matrix $F \in \mathbb{R}^{n \times n}$ is the function $f(x,y)$ evaluated at $(x_i, y_j)$ times $h^2$. We seek approximate value $U_{i,j} \approx u(x_i, y_j)$ through finite differences once again.

We can think of the operator $ u \rightarrow - \frac{d^2}{dx^2}u$ and boundary conditions $ u(x,0) = u(x,1) = 0 $ as being applied independently to the columns of $U$, thus they are discretized by the matrix $ U \rightarrow TU$. Similarly, we can think of the operator $ u \rightarrow - \frac{d^2}{dy^2}u$ and boundary conditions $ u(0,y) = u(1,y) = 0 $ as being applied independently to the rows of $U$, thus they are discretized by the matrix $ U \rightarrow U T^T = UT$.

Hence, the operator $ u \rightarrow - \frac{d^2}{dy^2}u - \frac{d^2}{dx^2}u$ and boundary conditions is discretized as $U \rightarrow TU + UT$, and we arrive at the matrix equation:

$$ UT + TU = F$$

Thus, the 2D Poisson finite-difference approximation may be computed by solving this matrix linear system for $U$.

Using the Kronecker formalism from the last HW, we could solve this in $O(n^6)$ operation: not great. Bartles-Stewart is much better, with $O(n^3)$ complexity, but we are not out of tricks just yet!

A wonderful property of the second-order finite difference matrix is that we can compute its eigenvalue decomposition analytically: $$T = S^{-1} \Lambda S$$ where $S$ is the Discrete Sine Transform (DST) matrix defined componentwise by $S_{kj} = \frac{1}{\sqrt{2(n+1)}}\sin(\frac{kj\pi}{n+1})$, and eigenvalues are $\lambda_j = 4 \sin^2 ( \frac{j \pi}{2(n+1)})$. The DST is the imaginary part of the complex Discrete Fourier Matrix (DFT) $S = \text{imag}(F)$. Thanks to this connection, one can derive a $\mathcal{O}(n \log n)$ algorithm to compute $Sx$ based on the FFT, which can be obtained by scipy.fft.dst(x, type = 1, norm = "ortho"). The discrete Sine matrix has another nice property: it is both orthogonal and symmetric, thus $S = S^T = S^{-1}$ (when properly normalized, which is what the norm = "ortho" does).

Your task is to write a solver for Poisson's equation in $\mathcal{O}(n^2 \log n )$ using this property of the second-order finite difference matrix $T$ and the Bartels-Stewart algorithm.

(A programming note: if you want to compute $SX$ (where X is a matrix), you should use scipy.fft.dst(X, type = 1, norm = "ortho", axis =1). If you want to compute $XS^T$, you should compute scipy.fft.dst(X, type = 1, norm = "ortho", axis =0). You can also use the dst helper function below if you get tired of writing this long statement).

4. Crafting the jewel

We spent two weeks deriving one of the jewels of numerical linear algebra: the shifted-QR algorithm. Time to pull all the pieces together. When applied to real symmetric matrices, the algorithm looks as follows.

$Q_0 A^{(0)} Q_0 ^T = A \qquad $ (1) (Compute a reduction to tridiagonal by Householder Reflection)

$M \leftarrow m$

for $k = 1,2, \dots$

$\mu_k = A_{M,M}^{(k-1)} \qquad $ (Compute Rayleigh quotient)

$Q^{(k)}, R^{(k)} \leftarrow A^{(k-1)} - \mu_k I \qquad $ (2) (Compute QR decomposition of the shifted matrix)

$A^{(k)} \leftarrow R^{(k)} Q^{(k)} + \mu_k I \qquad $ (3) (Recombine factors)

if $|A_{M,M-1}^{(k)} | < \text{tol}$: $\qquad $ (4) (If converged to eigenvalue) >

Add $A_{M,M}^{(k)}$ to list of eigenvalues

$A^{(k)} \leftarrow A^{(k)}_{[1:M-1, 1:M-1]} \qquad $ (Deflate: ignore last row and column)

$M \leftarrow M-1$

if $ M = 1: $ (The trivial 1x1 matrix case: the entry is the eigenvalue) Add $A_{1,1}^{(k)}$ to list of eigenvalues

terminate

end while

Step (1) was implemented in precept 7,

Step (2) - (3), could be done in $\mathcal{O}(m)$, but we will ignore this specific problem for now, and you can use np.linalg.qr instead, which performs that operation in $\mathcal{O}(m^3)$.

Step (4) is the deflation step. We check if we have converged to an eigenvalue.

As a result of the shifting we recover the $O(\epsilon^3)$ converge of the Rayleigh quotient iteration. Thus, we expect to need only a handful of iterations per eigenvalues (3-4).

Your task is to implement the shifted-QR algorithm above and keep track of how many QR iterations you are doing. Make sure that you only need $< 4 n$ QR iterations to compute all eigenvalues.

Extra credit: