Author: Marc Aurèle Gilles
In this exercise, we derive the "most important numerical algorithm of our lifetime" (G. Strang): the Fast Fourier Transform (FFT). The FFT is a fast algorithm to compute the Discrete Fourier Transform (DFT). The DFT of a vector $x \in \mathbb{C}^n$ is the vector $y\in \mathbb{C}^n$ defined by:
$$ y_k = \sum_{j=0}^{n-1} x_j \exp(-i2\pi k j / n ) \ . $$(Here, vectors are 0-indexed). We restrict our attention to the case where the size $n$ is a power of $2$, although the ideas extend to the general case. The DFT is the result of the matrix-vector multiply $Fx$ with the matrix
$$ F = \begin{bmatrix} 1 & 1 & 1 & 1 & \cdots & 1 \\ 1 & \omega & \omega^2 & \omega^3 & \cdots & \omega^{n-1} \\ 1 & \omega^2 & \omega^4 & \omega^6 & \cdots & \omega^{2(n-1)} \\ 1 & \omega^3 & \omega^6 & \omega^9 & \cdots & \omega^{3(n-1)} \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega^{n-1} & \omega^{2(n-1)} & \omega^{3(n-1)} & \cdots & \omega^{(n-1)(n-1)} \\ \end{bmatrix} $$where $\omega = \exp(-i 2 \pi /n) $ (note that $\omega$ depends on $n$).
To illustrate the FFT, we take $n = 8$ (this is the presentation in Golub & Van Loan).
$$ F_8 = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & \omega & \omega^2 & \omega^3 & \omega^4 & \omega^5 & \omega^6 & \omega^7 \\ 1 & \omega^2 & \omega^4 & \omega^6 & 1 & \omega^2 & \omega^4 & \omega^6 \\ 1 & \omega^3 & \omega^6 & \omega & \omega^4 & \omega^7 & \omega^2 & \omega^5 \\ 1 & \omega^4 & 1 & \omega^4 & 1 & \omega^4 & 1 & \omega^4 \\ 1 & \omega^5 & \omega^2 & \omega^7 & \omega^4 & \omega & \omega^6 & \omega^3 \\ 1 & \omega^6 & \omega^4 & \omega^2 & 1 & \omega^6 & \omega^4 & \omega^2 \\ 1 & \omega^7 & \omega^6 & \omega^5 & \omega^4 & \omega^3 & \omega^2 & \omega \\ \end{bmatrix} $$where we have used the fact that $$\omega^{n+k} = \omega^{n}\omega^{k} = \exp(-i 2 \pi / n)^{n} \omega^k = \exp(-i 2 \pi n/n)\omega^k = \omega^k \ . $$
Next, we reorder the columns so that the even-indexed columns come first (recall that we index from 0, so the columns now appear in the order $[0,2,4,6,1,3,5,7]$). This is equivalent to multiplying $F_8$ on the right by the permutation matrix:
$$ P_8 = \begin{bmatrix} 1 & & & & & & & \\ & & & &1 & & & \\ & 1& & & & & & \\ & & & & &1 & & \\ & &1 & & & & & \\ & & & & & & 1 & \\ & & &1 & & & & \\ & & & & & & &1 \\ \end{bmatrix} $$Thus, the matrix $F_8P_8$ is:
$$F_8P_8 = \left[ \begin{array}{cccc|cccc} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & \omega^2 & \omega^4 & \omega^6 & \omega & \omega^3 & \omega^5 & \omega^7 \\ 1 & \omega^4 & 1 & \omega^4 & \omega^2 & \omega^6 & \omega^2 & \omega^6 \\ 1 & \omega^6 & \omega^4 & \omega^2 & \omega^3 & \omega & \omega^7 & \omega^5 \\ \hline 1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\ 1 & \omega^2 & \omega^4 & \omega^6 & -\omega & -\omega^3 & -\omega^5 & -\omega^7 \\ 1 & \omega^4 & 1 & \omega^4 & -\omega^2 & -\omega^6 & -\omega^2 & -\omega^6 \\ 1 & \omega^6 & \omega^4 & \omega^2 & -\omega^3 & -\omega & -\omega^7 & -\omega^5 \\ \end{array} \right] = \left[ \begin{array}{c|c} F_4 & D_4 F_4 \\ \hline F_4 & - D_4 F_4 \\ \end{array} \right] $$where $D_4$ is a diagonal matrix with entries $\text{diag}(D_4) = [1, \omega_8, \omega_8^2, \omega_8^3] $ (the subscript on $\omega_8$ is to emphasize that it is $\omega_8 = \exp(-i 2 \pi /8) $) and $F_4$ is the DFT matrix with $n = 4$, written below in terms of $\omega = \omega_8$ (note that $\omega_4 = \omega_8^2$):
$$ F_{4} = \left[ \begin{array}{cccc} 1 & 1 & 1 & 1 \\ 1 & \omega^2 & \omega^4 & \omega^6 \\ 1 & \omega^4 & 1 & \omega^4 \\ 1 & \omega^6 & \omega^4 & \omega^2 \\ \end{array} \right] $$That is, up to a re-ordering of columns and a scaling matrix $D_4$, $F_8$ is made out of four copies of $F_4$. The matrix can be further factorized as:
$$ F_8P_8 = \left[ \begin{array}{c|c} F_4 & D_4 F_4 \\ \hline F_4 & - D_4 F_4 \\ \end{array} \right] = \left[ \begin{array}{c|c} I & D_4 \\ \hline I & - D_4 \\ \end{array} \right] \begin{bmatrix} F_4 & 0 \\ 0 & F_4 \end{bmatrix} $$
It follows that:
$$F_8 x = F_8P_8 P_8^{-1} x = \left[ \begin{array}{c|c} F_4 & D_4 F_4 \\ \hline F_4 & - D_4 F_4 \\ \end{array} \right] P_8^{T} x = \left[ \begin{array}{c|c} F_4 & D_4 F_4 \\ \hline F_4 & - D_4 F_4 \\ \end{array} \right] \begin{bmatrix} x_{0:2:7} \\ x_{1:2:7} \end{bmatrix} $$where $x_{0:2:7} = [x_0, x_2, x_4, x_6]^T$, $ x_{1:2:7} = [x_1, x_3, x_5, x_7]^T$. Thus
$$ F_8 x= \left[ \begin{array}{c|c} F_4 & D_4 F_4 \\ \hline F_4 & - D_4 F_4 \\ \end{array} \right] \begin{bmatrix} x_{0:2:7} \\ x_{1:2:7} \end{bmatrix} = \left[ \begin{array}{c} (F_4 x_{0:2:7}) + D_4 ( F_4 x_{1:2:7} ) \\ (F_4 x_{0:2:7}) - D_4 ( F_4 x_{1:2:7} ) \end{array} \right] $$The argument generalizes to any even $n$, and thus we have:
$$ F_n P_n = \left[ \begin{array}{c|c} I & D_{n/2} \\ \hline I & - D_{n/2} \\ \end{array} \right] \begin{bmatrix} F_{n/2} & 0 \\ 0 & F_{n/2} \end{bmatrix} $$where the diagonal entries of $D_{n/2}$ are $[ 1, \omega_n, \omega_n^2, \omega_n^3, \dots, \omega_n^{n/2-1}]$. The idea of the FFT is to use this formula recursively to divide-and-conquer the computation.
At the top of the tree, we seek a DFT of size $n$, which we "divide-and-conquer" into two DFTs of size $n/2$; scaling and adding the resulting vectors takes roughly $8n$ operations. At the next level, where we compute two DFTs of size $n/2$, the scaling/adding costs $8(n/2)$ each, so $8 (n/2) \times 2 = 8n$ in total. Recursively, we see that we do $8n$ work at each level of the tree. After $d = \log_2(n)$ rounds of divide-and-conquer, we hit the base case $n = 1$, for which $F_1x = x$. Summing the work over all levels, we do approximately $8n d = 8n \log_2 n$ work.
Write a recursive algorithm for the DFT assuming $n = 2^d$.
import numpy as np
def my_fft(x):
n = x.size
## ADD YOUR CODE HERE
return x*0
## Code below is provided for you. No need to change it.
x = np.random.randn(2**10) + np.random.randn(2**10) * 1j
y = np.fft.fft(x)
# Should be close to machine epsilon. ~1e-14 is fine
print(np.linalg.norm(y - my_fft(x)) / np.linalg.norm(y))
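For reference, here is one possible way to write the recursion derived above (a sketch, assuming $n = 2^d$ and the numpy import above; the name reference_fft is chosen so it does not collide with the my_fft you implement):
def reference_fft(x):
    # Radix-2 recursion: F_n x = [F_{n/2} x_even + D_{n/2} (F_{n/2} x_odd);
    #                             F_{n/2} x_even - D_{n/2} (F_{n/2} x_odd)]
    n = x.size
    if n == 1:                                         # base case: F_1 x = x
        return x.copy()
    even = reference_fft(x[0::2])                      # DFT of x_{0:2:n-1}
    odd = reference_fft(x[1::2])                       # DFT of x_{1:2:n-1}
    d = np.exp(-2j * np.pi * np.arange(n // 2) / n)    # diag(D_{n/2})
    return np.concatenate([even + d * odd, even - d * odd])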
The Chebyshev polynomials are defined from the three-term recurrence relation:
$$T_0(x) =1, \quad T_1(x) = x, \quad T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x) $$The Chebyshev polynomials are the numerical analyst's favorite polynomials because of their many wonderful properties. For example, it can be shown that they have an astonishing closed form: $T_n(x)= \cos(n\arccos(x))$. Using this property, it is straightforward to see that the roots of the Chebyshev polynomial $T_n(x)$ are:
$$ x_i = \cos \left( \frac{2i -1 }{2n}\pi \right), \quad i = 1, \dots n $$These points are known as the Chebyshev nodes (of the first kind).
Using the three-term recurrence, implement an algorithm to evaluate the sum $p_n(x) = \sum_{k=0}^{n} c_k T_k(x) $ in $\mathcal{O}(n)$ operations. This method is called Clenshaw's algorithm and is applicable to any family of polynomials that satisfies a three-term recurrence.
import numpy as np
def clenshaw(c, x):
    # Computes \sum_{i=0}^{n} c_i T_i(x)
# where c is a (n+1) sized vector and T_i(x) is the i-th Chebyshev polynomial
# x is a scalar
n = c.size - 1
sum =0
## ADD YOUR CODE HERE
return sum
## Code below is provided for you. No need to change it.
# Test
c = np.random.randn(10)
x = 0.1
# Compare to built in method
print('error :', np.abs(np.polynomial.chebyshev.chebval(x, c) - clenshaw(c, x)))
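One possible realization of Clenshaw's recurrence (a sketch, named clenshaw_sketch to keep it separate from the clenshaw you implement above): run the backward recurrence $b_{n+1} = b_{n+2} = 0$, $b_k = c_k + 2x\,b_{k+1} - b_{k+2}$ for $k = n, \dots, 1$, and return $c_0 + x\,b_1 - b_2$.
def clenshaw_sketch(c, x):
    n = c.size - 1
    b_kp1, b_kp2 = 0.0, 0.0              # b_{k+1} and b_{k+2}
    for k in range(n, 0, -1):            # k = n, n-1, ..., 1
        b_k = c[k] + 2 * x * b_kp1 - b_kp2
        b_kp2, b_kp1 = b_kp1, b_k
    return c[0] + x * b_kp1 - b_kp2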
Write a function that computes the Chebyshev expansion coefficients of the Chebyshev interpolant of a function $f(x)$. That is, given an input vector $b\in \mathbb{R}^{n+1}$ with $b_i = f(x_i)$, where $x_i = \cos(\frac{2i -1 }{2(n+1)}\pi)$ for $i = 1, \dots, n+1$, compute the output vector $c \in \mathbb{R}^{n+1} $ so that the polynomial $p_n(x) = \sum_{i=0}^n c_i T_i(x) $ satisfies $p_n(x_i) = f(x_i)$.
Write this problem as the solution of a linear system, similar to the Vandermonde matrix approach discussed in class, and use the three-term recurrence to fill in the entries of the matrix. There is an $\mathcal{O}(n \log n)$ algorithm to perform this computation using the FFT, but we won't worry about it; an $\mathcal{O}(n^3)$ algorithm is fine for this exercise.
def fit_chebyshev_pol(f):
# f is a vector of size (n+1)
    # returns the (n+1) coefficients of the degree-n interpolant in the Chebyshev basis
n = f.size-1
## ADD YOUR CODE HERE
return f*0
## Code below is provided for you. No need to change it.
# A helper function to get the n roots of T_n(x)
def get_cheb_nodes(n):
return np.cos((2 * np.arange(1, n+1) - 1)/( 2* n) * np.pi )
n = 10
f_fn = lambda x : np.cos(3*np.pi*x) * np.exp(x)
x = get_cheb_nodes(n)
f = f_fn(x)
c = fit_chebyshev_pol(f)
# Evaluate the interpolant. Should get the values back.
print(np.linalg.norm(np.polynomial.chebyshev.chebval(x, c) - f) )
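One possible sketch of the linear-system approach (it reuses the helper get_cheb_nodes defined above; the name fit_chebyshev_pol_sketch and the dense numpy solve are illustrative choices):
def fit_chebyshev_pol_sketch(f):
    # Build V with V[i, k] = T_k(x_i) via the three-term recurrence, then solve V c = f.
    n = f.size - 1
    x = get_cheb_nodes(n + 1)               # the n+1 interpolation nodes
    V = np.zeros((n + 1, n + 1))
    V[:, 0] = 1.0                           # T_0(x) = 1
    if n >= 1:
        V[:, 1] = x                         # T_1(x) = x
    for k in range(2, n + 1):               # T_k = 2 x T_{k-1} - T_{k-2}
        V[:, k] = 2 * x * V[:, k - 1] - V[:, k - 2]
    return np.linalg.solve(V, f)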
In class, we defined the Newton-Cotes quadrature as the integral of the equispaced interpolant. Similarly, Clenshaw-Curtis quadrature is the integral of the Chebyshev interpolant.
$$ \int_{-1}^1 f(x) dx \approx \int_{-1}^1 p_n(x) dx = \sum_{i=0}^n w_i f(x_i) $$Using a similar idea as in the previous part, together with the following identity:
$$ \int_{-1}^1 T_n(x)\, \mathrm{d}x = \begin{cases} \frac{(-1)^n + 1}{1 - n^2} & \text{ if }~ n \ne 1 \\ 0 & \text{ if }~ n = 1, \end{cases} $$write a function that computes the quadrature weights. That is, find a vector $w\in \mathbb{R}^{n+1}$ so that
$$ \int_{-1}^1 p_n(x) dx = \sum_{i=0}^n w_i f(x_i) = w^T b $$where the $x_i$ are the $n+1$ Chebyshev nodes (the roots of $T_{n+1}$).
Hint: first write $\int_{-1}^1 p_n(x) dx = c^T v$, where the $c_i$ are the expansion coefficients and $v$ is some fixed vector, then relate $c$ to $b$ to infer $w$ such that $c^T v = w^T b$.
def get_Clenshaw_Curtis_weights(n):
    # returns a vector of (n+1) weights of the n-th order Clenshaw-Curtis quadrature:
# ADD YOUR CODE HERE
return np.zeros(n+1)
## Code below is provided for you. No need to change it.
n = 4
w = get_Clenshaw_Curtis_weights(n)
cc_w = np.array([0.16778123, 0.5255521 , 0.61333333, 0.5255521 , 0.16778123])
print('error (should be ~1e-8):', np.linalg.norm( w - cc_w))
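Following the hint, one possible sketch (illustrative name; it reuses get_cheb_nodes from above): with $V_{ik} = T_k(x_i)$ as in the interpolation step, $c = V^{-1}b$ and $\int_{-1}^1 p_n(x)dx = c^Tv$ with $v_k = \int_{-1}^1 T_k(x)dx$, so $w$ solves $V^T w = v$.
def get_Clenshaw_Curtis_weights_sketch(n):
    x = get_cheb_nodes(n + 1)
    V = np.zeros((n + 1, n + 1))            # V[i, k] = T_k(x_i)
    V[:, 0] = 1.0
    if n >= 1:
        V[:, 1] = x
    for k in range(2, n + 1):
        V[:, k] = 2 * x * V[:, k - 1] - V[:, k - 2]
    kk = np.arange(n + 1)
    v = np.zeros(n + 1)
    mask = kk != 1
    v[mask] = ((-1.0) ** kk[mask] + 1) / (1 - kk[mask] ** 2)   # integrals of T_k
    return np.linalg.solve(V.T, v)          # w = V^{-T} v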
Use your quadrature rule to estimate the integral of the smooth function:
$$ \int_{-1}^1 \cos(3 \pi x) \exp(x) dx $$Make a plot displaying the spectral convergence of the method. The analytical value of the integral is given below so that you may compute the error.
import matplotlib.pyplot as plt
f_fn = lambda x : np.cos(3*np.pi*x) * np.exp(x)
# The analytical value of the integral
analytical_integral = (1 - np.exp(2) )/( np.exp(1) + 9 * np.exp(1) * np.pi**2)
nns = np.arange(1,50)
errors = np.zeros(nns.size)
for n_idx,n in enumerate(nns):
# ADD YOUR CODE HERE
errors[n_idx]
plt.semilogy(nns,errors)
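One way to fill in the loop above (a sketch; it reuses get_Clenshaw_Curtis_weights and get_cheb_nodes from the previous cells, and stores its result under the illustrative name errors_sketch):
errors_sketch = np.zeros(nns.size)
for n_idx, n in enumerate(nns):
    w = get_Clenshaw_Curtis_weights(n)
    nodes = get_cheb_nodes(n + 1)
    errors_sketch[n_idx] = np.abs(w @ f_fn(nodes) - analytical_integral)
plt.semilogy(nns, errors_sketch)
plt.xlabel('n')
plt.ylabel('quadrature error')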
In this exercise, we approximate solutions by finite difference of elliptic PDEs of the form:
$$ - \frac{d^2}{dx^2} u(x,y) - \frac{d^2}{dy^2}u(x,y) + c(x,y)u(x,y) = f(x,y)\\ \qquad u(x,0) = u(x,1) =0\\ \qquad u(0,y) = u(1,y) = 0 $$where $c(x,y) \geq 0$. Last HW, we solved Poisson's equation on the square with homogeneous Dirichlet conditions:
$$ - \frac{d^2}{dx^2} u(x,y) - \frac{d^2}{dy^2}u(x,y) = f(x,y)\\ \qquad u(x,0) = u(x,1) =0\\ \qquad u(0,y) = u(1,y) = 0 $$That is the equation above with $c(x,y) = 0$. We saw that the Poisson PDE may be discretized as:
$$ TU + UT = F $$where $T$ is the tridiagonal second-order finite difference matrix, and $F_{i,j} = h^2 f(x_i, y_j)$ with $x_i = i h $, $y_j = jh$, and $h = \frac{1}{n+1}$. To get a discretization of the system above, only a small modification is needed: we must also discretize the extra operator $u \rightarrow c u $. Since $c(x,y)$ and $u(x,y)$ are discretized on a grid and $(c u) (x_i,y_j ) = c(x_i,y_j)u(x_i, y_j)$, this operator is represented as:
$$ C \cdot U $$where $\cdot$ is a pointwise multiplication between the two matrices (also called the Hadamard product), and C is the matrix $C_{i,j} = h^2 c(x_i, y_j)$.
Thus, the discretized system is now:
$$ TU + UT + C\cdot U = F $$We were able to solve Poisson's equation in only $\mathcal{O}(n^2 \log n)$ operations by observing that it is a Sylvester equation, applying the Bartels-Stewart algorithm, and using our knowledge of the diagonalization of $T$. This new matrix equation is not a Sylvester equation, so those tricks do not apply here.
We are left with our knowledge of the Kronecker formalism from HW3. Since $C\cdot U$ is nothing but pointwise multiplication, it can be equivalently written in the vectorized form $ \text{vec}(C) \cdot \text{vec}(U)$, or as $D \text{vec}(U)$ where $D = \text{diag}(\text{vec}(C))$ is the $\mathbb{R}^{n^2 \times n^2}$ diagonal matrix with $\text{vec}(C)$ on its diagonal.
Thus,
$$ TU + UT + C \cdot U = F \Leftrightarrow \left( (I \otimes T) + (T \otimes I ) + D \right) \text{vec}(U) = \text{vec}(F) $$One way to solve this is to form this $A = \left( (I \otimes T) + (T \otimes I ) + D \right)$ matrix and do a dense linear solve. This would take $\mathcal{O}(n^6)$ operations which is prohibitive in most cases. However, we have seen that we can take matrix-vector products with matrices like $A$ fast, so this linear system is an ideal candidate for Krylov subspace methods. Since $T$ is SPD, $A$ is SPD as well, therefore we can apply the conjugate gradient (CG) method.
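The equivalence is easy to check numerically on a small example (a standalone sketch with ad-hoc names; here vec means column-stacking, i.e. X.T.reshape(-1), matching the helper defined further below):
import numpy as np
rng = np.random.default_rng(0)
m = 4                                            # small size, just for the check
T_check = rng.standard_normal((m, m))
T_check = T_check + T_check.T                    # symmetric, like T
U_check = rng.standard_normal((m, m))
C_check = rng.standard_normal((m, m))
lhs = T_check @ U_check + U_check @ T_check + C_check * U_check
A_check = (np.kron(np.eye(m), T_check) + np.kron(T_check, np.eye(m))
           + np.diag(C_check.T.reshape(-1)))
print(np.linalg.norm(lhs.T.reshape(-1) - A_check @ U_check.T.reshape(-1)))   # ~1e-14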
Implement a function that computes the matrix-vector product $Ax$, acting on the vectorized unknown, in $\mathcal{O}(n^2)$ operations. Note that $A$ is of size $n^2 \times n^2$.
Using the built-in CG iteration, run CG for 20 iterations, asking it to stop if the relative residual drops below a tolerance of $10^{-12}$. You can run CG using the following command: xk, info = scipy.sparse.linalg.cg(A_op, b, maxiter=maxiter, tol=tol). Then compute the residual of xk. What do you observe? How close is it to machine epsilon?
Note that we take $n = 1000$, so $A$ has one million rows and columns: there is no way we could do a dense solve (a back-of-the-envelope calculation estimates it would take two years of compute time!).
import scipy
import scipy.sparse
import matplotlib.pyplot as plt
import scipy.sparse.linalg
import numpy as np
def diff_eq_mat_vec(C, x):
    # Implement Ax in O(n^2), where x is an n^2 vector and
# A is the matricized version of the operator
# X -> TX + XT + C dot X
n = C.shape[0]
# ADD YOUR CODE HERE
    return x*0
# THERE IS CODE TO ADD BELOW!
# Helper functions for you to use
# Vectorization of matrix
def vec(X):
return X.T.reshape(-1)
## Inverse of vec function.
def unvec(x):
n = np.sqrt(x.size).astype(int)
return x.reshape(n,n).T
# returns the T matrix in sparse format.
# T@x is an O(nnz) operation, where nnz is the number of nonzeros.
# Since T has ~3n nonzeros, T@x is an O(n) operation.
def get_sparse_T(n):
    T = scipy.sparse.diags([np.ones(n-1), -2*np.ones(n), np.ones(n-1)], offsets=[-1, 0, 1])
return T
## A test for the MATVEC.
n = 3
C = np.linspace(1.2, 1.5, n**2).reshape(n,n)
x = np.linspace(1, 4, n**2)
xk = diff_eq_mat_vec(C,x)
precomp_xk = np.array([ 0.7, 1.5546875, -0.25625, 0.8796875, 3.375 ,
0.9546875, -3.10625 , 0.2796875, -3.5 ])
print('error in mat_vec: (should be ~1e-15)', np.linalg.norm(xk -precomp_xk ))
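One possible sketch of the fast matvec (it reuses get_sparse_T, vec, and unvec from above, and assumes the same sign convention for T as get_sparse_T; the name diff_eq_mat_vec_sketch is illustrative, so it does not clobber the graded function):
def diff_eq_mat_vec_sketch(C, x):
    n = C.shape[0]
    T = get_sparse_T(n)                    # sparse tridiagonal, so T @ X costs O(n^2)
    X = unvec(x)                           # reshape the n^2 vector into an n x n matrix
    Y = T @ X + (T @ X.T).T + C * X        # TX + XT + C.*X (using that T is symmetric)
    return vec(Y)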
# Defining the coefficients of a PDE
c_fun = lambda x,y : np.sin(3*np.pi * x * y) * y
f_fun = lambda x,y : np.exp( -( x**2 + y**2)) * ( np.cos(10*np.pi * y) + np.sin(4*x)**2)
# Discretization size
n = 1000 # If your computer struggles, try n = 300; if not, try n = 3000!
# Generate the grid
xgrid = np.linspace(0, 1, n)
xx, yy = np.meshgrid(xgrid, xgrid)
#Evaluate c_fun on grid.
C = c_fun(xx,yy) * (1/(n+1)**2)
F = f_fun(xx,yy) * (1/(n+1)**2)
b = vec(F)
# To run CG, we need to pass scipy a black box A_matvec(x) that implements A*x.
# Below is the required syntax. You can look up the documentation,
# but it isn't important; nothing very interesting is happening.
A_matvec = lambda x : diff_eq_mat_vec(C, x)
A_op = scipy.sparse.linalg.LinearOperator((n**2,n**2), A_matvec)
# Then A_op is passed to CG
## ADD YOUR CODE HERE -
## CALL CG AND COMPUTE THE RESIDUAL OF APPROXIMATE SOLUTION
xk,info = scipy.sparse.linalg.cg()
# Compute the relative residual |A*x - b| / |b|
res = np.inf
# Should be pretty bad! (about ~= 6)
print('residual of linear system: ', res)
# Plots approximate solution
plt.imshow(unvec(xk))
plt.title('computed approximate solution of PDE')
plt.colorbar()
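For reference, the missing call could look like the following sketch (maxiter and tol follow the values stated in the exercise; on recent scipy versions the keyword tol has been renamed rtol):
xk, info = scipy.sparse.linalg.cg(A_op, b, maxiter=20, tol=1e-12)   # 20 CG iterations
res = np.linalg.norm(A_matvec(xk) - b) / np.linalg.norm(b)          # relative residual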
The slow convergence behavior above is typical when solving linear systems obtained by discretizing differential equations. Recall the CG convergence bound
$$ \frac{\|e_k\|_A}{\|e_0\|_A} \leq 2 \left( \frac{ \sqrt{\kappa(\mathbf{A})}-1 }{ \sqrt{\kappa(\mathbf{A})}+1 } \right)^k \approx 2 \left( 1 - \frac{2}{\sqrt{\kappa(\mathbf{A})}} \right)^k \, $$where $\kappa(A) = K_2(A)$ is the condition number of $A$; thus, when $K_2(A)$ is large, the convergence may be very slow. Also recall that when $A$ is SPD: $$K_2(A) = \|A\|_2 \|A^{-1}\|_2 = \frac{ \sigma_{max}(A)}{\sigma_{min}(A)} = \frac{ \lambda_{max}(A)}{\lambda_{min}(A)} $$
Finite-difference discretizations of second-order differential operators (such as the ones here) have condition numbers that grow proportionally to $n^2$: $K_2(A) = \mathcal{O}(n^2)$, so we expect very poor convergence for even moderately large $n$.
We are missing one last ingredient in our PDE solver: a preconditioner. Preconditioning is the idea of replacing a linear system $Ax=b$ with a different, but equivalent, linear system $M^{-1}Ax = M^{-1}b$. The matrix $M$ is called the preconditioner. A good preconditioner should have two properties: (1) the preconditioned matrix $M^{-1}A$ should be much better conditioned than $A$, and (2) linear systems with $M$ should be cheap to solve.
Preconditioners are highly problem specific. They are often the special sauce that, together with a Krylov method, makes a modern numerical method blazingly fast. As such, they are the basis for much research interest.
There is one more detail we have ignored so far: while $M^{-1}A$ may be well conditioned, it is not SPD even if $A$ and $M$ are SPD, so it does not make sense to apply CG directly to $M^{-1}A$, since CG only applies to SPD matrices. Instead, one could run CG on $L^{-1}A(L^{-1})^T$, where $M = LL^T$ is a Cholesky decomposition of $M$ (assuming it is SPD). Luckily, the entire CG iteration applied to $L^{-1}A(L^{-1})^T$ may be rewritten as an iteration that only involves $M^{-1} = (L^{-1})^{T}L^{-1} $ and $A$, so the Cholesky decomposition is unnecessary. If you are interested, you may read the details in these notes. For this exercise, it is enough to know that two black boxes suffice to run the CG iteration: one to compute $Ax$ and one to compute $M^{-1}x$. Furthermore, there exists a similarity transformation, $M^{-1} A = (L^{-1})^T \left( L^{-1}A(L^{-1})^T \right) L^T $, between $L^{-1}A(L^{-1})^T$ and $M^{-1}A$, so they have the same eigenvalues.
Often, the trick to finding a good preconditioner is to find a "nearby" but easier problem. In our case, we can pick $M$ to be the discretization of Poisson's equation: $M = T \otimes I + I \otimes T$. In this case, we have that $A = M + D$, where $D=\text{diag}(\text{vec}(C))$, and one can show that $L^{-1}A(L^{-1})^T$ is well conditioned.
Proving this fact is your next task. Since we know the exact eigenvalue decomposition of $T$ (see the previous HW), it is easy to bound the eigenvalues of $M = T \otimes I + I \otimes T$:
$$ \lambda_{\text{min}}(M) \approx \frac{2 \pi^2}{(n+1)^2}, \quad \lambda_{\text{max}}(M) \approx 8 $$Using these estimates, show that $L^{-1}A(L^{-1})^T$ is well conditioned; specifically, show that:
$$ K_2( L^{-1}A(L^{-1})^T ) \leq 1 + \frac{1}{2\pi^2}\max_{x,y}{c(x,y)} \ .$$(You may treat the $\approx$ in $\lambda_{\text{min}}(M)$ as an equality.) Note that the bound is independent of $n$ and small if $c(x,y)$ doesn't get too large, so we expect CG to converge very quickly.
Hint: You may want to use the following characterization of the eigenvalues of symmetric matrices based on the Rayleigh quotient:
$$ \lambda_{max}(B) = \max_{x \in \mathbb{R}^n} \frac{x^TBx}{x^Tx} \ , \\ \lambda_{min}(B) = \min_{x \in \mathbb{R}^n} \frac{x^TBx}{x^Tx} \ , $$also recall that $c(x,y) \geq 0$, $C_{i,j} = h^2 c(x_i,y_j) $, and that the eigenvalues of the shifted matrix $A + I$ are $1 + $ the eigenvalues of the matrix $A$.
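One possible route, sketched (it follows the hint; the details are left for you to verify): since $A = M + D$ and $M = LL^T$,
$$ L^{-1}A(L^{-1})^T = I + L^{-1}D(L^{-1})^T \ , $$so the smallest eigenvalue is at least $1$, because $L^{-1}D(L^{-1})^T$ is positive semidefinite ($c \geq 0$). For the largest eigenvalue, the Rayleigh quotient with the substitution $x = L^T y$ gives
$$ \lambda_{max}\left(L^{-1}D(L^{-1})^T\right) = \max_{x} \frac{x^TL^{-1}D(L^{-1})^Tx}{x^Tx} = \max_{y} \frac{y^TDy}{y^TMy} \leq \frac{\lambda_{max}(D)}{\lambda_{min}(M)} \leq \frac{h^2\max_{x,y}c(x,y)}{2\pi^2 h^2} = \frac{\max_{x,y}c(x,y)}{2\pi^2} \ , $$and taking the ratio of the largest and smallest eigenvalue estimates gives the claimed bound on $K_2$.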
Implement your preconditioner in my_poisson_preconditioner (based on HW4). Then pass it as an input to CG using the M option: scipy.sparse.linalg.cg(A_op, b, maxiter=20, tol=1e-12, M=M_op). Compute the residual once more. What do you observe?
def my_poisson_preconditioner(b):
## ADD YOUR CODE HERE
return b*0
## Helper functions you may want to use
# A helper function you can use if you want
def dst(X,axis =0):
return scipy.fft.dst(X, type = 1, norm = "ortho", axis =axis)
# Computes eigenvalues of the Tridiagonal matrix from known analytical formula
def get_eigenvalues_of_T(n):
return 4 * (np.sin(np.arange(1,n+1) * np.pi / (2*( n + 1)))**2)
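# A sketch of how these helpers could be combined into the preconditioner
# (illustrative name my_poisson_preconditioner_sketch; it assumes the SPD sign
# convention for T whose eigenvalues are returned by get_eigenvalues_of_T;
# if your diff_eq_mat_vec uses T with the opposite sign, negate lam):
def my_poisson_preconditioner_sketch(b):
    m = np.sqrt(b.size).astype(int)
    lam = get_eigenvalues_of_T(m)                   # eigenvalues of T
    Bhat = dst(dst(unvec(b), axis=0), axis=1)       # transform to the eigenbasis of T
    Zhat = Bhat / (lam[:, None] + lam[None, :])     # divide by eigenvalues of T(x)I + I(x)T
    return vec(dst(dst(Zhat, axis=0), axis=1))      # transform back and re-vectorize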
# Object that scipy expects.
M_op = scipy.sparse.linalg.LinearOperator((n**2,n**2), my_poisson_preconditioner)
## ADD YOUR CODE HERE - CALL CG AND THEN COMPUTE THE RESIDUAL
# CG iteration
xk,info = scipy.sparse.linalg.cg()
#Compute residual - should be close to prescribed tolerance
residual = np.inf
print('residual: ', residual)
# Plots approximate solution
plt.imshow(unvec(xk))
plt.title('approximate solution of PDE')
plt.colorbar()
Much better convergence! Note that our first approximate solution looks nothing like the correct one.
There is a lot more to be said about numerical solvers for PDEs such as handling more complicated domains, handling coupled PDEs and dealing with nonlinearities, but we have now seen the entire lifespan of a typical linear PDE solver:
In class, we proved the convergence of Krylov subspace methods through a connection with a polynomial approximation problem. This link between numerical linear algebra and approximation theory is pervasive, and we see another example here with a different iterative solver: the Alternating Direction Implicit (ADI) method. ADI is an iterative solver for Sylvester equations:
$$ AX - XB = F $$It is defined as the following iteration:
$X^{(0)} = 0$
for $j=0,1,2,3,\dots$
Solve for $X^{(j + 1/2)}$, where $\left( A - \beta_{j +1} I\right) X^{(j+1/2)} = X^{(j)}\left( B - \beta_{j + 1} I \right) + F$
Solve for $ X^{(j + 1)}$, where $ X^{(j+1)}\left( B - \alpha_{j + 1} I \right) = \left( A - \alpha_{j+1} I\right) X^{(j+1/2)} - F$
The scalars $\alpha_j$ and $\beta_j$ are hyperparameters that must be chosen a priori.
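A minimal dense sketch of this iteration (illustrative names; numpy solves stand in for whatever solver is appropriate), given here as a reference for the derivation below:
import numpy as np
def adi_sketch(A, B, F, alphas, betas):
    n, m = A.shape[0], B.shape[0]
    X = np.zeros((n, m))                                    # X^{(0)} = 0
    I_A, I_B = np.eye(n), np.eye(m)
    for alpha, beta in zip(alphas, betas):
        # (A - beta I) X_half = X (B - beta I) + F
        X_half = np.linalg.solve(A - beta * I_A, X @ (B - beta * I_B) + F)
        # X_next (B - alpha I) = (A - alpha I) X_half - F, solved via its transpose
        X = np.linalg.solve((B - alpha * I_B).T, ((A - alpha * I_A) @ X_half - F).T).T
    return X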
Show that after $K$ iterations of ADI, the error $E^{(K)} = X - X^{(K)}$ satisfies $E^{(K)} = r_K(A)\, X\, r_K(B)^{-1}$, where $X$ is the solution to the Sylvester equation.
Hint: Note that $F = AX - XB + (- \beta_{j+1} X + \beta_{j+1} X) = (A - \beta_{j +1} I)X - X\left( B - \beta_{j + 1} I \right) $.
Let $E^{(j)} = X - X^{(j)}$. First show that
$$E^{(j + 1/2)} = \left( A - \beta_{j +1} I\right)^{-1} E^{(j)}\left( B - \beta_{j + 1} I \right) \ , $$then derive a similar relation between $E^{(j + 1)}$ and $E^{(j + 1/2)}$ and combine them to show the identity.
where $r_K(x)$ is the rational function (a rational function is a ratio of polynomials): $$r_K(x) = \prod_{j=1}^K \frac{x - \alpha _{j}}{x - \beta _{j}} \ . $$ The function applied to the matrix is defined as: $r_K(M) = \prod_{j=1}^{K}{ {(M-\alpha _{j}I)(M-\beta _{j}I)^{-1}} }$
where $a = \lambda_{\text{min}}(A) $, $b = \lambda_{\text{max}}(A) $, $c = \lambda_{\text{min}}(-B) $, $d = \lambda_{\text{max}}(-B) $
Thus, one should pick the shifts $ \alpha_j$ and $\beta_j$ such that the right-hand side is as small as possible. In other words, $r_K(x)$ must solve the following rational function approximation problem:
The approximation problem is known as Zolotarev's third problem. A solution is known in the case that $c = a$ and $d = b$, which corresponds to the case when $B = -A$. The optimal $\alpha_j,\beta_j$ are not easy to write down, but for our purposes, it is sufficient to know that they exist, and that
$$\gamma(K, a, b,a,b) =4 \left[\exp\left( \frac{\pi^2}{2 \log (4b/a)} \right) \right]^{-2K} \ .$$Use this result to prove that the error in ADI satisfies $\|E^{(K)}\| = \mathcal{O}(\rho^K)$ for some $\rho <1$ (that is, it converges at least linearly in K) when $B = -A$, $A$ is SPD, and the optimal shifts are used.
Extra credit: