MAT321: Precept 4¶

Problem 1 : Weighted Least Squares¶

Recall $W\in\mathbb{R}^{m\times m}$ is symmetric positive definite (SPD) if $W=W^\top$ and $x^\top W x>0$ for all $x\neq 0$.

If $W$ is SPD, it defines the weighted norm $\|y\|_{W}:=\sqrt{y^\top W y}$. Moreover, $W$ admits a unique SPD square root $W^{1/2}$ with $W=(W^{1/2})^2$.

For $A\in\mathbb{R}^{m\times n}$ and $b\in\mathbb{R}^m$, consider the weighted least-squares problem:

$$ \min_{x\in\mathbb{R}^n}\ \|Ax-b\|_{W}^{2}. $$

Show that the normal equations are

$$ A^\top W A\,x \;=\; A^\top W b, $$

and hence, when $A$ has full column rank,

$$ x^\star \;=\; (A^\top W A)^{-1}A^\top W b. $$
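As a quick numerical sanity check on this formula (a minimal sketch, not part of the problem): the cell below draws a random SPD weight $W$, solves the weighted normal equations directly, and compares against the ordinary least-squares problem for $W^{1/2}A$ and $W^{1/2}b$. The dimensions and the construction $W = MM^\top + mI$ are arbitrary choices.

In [ ]:
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# A random SPD weight matrix (M M^T is PSD; adding a multiple of I makes it PD)
M = rng.standard_normal((m, m))
W = M @ M.T + m * np.eye(m)

# Solve the weighted normal equations A^T W A x = A^T W b
x = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)

# Cross-check: the SPD square root W^{1/2} reduces the problem to
# ordinary least squares with W^{1/2} A and W^{1/2} b
w, V = np.linalg.eigh(W)
W_half = V @ np.diag(np.sqrt(w)) @ V.T
x_ls, *_ = np.linalg.lstsq(W_half @ A, W_half @ b, rcond=None)

print(np.allclose(x, x_ls))  # should print True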

Problem 2 : Householder reflector¶

Let $v\in\mathbb{R}^{m}\setminus\{0\}$ and define

$$ H \;:=\; I \;-\; 2\,\frac{v v^{\top}}{v^{\top}v}. $$

Show that $H$ is symmetric, orthogonal, and involutory ($H^2 = I$).
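These identities are easy to check numerically before proving them. A minimal sketch (the dimension $m = 6$ and the random $v$ are arbitrary choices):

In [ ]:
import numpy as np

rng = np.random.default_rng(0)
m = 6
v = rng.standard_normal(m)

# H = I - 2 v v^T / (v^T v)
H = np.eye(m) - 2.0 * np.outer(v, v) / (v @ v)

print(np.allclose(H, H.T))              # symmetric: H = H^T
print(np.allclose(H.T @ H, np.eye(m)))  # orthogonal: H^T H = I
print(np.allclose(H @ H, np.eye(m)))    # involutory: H^2 = I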

Problem 3 : Block Householder¶

Let $X,Y\in\mathbb{R}^{m\times k}$ have orthonormal columns ($X^\top X=Y^\top Y=I_k$). Set $D:=X-Y\in\mathbb{R}^{m\times k}$. Consider block Householder reflectors

$$ H \;=\; I-2UU^\top,\qquad U\in\mathbb{R}^{m\times r},\ \ U^\top U=I_r, $$

and the requirement $HX=Y$.

(A) Necessary condition on $U$.¶

Show that if $HX=Y$, then $\operatorname{range}(D)\subseteq \operatorname{range}(U)$.

(B) Checks.¶

Verify that $H$ is symmetric and orthogonal.
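As with Problem 2, these can be checked numerically. A minimal sketch with an arbitrary orthonormal $U$:

In [ ]:
import numpy as np

rng = np.random.default_rng(0)
m, r = 10, 3
# Random U with orthonormal columns
U, _ = np.linalg.qr(rng.standard_normal((m, r)))
H = np.eye(m) - 2.0 * U @ U.T

print(np.allclose(H, H.T))              # symmetric
print(np.allclose(H.T @ H, np.eye(m)))  # orthogonal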

(C) QR connection¶

Take a thin QR factorization of $D$, where $r\le k$ is the rank of $D$:

$$ D \;=\; Q R,\qquad Q\in\mathbb{R}^{m\times r},\ Q^\top Q=I_r,\ R\in\mathbb{R}^{r\times k}, $$

and define $H:=I-2QQ^\top$.

Assume $Q^\top(X+Y)=0$ (this is equivalent to $X^\top Y$ being symmetric). Show that $HX=Y$.
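A numerical sanity check for part (C), as a minimal sketch under an assumed construction: we take $Y = (I - 2UU^\top)X$ for a random $U$, which makes $X^\top Y$ symmetric automatically. Since $D = X - Y$ then has rank $r < k$, the sketch extracts an orthonormal basis $Q$ of $\operatorname{range}(D)$ by truncating an SVD rather than by a literal thin QR.

In [ ]:
import numpy as np

rng = np.random.default_rng(0)
m, k, r = 10, 4, 2

# X with orthonormal columns, and Y = (I - 2 U U^T) X for a random orthonormal U,
# which guarantees Y^T Y = I and X^T Y symmetric
X, _ = np.linalg.qr(rng.standard_normal((m, k)))
U, _ = np.linalg.qr(rng.standard_normal((m, r)))
Y = X - 2.0 * U @ (U.T @ X)

D = X - Y
# Orthonormal basis of range(D) via a rank-revealing SVD
# (D has rank r < k, so a literal thin QR factor would contain spurious columns)
Uq, s, _ = np.linalg.svd(D, full_matrices=False)
Q = Uq[:, s > 1e-10 * s[0]]

print(np.allclose(Q.T @ (X + Y), 0))  # the hypothesis Q^T (X + Y) = 0
H = np.eye(m) - 2.0 * Q @ Q.T
print(np.allclose(H @ X, Y))          # the conclusion H X = Y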

Programming experiment¶

In this precept we perform experiments similar to those in the paper "Experiments on Gram-Schmidt Orthogonalization" by J. R. Rice, Math. Comp. (1966). Consider the following excerpt from the text:

[Excerpt from Rice (1966) not reproduced here.]

Task 1: Implement an economy QR factorization by classical Gram-Schmidt (Alg. 7.1 in Trefethen) and by modified Gram-Schmidt (Alg. 8.1 in Trefethen) in the functions below. A reference sketch of both algorithms is given after the implementation cell, in case you get stuck.

Test your code on the following examples (set up in the code cells below):

  • A matrix $A$ with random normal entries, of size $300 \times 200$
  • A matrix $A$ with condition number $10^6$, of size $300 \times 200$
  • An artificial example from J.R. Rice's paper (don't worry about MGS-Pivot). [Figure from Rice (1966) defining the artificial example; see the construction in Test 3 below.]

We will test the stability of both methods using two error metrics (both are implemented for you below):

  • error 1: the orthogonality error of the computed vectors, $$ \text{max\_err} = \max_{i,j} \left| \left( Q^\top Q - I \right)_{i,j} \right|, $$ whose off-diagonal entries are exactly the inner products $\langle q_i, q_j \rangle$
  • error 2: the relative error in how well $A$ is preserved by the projection onto the subspace spanned by the columns of $Q$:

$$ \text{err} = \frac{\|Q Q^\top A - A\|_F}{\|A\|_F}, $$ where $\|\cdot\|_F$ is the Frobenius norm, the square root of the sum of squares of the entries: for $A \in \mathbb{R}^{m\times n}$, $\|A\|_F = \sqrt{\sum_{i=1}^m \sum_{j=1}^n |a_{i,j}|^2}$. Note that, in exact arithmetic, if $A = QR$ with $Q\in \mathbb{R}^{m\times n}$ having orthonormal columns, then

$$ Q Q^\top A = Q Q^\top (QR) = Q (Q^\top Q) R = Q I_{n,n} R = QR = A, $$ where $I_{n,n}$ is the $n \times n$ identity matrix.

In [ ]:
import numpy as np

def classical_gram_schmidt(A):
    # Implement classical Gram-Schmidt (Trefethen Alg. 7.1):
    # project each ORIGINAL column of A against the previously computed q's
    return Q, R


def modified_gram_schmidt(A):
    # Implement modified Gram-Schmidt (Trefethen Alg. 8.1):
    # subtract each projection from the CURRENT working column as you go
    return Q, R


# These are done for you.
def error1(Q):
    n = Q.shape[1]
    err = np.max(np.abs(Q.T @ Q - np.eye(n)))
    return err

def error2(Q,A):
    err = np.linalg.norm(Q @Q.T@A - A, 'fro')
    return err / np.linalg.norm(A, 'fro')
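
A reference sketch of both algorithms follows, in case you get stuck; it mirrors Trefethen Algs. 7.1 and 8.1, and the functions carry a _sketch suffix only to avoid clobbering your implementations above. Try your own version first.

In [ ]:
def classical_gram_schmidt_sketch(A):
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        # Project the ORIGINAL column A[:, j] onto the previous q's
        R[:j, j] = Q[:, :j].T @ A[:, j]
        v = A[:, j] - Q[:, :j] @ R[:j, j]
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

def modified_gram_schmidt_sketch(A):
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    V = A.astype(float)  # working columns, updated in place
    for i in range(n):
        R[i, i] = np.linalg.norm(V[:, i])
        Q[:, i] = V[:, i] / R[i, i]
        # Subtract the q_i component from all REMAINING working columns
        R[i, i+1:] = Q[:, i] @ V[:, i+1:]
        V[:, i+1:] -= np.outer(Q[:, i], R[i, i+1:])
    return Q, R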
In [ ]:
## Test 1:

# Test with a random matrix
m,n = 300,200
A = np.random.randn(m,n)
print('condition number of A: ' + "{:1.1e}".format(np.linalg.cond(A)))

Qc, Rc = classical_gram_schmidt(A)
Qm, Rm = modified_gram_schmidt(A)
print('classical G-S error 1 (orthogonality): ', error1(Qc))
print('modified G-S error 1 (orthogonality): ', error1(Qm))

print('classical G-S error 2 (projection): ', error2(Qc,A))
print('modified G-S error 2 (projection): ', error2(Qm,A))
In [ ]:
## Test 2: Ill conditioned

# Generate a Matrix with condition number 10^6
m,n = 300,200

# Computes an economy SVD
U, _,Vt = np.linalg.svd(np.random.randn(m,n), full_matrices = False)
# Set singular values logarithmically spaced from 10^0 to 10^6
S = np.diag(10**(np.linspace(0,6,n)))
A = U@S@Vt
print('condition number of A: ' + "{:1.1e}".format(np.linalg.cond(A)))


Qc, Rc = classical_gram_schmidt(A)
Qm, Rm = modified_gram_schmidt(A)
print('classical G-S error 1 (orthogonality): ', error1(Qc))
print('modified G-S error 1 (orthogonality): ', error1(Qm))

print('classical G-S error 2 (projection): ', error2(Qc,A))
print('modified G-S error 2 (projection): ', error2(Qm,A))
In [ ]:
# Test 3: Artificial example from the paper (Warning: result should be very bad!)
# (n = 200 is still in scope from the previous cell)
A = np.eye(n,n)
v = (3/4)**(np.arange(n))
# Shift the identity columns right by one from column 3 on,
# then insert the geometric vector v as column 2
A[:,3:] = A[:,2:n-1]
A[:,2] = v
print('condition number of A: ' + "{:1.1e}".format(np.linalg.cond(A)))

Qc, Rc = classical_gram_schmidt(A)
Qm, Rm = modified_gram_schmidt(A)
print('classical G-S error 1 (orthogonality): ', error1(Qc))
print('modified G-S error 1 (orthogonality): ', error1(Qm))

print('classical G-S error 2 (projection): ', error2(Qc,A))
print('modified G-S error 2 (projection): ', error2(Qm,A))

Task 2: Implement a full QR by Householder reflections and test it on the artificial example¶

If you are running out of time, fall back on np.linalg.qr as set up in the last cell below; a reference sketch of the Householder routines also follows the next cell. What do you observe compared to Gram-Schmidt?

In [ ]:
# Time permitting
# Implement QR by Householder
def householder_triangularization(A):
    _,n = A.shape
    R = np.copy(A) # Will be reduced to upper triangular form in place
    V = [] # Store the Householder vectors v_k in this list (or some other structure if you like)
    # Implement Householder triangularization (Trefethen Alg. 10.1) here
    return V, R

def apply_Q_to_vector(V, x):
    # Computes b = Q @ x, where Q is defined implicitly by the Householder reflectors in V
    return b

def form_Q_matrix(V):
    m = V[0].size
    I = np.eye(m)
    Q = np.zeros((m,m))
    for k in range(m):
        Q[:,k] = apply_Q_to_vector(V,I[:,k])
    return Q

A = np.eye(n,n)
v = (3/4)**(np.arange(n))
A[:,3:] = A[:,2:n-1]
A[:,2] = v


V, R = householder_triangularization(A)
Qh = form_Q_matrix(V)

print('condition number of A: ' + "{:1.1e}".format(np.linalg.cond(A)))
print('Householder error 1 (orthogonality): ', error1(Qh))
print('Householder error 2 (projection): ', error2(Qh,A))
print('|QR - A| error: ', np.linalg.norm(Qh@R - A)/ np.linalg.norm(A))
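
Time permitting, here is a minimal reference sketch of the two routines, following Trefethen Algs. 10.1 and 10.3. The _sketch suffix only avoids clobbering your versions (use them in place of your implementations above if needed), and the code assumes each subcolumn it reflects is nonzero, which holds for the test matrix here.

In [ ]:
def householder_triangularization_sketch(A):
    m, n = A.shape
    R = A.astype(float)
    V = []
    for k in range(n):
        x = R[k:, k]
        v = x.copy()
        # Reflect x onto -sign(x[0]) * ||x|| * e_1 (sign chosen to avoid cancellation)
        sign = 1.0 if x[0] >= 0 else -1.0
        v[0] += sign * np.linalg.norm(x)
        v /= np.linalg.norm(v)  # assumes the subcolumn x is nonzero
        V.append(v)
        # Apply the reflector to the trailing submatrix
        R[k:, k:] -= 2.0 * np.outer(v, v @ R[k:, k:])
    return V, R

def apply_Q_to_vector_sketch(V, x):
    # b = Q @ x with Q = H_1 H_2 ... H_n, so apply the reflectors in reverse order
    b = x.astype(float)
    for k in reversed(range(len(V))):
        v = V[k]
        b[k:] -= 2.0 * v * (v @ b[k:])
    return b

Householder reflectors produce a $Q$ that is orthogonal to machine precision regardless of the conditioning of $A$, which is precisely the contrast with Gram-Schmidt this task is meant to expose.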
In [ ]:
# If you don't have time to implement Householder, just use np.linalg.qr,
# which computes the QR factorization by Householder reflections (via LAPACK)

help(np.linalg.qr)

A = np.eye(n,n)
v = (3/4)**(np.arange(n))
A[:,3:] = A[:,2:n-1]
A[:,2] = v

# Compute the factorization before measuring the errors
Qh, R = np.linalg.qr(A)

print('condition number of A: ' + "{:1.1e}".format(np.linalg.cond(A)))
print('Householder error 1 (orthogonality): ', error1(Qh))
print('Householder error 2 (projection): ', error2(Qh,A))
print('|QR - A| error: ', np.linalg.norm(Qh@R - A)/ np.linalg.norm(A))