The QR Decomposition#

Orthogonal matrices#

Let \(Q \in \mathbb{R}^{n \times n}\). The matrix \(Q\) is called orthogonal if \(Q^TQ = I\), where \(I\) is the identity matrix. This definition implies that all rows and columns of \(Q\) are of unit length and are mutually orthogonal.

The significance of orthogonal transformations lies in their ability to preserve angles and the 2-norm. To understand this, consider the standard Euclidean inner product, defined as \(\langle x, y \rangle := y^Tx\) for vectors \(x, y \in \mathbb{R}^n\).

Now, observe the effect of an orthogonal transformation \(Q\) on this inner product:

\[ \langle Qx, Qy \rangle = (Qx)^T(Qy) = x^TQ^TQy = x^TIy = \langle x, y \rangle. \]

This equation shows that the inner product (and thus the angle) between \(x\) and \(y\) is preserved after applying the orthogonal transformation \(Q\).

Furthermore, the preservation of the 2-norm under orthogonal transformations can be seen as follows:

\[ \|Qx\|_2^2 = \langle Qx, Qx \rangle = x^TQ^TQx = x^Tx = \|x\|_2^2. \]

This demonstrates that a vector’s 2-norm (or length) is unchanged when multiplied by an orthogonal matrix.
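
As a quick numerical illustration (a minimal sketch; here an orthogonal \(Q\) is obtained from the QR decomposition of a random matrix), both the inner product and the 2-norm are unchanged under multiplication by \(Q\):

import numpy as np

# Obtain an orthogonal matrix Q, here from the QR decomposition of a random matrix.
Q, _ = np.linalg.qr(np.random.rand(4, 4))

x = np.random.rand(4)
y = np.random.rand(4)

print(np.allclose(Q.T @ Q, np.eye(4)))                        # Q^T Q = I
print(np.allclose(np.dot(Q @ x, Q @ y), np.dot(x, y)))        # inner product preserved
print(np.allclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # 2-norm preserved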

QR Decomposition via Gram-Schmidt Orthogonalisation#

Consider a matrix \(A \in \mathbb{R}^{m \times n}\) with \(m \geq n\). The aim is to orthogonalise the columns of \(A\). This process is achieved through the Gram-Schmidt orthogonalisation method.

Algorithm 1 (Gram-Schmidt Orthogonalisation)

Inputs \(A \in \mathbb{R}^{m \times n}\)

Outputs \(Q \in \mathbb{R}^{m \times n}\), \(R \in \mathbb{R}^{n \times n}\)

  • Let \(\tilde{q}_1\) be the first column \(a_1\) of \(A\) and normalize \(\tilde{q}_1\) to obtain \(q_1\), defined as:

    \[\begin{split} \begin{aligned} \tilde{q}_1 &:= a_1,\\ q_1 &:= \frac{\tilde{q}_1}{\|\tilde{q}_1\|_2} \end{aligned} \end{split}\]
  • For \(k\) in \(\{2, \ldots, n\}\):

    • To compute column \(q_k\), orthogonalize column \(a_k\) against all previous \(q_j\) (for \(j = 1, \ldots, k-1\)):

    \[\begin{split} \begin{aligned} \tilde{q}_k &:= a_k - \sum_{j=1}^{k-1}\langle a_k, q_j \rangle q_j,\\ q_k &:= \frac{\tilde{q}_k}{\|\tilde{q}_k\|_2}. \end{aligned} \end{split}\]

Matrix Decomposition Representation#

The Gram-Schmidt procedure can be expressed as a matrix decomposition:

  • For each column \(a_k\) of \(A\), we have:

    \[ a_k = \|\tilde{q}_k\|_2 q_k + \sum_{j=1}^{k-1}\langle a_k, q_j \rangle q_j \]
  • Hence, the matrix \(A\) can be decomposed as \(A = QR\) where:

    • \(Q = \begin{bmatrix} q_1, \ldots, q_n \end{bmatrix}\) is a matrix with orthogonal columns.

    • \(R\) is an upper triangular matrix defined as:

      \[\begin{split} R = \begin{bmatrix} \|\tilde{q}_1\|_2 & \langle a_2, q_1 \rangle & \ldots & \langle a_n, q_1 \rangle \\ & \|\tilde{q}_2\|_2 & \ddots & \vdots \\ & & \ddots & \langle a_n, q_{n-1} \rangle \\ & & & \|\tilde{q}_n\|_2 \end{bmatrix} \end{split}\]

Fact: QR Decomposition

Given \(A \in \mathbb{R}^{m \times n}\) with \(m \geq n\), a QR decomposition \(A = QR\) exists, where \(Q\) is an \(m \times n\) orthogonal matrix satisfying \(Q^TQ = I\) and where \(R\) is an \(n \times n\) upper triangular matrix. If \(A\) has full rank, then the QR decomposition is unique if we require the diagonal elements of \(R\) to be positive.
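
Note that numerical routines do not necessarily return \(R\) with a positive diagonal. The following sketch shows one way to normalise a computed factorisation to this convention, assuming \(A\) has full rank so that the diagonal of \(R\) contains no zeros:

import numpy as np

A = np.random.rand(5, 3)
Q, R = np.linalg.qr(A)

# Flip the signs of columns of Q and rows of R so that diag(R) > 0;
# the product Q @ R is unchanged because the sign matrix squares to the identity.
signs = np.sign(np.diag(R))
Q = Q * signs            # scales column j of Q by signs[j]
R = signs[:, None] * R   # scales row j of R by signs[j]

print(np.all(np.diag(R) > 0))
print(np.allclose(A, Q @ R))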

Solving linear systems of equations with the QR decomposition#

To solve \(Ax = b\), we can employ the QR decomposition of \(A \in \mathbb{R}^{n \times n}\).

The linear system \(Ax = b\) can be rewritten using the QR decomposition:

  1. Original System: \(Ax = b\).

  2. Substituting \(A = QR\): \(QRx = b\).

  3. Left-multiplying by \(Q^T\) (since \(Q\) is orthogonal, \(Q^TQ = I\)): \(Q^TQRx = Q^Tb\).

  4. Simplifying (as \(Q^TQ = I\)): \(Rx = Q^Tb\).

The last equation, \(Rx = Q^Tb\), is an upper triangular system. It can be solved efficiently by backward substitution: the last row determines the last unknown, which is then substituted into the rows above to find the remaining unknowns one by one.
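
The sketch below carries out these steps for a small system, using NumPy's QR routine together with a hand-written backward substitution (scipy.linalg.solve_triangular could be used instead):

import numpy as np

def backward_substitution(R, y):
    """Solve R x = y for an upper triangular matrix R."""
    n = R.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - np.dot(R[i, i+1:], x[i+1:])) / R[i, i]
    return x

A = np.array([[3.0, 2.0, -1.0],
              [1.0, 4.0,  2.0],
              [0.0, 5.0,  1.0]])
b = np.array([1.0, 2.0, 3.0])

Q, R = np.linalg.qr(A)
x = backward_substitution(R, Q.T @ b)

print(x)
print(np.allclose(A @ x, b))  # check the residual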

  • Backward Stability: Solving linear systems via QR decomposition is known to be backward stable. Backward stability refers to the property where the computed solution is the exact solution to a slightly perturbed version of the original problem. This stability is particularly beneficial when dealing with ill-conditioned matrices or when high precision is required.

  • Comparison with LU Decomposition: Despite its stability, the QR decomposition method for solving linear systems is less commonly used than the LU decomposition in practice. This is primarily due to its computational cost. The QR decomposition typically requires about twice as many operations as the LU decomposition, making it less efficient for large-scale problems.

Reduced vs full QR decomposition#

Consider the QR decomposition \(A = QR\) with \(Q\in\mathbb{R}^{m\times n}\) and \(R\in\mathbb{R}^{n\times n}\). We call this QR decomposition a reduced QR decomposition. To define the full QR decomposition, let \(Q^{\bot}\in\mathbb{R}^{m\times (m-n)}\) be a matrix whose columns are orthonormal and satisfy \((Q^{\bot})^TQ = 0\) (i.e. they form a basis of the orthogonal complement of the range of \(A\)). We can reformulate the QR decomposition as

\[\begin{split} A = \begin{bmatrix}Q & Q^{\bot}\end{bmatrix}\begin{bmatrix}R\\ 0\end{bmatrix} =: \tilde{Q}\tilde{R}. \end{split}\]

The QR decomposition \(A = \tilde{Q}\tilde{R}\) is called the full QR decomposition. It is rarely used in practical computations as the columns of \(Q^{\bot}\) are usually not required. However, it is essential theoretically, as \(Q^{\bot}\) contains information about the range of \(A\) and its orthogonal complement.
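
In NumPy, both forms are available through the mode argument of np.linalg.qr; the following sketch compares the shapes of the factors:

import numpy as np

A = np.random.rand(6, 3)

# Reduced QR: Q is 6x3, R is 3x3.
Q_red, R_red = np.linalg.qr(A, mode='reduced')

# Full QR: Q is 6x6 (for full-rank A its last 3 columns span the orthogonal
# complement of the range of A), R is 6x3 with a zero block below the triangle.
Q_full, R_full = np.linalg.qr(A, mode='complete')

print(Q_red.shape, R_red.shape)    # (6, 3) (3, 3)
print(Q_full.shape, R_full.shape)  # (6, 6) (6, 3)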

Python skills#

Implementation from scratch#

Step 1: Import Numpy

import numpy as np

Step 2: Implement the Gram-Schmidt method

def gram_schmidt(A):
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))

    for k in range(n):
        Q[:, k] = A[:, k]
        # Orthogonalise column k against all previously computed columns of Q.
        for i in range(k):
            R[i, k] = np.dot(Q[:, i], A[:, k])
            Q[:, k] -= R[i, k] * Q[:, i]
        # Normalise and store the norm on the diagonal of R.
        R[k, k] = np.linalg.norm(Q[:, k])
        Q[:, k] /= R[k, k]
    return Q, R

Step 3: Test with the matrix of the first self-check question

# Define a matrix
A = np.array([[3, 2], [1, 4], [0, 5]])

# Perform QR Decomposition
Q, R = gram_schmidt(A)

print("Matrix Q:\n", Q)
print("Matrix R:\n", R)

Step 4: Verify the decomposition

We can verify the decomposition by checking if \(A \approx QR\).

# Check if A is approximately equal to QR
A_approx = np.dot(Q, R)
print("Matrix A reconstructed from Q and R:\n", A_approx)

Using Numpy’s implementation#

Here is an example using Numpy’s implementation:

import numpy as np

# Step 1: Create a random matrix
A = np.random.rand(3, 3)
print("Matrix A:")
print(A)

# Step 2: Perform QR decomposition
Q, R = np.linalg.qr(A)

# Step 3: Display results
print("Orthogonal Matrix Q:")
print(Q)
print("Upper Triangular Matrix R:")
print(R)

# Step 4: Verification
print("Product QR:")
print(np.dot(Q, R))

print("Product Q^TQ (should be close to identity):")
print(np.dot(Q.T, Q))

Self-check questions#

Question

Find the QR decomposition of the matrix

\[\begin{split} A = \begin{bmatrix} 3 & 2 \\ 1 & 4 \\ 0 & 5 \end{bmatrix}.\end{split}\]

using the Gram-Schmidt method.

Answer
  1. First Column (\(q_1\)):

\[\begin{split}q_1 = \begin{bmatrix} 0.9487 \\ 0.3162 \\ 0 \end{bmatrix}\end{split}\]
  2. Orthogonalized Second Column (\(\tilde{q}_2\)):

\[\begin{split}\tilde{q}_2 = \begin{bmatrix} -1 \\ 3 \\ 5 \end{bmatrix}\end{split}\]
  3. Normalized Second Column (\(q_2\)):

\[\begin{split}q_2 = \begin{bmatrix} -0.1690 \\ 0.5071 \\ 0.8452 \end{bmatrix}\end{split}\]
  4. Orthogonal Matrix \(Q\):

\[\begin{split}Q = \begin{bmatrix} 0.9487 & -0.1690 \\ 0.3162 & 0.5071 \\ 0 & 0.8452 \end{bmatrix}\end{split}\]
  5. Upper Triangular Matrix \(R\):

\[\begin{split}R = \begin{bmatrix} 3.1623 & 3.1623 \\ 0 & 5.9161 \end{bmatrix}\end{split}\]

The matrices \(Q\) and \(R\) form the QR decomposition of \(A\), such that \(A = QR\). The matrix \(Q\) contains orthonormal columns, and \(R\) is an upper triangular matrix.

Question

A matrix \(A\) is called tridiagonal if only its diagonal and first off-diagonals contain nonzero entries: \(a_{ij} = 0\) if \(|i - j| \geq 2\). Show that if \(A\) is tridiagonal, then the factor \(R\) of its \(QR\) factorisation is nonzero only on the diagonal and the first two upper off-diagonals.

Answer

Recall the structure of \(R\):

\[\begin{split} R = \begin{bmatrix} \|\tilde{q}_1\|_2 & \langle a_2, q_1 \rangle & \ldots & \langle a_n, q_1 \rangle \\ & \|\tilde{q}_2\|_2 & \ddots & \vdots \\ & & \ddots & \langle a_n, q_{n-1} \rangle \\ & & & \|\tilde{q}_n\|_2 \end{bmatrix} \end{split}\]

We need to show that \(r_{ij} = 0\) whenever \(j \geq i + 3\):

  • The result trivially holds for the first three columns because of the triangularity of \(R\). Thus, let \(j \geq 4\).

  • The first \(j - 2\) entries of the \(j\)th column \(a_j\) vanish because of the tridiagonal structure.

  • We know that \(\mathrm{span}(a_1, \ldots, a_i) = \mathrm{span}(q_1, \ldots, q_i)\) owing to the construction of the Gram-Schmidt method. Because of the tridiagonal structure, only the first \(i+1\) entries of \(q_k\), \(k \leq i\), can be nonzero.

  • Therefore, if \(j \geq i + 3\),

    \[ r_{ij} = \langle a_j, q_i \rangle = \sum_{k = 1}^n (a_j)_k (q_i)_k = 0 \]

    because in each term of the sum at least one factor equals \(0\).
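
This structure can also be observed numerically. The sketch below builds a random tridiagonal matrix, computes its QR decomposition with NumPy and checks that all entries of \(R\) more than two places above the diagonal vanish up to round-off:

import numpy as np

n = 8

# Build a random tridiagonal matrix from its three diagonals.
A = (np.diag(np.random.rand(n))
     + np.diag(np.random.rand(n - 1), 1)
     + np.diag(np.random.rand(n - 1), -1))

Q, R = np.linalg.qr(A)

# Entries more than two places above the diagonal should vanish (up to round-off).
outside_band = [abs(R[i, j]) for i in range(n) for j in range(i + 3, n)]
print(max(outside_band))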

Question

Given \(A \in \mathbb{R}^{n \times m}\) with \(m \geq n\), show that the factorisation \(A = LQ\) exists, where \(Q\) is an \(n \times m\) matrix with orthonormal rows and where \(L\) is an \(n \times n\) lower triangular matrix.

Answer

Using the \(QR\) decomposition \(A^T = \bar{Q} R\) of the transpose,

\[ A = (A^T)^T = (\bar{Q} R)^T = R^T \bar{Q}^T = L Q \]

with \(L := R^T\) and \(Q := \bar{Q}^T\).
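
The construction translates directly into code; a short sketch using np.linalg.qr on the transpose:

import numpy as np

A = np.random.rand(3, 5)       # n = 3 rows, m = 5 columns

Q_bar, R = np.linalg.qr(A.T)   # QR decomposition of the transpose
L, Q = R.T, Q_bar.T

print(np.allclose(A, L @ Q))             # A = L Q
print(np.allclose(np.tril(L), L))        # L is lower triangular
print(np.allclose(Q @ Q.T, np.eye(3)))   # rows of Q are orthonormal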

Question

Determine the \(2\times 2\) matrix \(Q\) that rotates a vector by an angle \(\theta\). Is this matrix orthogonal? Show that \(Q^{-1}\) is identical to the matrix that rotates vectors by an angle of \(-\theta\).

Answer

The rotation matrix is given as follows:

\[\begin{split} Q = \begin{bmatrix} \cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{bmatrix} \end{split}\]

The matrix \(Q\) is orthogonal since \(Q^TQ=I\). The rotation matrix for the angle \(-\theta\) is obtained as

\[\begin{split} \hat{Q} = \begin{bmatrix} \cos\theta & +\sin\theta\\ -\sin\theta & \cos\theta \end{bmatrix}, \end{split}\]

which is just the transpose of \(Q\). Hence, as expected, the inverse of \(Q\) is just the rotation by \(-\theta\).
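
A short numerical check of these statements:

import numpy as np

def rotation(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

theta = 0.3
Q = rotation(theta)

print(np.allclose(Q.T @ Q, np.eye(2)))                   # Q is orthogonal
print(np.allclose(np.linalg.inv(Q), rotation(-theta)))   # Q^{-1} rotates by -theta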

Question

Let \(u\in\mathbb{R}^n\) with \(\|u\|_2=1\). Define \(P=uu^T\). Show that \(P=P^2\). Is \(P\) an orthogonal matrix? Describe what \(P\) is doing. Matrices that satisfy \(P=P^2\) are also called projectors.

Answer

We have

\[ P^2 = uu^Tuu^T = uu^T \]

since \(u^Tu = 1\).

\(P\) is not orthogonal since it is singular. If \(v\bot u\) then \(Pv = 0\). \(Px\) is the orthogonal projection of a vector \(x\) onto the line spanned by \(u\), that is, it removes all components of \(x\) orthogonal to \(u\).
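
These properties are easy to confirm numerically; a small sketch:

import numpy as np

u = np.random.rand(4)
u /= np.linalg.norm(u)       # unit vector

P = np.outer(u, u)           # P = u u^T

x = np.random.rand(4)
v = x - np.dot(u, x) * u     # a vector orthogonal to u

print(np.allclose(P @ P, P))                  # P is a projector
print(np.allclose(P @ v, 0))                  # vectors orthogonal to u are mapped to zero
print(np.allclose(P @ x, np.dot(u, x) * u))   # P x is the component of x along u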

Question

Let \(P=P^2\) be a projector satisfying \(P=P^T\). Show that \(Q=I-2P\) is an orthogonal matrix. Give a geometric interpretation of \(Q\).

Answer

We have

\[ (I-2P)^T(I-2P) = (I -2P)^2 = I - 4P + 4P^2 = I \]

since \(P=P^T\) and \(P^2=P\). The matrix \(Q\) is a reflector. All components orthogonal to the range of \(P\) are left untouched, while all components in the range of \(P\) are subtracted twice and are therefore reflected. In the special case that \(P=uu^T\) for \(\|u\|_2=1\), the matrix \(Q\) is a Householder transformation.
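
For the special case \(P = uu^T\), the following sketch checks numerically that \(Q = I - 2P\) is orthogonal and acts as a reflection:

import numpy as np

u = np.random.rand(4)
u /= np.linalg.norm(u)

P = np.outer(u, u)
Q = np.eye(4) - 2 * P        # Householder transformation

x = np.random.rand(4)

print(np.allclose(Q.T @ Q, np.eye(4)))           # Q is orthogonal
print(np.allclose(Q @ (x - P @ x), x - P @ x))   # components orthogonal to u are untouched
print(np.allclose(Q @ (P @ x), -(P @ x)))        # components along u flip sign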

Question

In the following, we define two different ways of orthogonalising vectors.

import numpy as np

def gram_schmidt(A):
    """Returns an orthogonal basis for the columns in A."""
    m = A.shape[0]
    n = A.shape[1]

    Q = np.zeros((m, n), dtype=np.float64)
    Q[:, 0] = A[:, 0] / np.linalg.norm(A[:, 0])

    for col in range(1, n):
        t = A[:, col].copy()
        # Form all inner products against the previous columns at once ...
        inner_products = Q[:, :col].T @ t
        # ... and subtract them in one go.
        t -= Q[:, :col] @ inner_products
        Q[:, col] = t / np.linalg.norm(t)

    return Q

def modified_gram_schmidt(A):
    """Returns an orthogonal basis for the columns in A."""
    m = A.shape[0]
    n = A.shape[1]

    Q = np.zeros((m, n), dtype=np.float64)
    Q[:, 0] = A[:, 0] / np.linalg.norm(A[:, 0])

    for col in range(1, n):
        t = A[:, col].copy()
        # Subtract each projection immediately after computing it.
        for index in range(col):
            t -= Q[:, index] * np.inner(Q[:, index], t)
        Q[:, col] = t / np.linalg.norm(t)

    return Q

Describe the difference between the two formulations and convince yourself that they are algebraically equivalent. How can one numerically demonstrate that the modified formulation is more accurate in floating point arithmetic?

Answer

The Gram-Schmidt orthogonalisation first forms all inner products against the previous vectors and then subtracts them. In contrast, the modified Gram-Schmidt method subtracts immediately after forming the inner product with a previous column \(q\). Modified Gram-Schmidt is just a reordering of the orthogonalisation. However, it can be shown that modified Gram-Schmidt is numerically more stable.

In the example below, we orthogonalise a matrix \(A\) with both methods and then compare \(Q^TQ\) with the identity matrix. The relative residual of modified Gram-Schmidt is much closer to machine precision than that of Gram-Schmidt. In practice, therefore, modified Gram-Schmidt is usually preferred.

m = 1000  
n = 1000  
  
A = np.random.rand(m, n)  
Q_gs = gram_schmidt(A)  
Q_mgs = modified_gram_schmidt(A)  
  
ident = np.eye(n)  
res_gs = np.linalg.norm(Q_gs.T @ Q_gs - ident) / np.linalg.norm(ident)  
res_mgs = np.linalg.norm(Q_mgs.T @ Q_mgs - ident) / np.linalg.norm(ident)  
  
print(f"Residual for Gram-Schmidt: {res_gs}")  
print(f"Residual for modified Gram-Schmidt: {res_mgs}")

This results in the output

Residual for Gram-Schmidt: 4.610285906483318e-11
Residual for modified Gram-Schmidt: 1.1165488100397485e-15

Question

Equip the vector space \(\mathcal{P}_3\) of cubic polynomials

\[ \{ c_3 \, t^3 + c_2 \, t^2 + c_1 \, t + c_0 | \, (c_0, c_1, c_2, c_3) \in \mathbb{R}^4 \} \]

with the scalar product

\[ (p, q ) = \int_a^b p(t) \, q(t) \, w(t) \, dt. \]

Applying the Gram–Schmidt method to the basis \(\{1, t, t^2, t^3\}\), compute orthogonal bases for \(\mathcal{P}_3\) in three cases:

  1. \((a,b) = (-1,1)\) with \(w(t)=1\);

  2. \((a,b) = (0,1)\) with \(w(t)=1\);

  3. \((a,b) = (0,1)\) with \(w(t)=\exp^{-t}\).

Answer

In all cases, let \(\phi_j(t)=t^j\).

  1. Starting with \(\hat{\phi}_0(t)=1\), we compute

    \[\begin{split} \begin{align*} &\hat{\phi}_1(t) = \phi_1(t) - \frac{(\hat{\phi}_0,\phi_1)}{(\hat{\phi}_0,\hat{\phi}_0)} = t - 0,\\ &\hat{\phi}_2(t) = \phi_2(t) - \frac{(\hat{\phi}_1,\phi_2)}{(\hat{\phi}_1,\hat{\phi}_1)}\hat{\phi}_1(t) - \frac{(\hat{\phi}_0,\phi_2)}{(\hat{\phi}_0,\hat{\phi}_0)}\hat{\phi}_0(t) = t^2 - 0 - \frac13,\\ &\hat{\phi}_3(t) = \phi_3(t) - \frac{(\hat{\phi}_2,\phi_3)}{(\hat{\phi}_2,\hat{\phi}_2)}\hat{\phi}_2(t) - \frac{(\hat{\phi}_1,\phi_3)}{(\hat{\phi}_1,\hat{\phi}_1)}\hat{\phi}_1(t) - \frac{(\hat{\phi}_0,\phi_3)}{(\hat{\phi}_0,\hat{\phi}_0)}\hat{\phi}_0(t) = t^3 - 0 - \frac{3t}5 - 0. \end{align*} \end{split}\]

    One could reduce the amount of actual computation by noting that \((f, g)=0\) whenever the product \(f \cdot g\) is odd; using this, \(\hat{\phi}_j\) is odd for odd \(j\) and even for even \(j\).

  2. Starting with \(\hat{\phi}_0(t)=1\), we compute

    \[\begin{split} \begin{align*} \hat{\phi}_1(t) & = \phi_1(t) - \frac{(\hat{\phi}_0,\phi_1)}{(\hat{\phi}_0,\hat{\phi}_0)} = t - \frac12,\\ \hat{\phi}_2(t) & = \phi_2(t) - \frac{(\hat{\phi}_1,\phi_2)}{(\hat{\phi}_1,\hat{\phi}_1)}\hat{\phi}_1(t) - \frac{(\hat{\phi}_0,\phi_2)}{(\hat{\phi}_0,\hat{\phi}_0)}\hat{\phi}_0(t) = t^2 - 1\cdot\hat{\phi}_1(t) - \frac13\cdot\hat{\phi}_0(t) = t^2 - t + \frac16,\\ \hat{\phi}_3(t) & = \phi_3(t) - \frac{(\hat{\phi}_2,\phi_3)}{(\hat{\phi}_2,\hat{\phi}_2)}\hat{\phi}_2(t) - \frac{(\hat{\phi}_1,\phi_3)}{(\hat{\phi}_1,\hat{\phi}_1)}\hat{\phi}_1(t) - \frac{(\hat{\phi}_0,\phi_3)}{(\hat{\phi}_0,\hat{\phi}_0)}\hat{\phi}_0(t)\\ & = t^3 - \frac32\cdot\hat{\phi}_2(t) - \frac9{10}\cdot\hat{\phi}_1(t) - \frac14\cdot\hat{\phi}_0(t) = t^3 - \frac{3t^2}2 + \frac{3t}5 - \frac1{20}. \end{align*} \end{split}\]

    No trick as in (a) is possible here.

  3. Finally, starting with \(\hat{\phi}_0(t)=1\), we compute:

    \[\begin{split} \begin{align*} \hat{\phi}_1(t) & = \phi_1(t) - \frac{(\hat{\phi}_0,\phi_1)}{(\hat{\phi}_0,\hat{\phi}_0)} = t - \frac{\exp-2}{\exp-1} = t - 0.4180232931,\\ \hat{\phi}_2(t) & = \phi_2(t) - \frac{(\hat{\phi}_1,\phi_2)}{(\hat{\phi}_1,\hat{\phi}_1)}\hat{\phi}_1(t) - \frac{(\hat{\phi}_0,\phi_2)}{(\hat{\phi}_0,\hat{\phi}_0)}\hat{\phi}_0(t)\\ & = t^2 - \frac{4\exp^2-13\exp+6}{\exp^2-3\exp+1}\,\hat{\phi}_1(t) - \frac{2\exp-5}{\exp-1}\,\hat{\phi}_0(t) = t^2 - \frac{4\exp^2-13\exp+6}{\exp^2-3\exp+1}\,t + \frac{2\exp^2-8\exp+7}{\exp^2-3\exp+1}\\ & = t^2 - 0.93317985227\,t + 0.1360210355\\ \hat{\phi}_3(t) & = \phi_3(t) - \frac{(\hat{\phi}_2,\phi_3)}{(\hat{\phi}_2,\hat{\phi}_2)}\hat{\phi}_2(t) - \frac{(\hat{\phi}_1,\phi_3)}{(\hat{\phi}_1,\hat{\phi}_1)}\hat{\phi}_1(t) - \frac{(\hat{\phi}_0,\phi_3)}{(\hat{\phi}_0,\hat{\phi}_0)}\hat{\phi}_0(t)\\ & = t^3 - \frac{36\exp^3-194\exp^2+279\exp-48}{4\exp^3-21\exp^2+29\exp-4}\,\hat{\phi}_2(t) - \frac{18\exp^2-61\exp+33}{\exp^2-3\exp+1}\,\hat{\phi}_1(t) - \frac{6\exp-16}{\exp-1}\,\hat{\phi}_0(t)\\ & = t^3 - 1.435696868\,t^2 + 0.5378430946\,t - 0.040296542725. \end{align*} \end{split}\]
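
As a sanity check for case (1), the inner products of the computed basis can be evaluated with Gauss-Legendre quadrature (np.polynomial.legendre.leggauss), which is exact for the polynomial degrees involved; all off-diagonal inner products should vanish up to round-off:

import numpy as np

# Nodes and weights of a 10-point Gauss-Legendre rule on [-1, 1];
# it integrates polynomials of degree up to 19 exactly.
nodes, weights = np.polynomial.legendre.leggauss(10)

# The orthogonal basis computed in case (1).
phis = [
    lambda t: np.ones_like(t),
    lambda t: t,
    lambda t: t**2 - 1/3,
    lambda t: t**3 - 3*t/5,
]

for i in range(4):
    for j in range(i):
        inner = np.sum(weights * phis[i](nodes) * phis[j](nodes))
        print(f"(phi_{i}, phi_{j}) = {inner:+.1e}")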

Optional material#

QR Decomposition with Householder Transformations

While the Gram-Schmidt process is a common approach to obtain the QR decomposition of a matrix, the Householder transformation method provides a more stable and efficient alternative, especially for large matrices. In this section, we explore the QR decomposition using Householder transformations.

Householder Transformations

Given a vector \(v \in \mathbb{R}^n \setminus \{0\}\), the corresponding Householder transformation \(H\) of order \(n\) is defined as:

\[ H = I - 2 \frac{vv^T}{v^Tv} \]

Clearly,

\[ H x = x - 2 \frac{v^T x}{v^Tv} v \]

Split \(x = x_o + x_c\) into an orthogonal and a collinear part: \(v^T x_o = 0\) and \(x_c = \alpha v\) for some \(\alpha \in \mathbb{R}\). Then

\[ H x = \left(x_o - 2 \frac{v^T x_o}{v^Tv} v\right) + \left(x_c - 2 \frac{v^T x_c}{v^Tv} v\right) = x_o - x_c, \]

meaning that, geometrically, the Householder transformation is a reflection of \(x\) in the hyperplane perpendicular to \(v\).

Application in QR Decomposition

As with the Gram-Schmidt method above, the goal is to transform a given matrix \(A\) into an upper triangular matrix \(R\) using orthogonal transformations. This is achieved iteratively, column by column, using Householder matrices.

Step-by-Step Process:

  1. Target Vector: Consider the first column of \(A\) and denote it by \(x\). We want to transform \(x\) into a vector \(x'\) with the same norm as \(x\) but with all components equal to zero except the first.

  2. Constructing \(v\): To create the Householder matrix \(H\), we first need to construct the vector \(v\). This is done by taking \(v = x - \|x\|e_1\), where \(e_1\) is the first unit vector. With \(v\) determined, we form the Householder matrix \(H = I - 2 \frac{vv^T}{v^Tv}\).

  3. Applying \(H\): Multiply \(H\) by \(A\) (i.e., \(HA\)) to reflect the first column of \(A\) about the plane orthogonal to \(v\). This operation zeros out all elements below the diagonal in the first column; see Lecture 10 of [Trefethen and Bau, 1997].

  4. Iterative Process: Repeat the process for the submatrix of \(A\) excluding the already zeroed elements. Each step involves constructing a new Householder matrix to introduce zeros in the subsequent column below the diagonal.
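
The following is a compact sketch of this procedure using only NumPy. It applies one Householder reflection per column and accumulates \(Q\) explicitly; the sign in the construction of \(v\) is the standard choice that avoids cancellation (a slight variation of the formula in step 2 above):

import numpy as np

def householder_qr(A):
    """Full QR decomposition A = QR via Householder reflections (a sketch)."""
    m, n = A.shape
    R = A.astype(float)
    Q = np.eye(m)

    for k in range(min(n, m - 1)):
        x = R[k:, k]
        # Choose v so that H = I - 2 v v^T maps x to a multiple of e_1;
        # adding sign(x[0]) * ||x|| avoids cancellation. Assumes x is nonzero.
        v = x.copy()
        v[0] += np.linalg.norm(x) if x[0] >= 0 else -np.linalg.norm(x)
        v /= np.linalg.norm(v)
        # Apply H to the trailing submatrix of R and accumulate Q = Q H.
        R[k:, k:] -= 2.0 * np.outer(v, v @ R[k:, k:])
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)
    return Q, R

A = np.array([[3.0, 2.0], [1.0, 4.0], [0.0, 5.0]])
Q, R = householder_qr(A)
print(np.allclose(A, Q @ R))             # A = QR
print(np.allclose(Q.T @ Q, np.eye(3)))   # Q is orthogonal
print(np.allclose(R, np.triu(R)))        # R is upper triangular up to round-off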

Comparing with the Gram-Schmidt method

  • Numerical Stability: Householder transformations are generally more numerically stable than the classical (and modified) Gram-Schmidt processes.

  • Efficiency in Large Matrices: For large matrices, the increased stability of Householder transformations often outweighs its higher computational complexity.

  • Ease of Understanding: The Gram-Schmidt process is often easier to understand and implement.