Backward error and pivoting#
Recall that we can bound the forward error from the condition number and the backward error. The condition number is a property of the linear system, not of the numerical method we employ (such as the LU decomposition), and we have established that \(\kappa(A) = | A | \cdot | A^{-1} |\).
The backward error of the LU decomposition remains a subject of significant interest. Despite the long-established nature of LU decomposition, several open questions persist regarding its backward error.
Fundamentally, LU decomposition is not backward stable, and we will examine specific examples illustrating this instability. In some cases, LU decomposition can lead to a catastrophic increase in backward error.
To address this issue, a widely used modification known as pivoting is introduced. This refinement substantially improves the stability of the algorithm in almost all cases. However, it is still possible to construct matrices for which the pivoted algorithm remains unstable. Such cases, however, are highly pathological and rarely arise in practical applications.
Backward error of the LU decomposition#
Let \(A = LU\) be the LU decomposition of a nonsingular matrix \(A\in\mathbb{R}^{n\times n}\), and let \(\tilde{L}, \tilde{U}\) be the computed LU factors obtained via Gaussian elimination in floating point arithmetic. These factors satisfy
Recall the relative backward error, given \(A\), is
One can show for \(\Delta A = A - \tilde{L} \, \tilde{U}\) that
One might naively conclude that this guarantees the backward stability of LU decomposition. However, backward stability requires \(\| \Delta A \| / \|A\| = \mathcal{O}(\epsilon_{mach})\) whereas in our bound, \(\|L\|\cdot \|U\|\) appears in the denominator.
So the question of backward stability of the \(LU\) decomposition is reduced to the problem of when we have that
Example 8 (Failure of backward stability)
Let
We have
Consider the \(1\)-norm. In this example \(\|A\| = 2\) but \(\|L\|\cdot \|U\|\approx \mathcal{O}(\epsilon^{-2})\).
The norms of \(L\) and \(U\) can differ wildly from the norm of \(A\). But there is an easy fix in this example. We can swap the first and second rows of \(A\) to obtain
We now have \(\|L\|\cdot \|U\|\approx 1\).
LU decomposition with pivoting#
The above examples motivates a modification of the LU decomposition, called column-pivoted (or simply pivoted) LU. The idea is simple. Suppose that \(a_{31}\) is the element with the largest magnitude in the first column of \(A \in \mathbb{R}^{3 \times 3}\). We then exchange the third and the first row to get
Eliminating in the first column of \(A^{(1)}\) as usual, we obtain a matrix
Before proceeding with the LU decomposition, we swap the second and third rows if the element at position \((3, 2)\) is larger by magnitude than the entry at position at \((2, 2)\), using
giving
This pivoting strategy guarantees that \(|\ell_{i, j}|\leq 1\) for all \(i, j\).
Definition 6 (Permutation matrix)
\(P \in \mathbb{R}^{n \times n}\) is a permutation matrix if there exists a bijective mapping \(\pi : \{1, \ldots, n\} \to \{1, \ldots, n\}\) (called permutation) such that
where \(e^{(k)} \in \mathbb{R}^n\) is the \(k\)th canonical basis vector.
In other words, a permutation matrix is obtained by re-arranging the rows of the identity matrix. As shown in a self-check question below, reordering rows is precisely the effect of multiplication by a permutation matrix from the left.
Fact: Pivoted LU decomposition
Let \(A \in \mathbb{R}^{n \times n}\). Then there are a lower triangular \(L\), upper triangular \(U\) and permutation matrix \(P\) in \(\mathbb{R}^{n \times n}\) such that
The vector \(x \in \mathbb{R}^n\) solves \(A x = b\) if and only if \(x\) solves \(U x = y\) with \(L y = P b\).
Proof. Generalising the principle of the above example gives
There is a commutation property between permutation and Frobenius matrices,
for elementary permutations \(P\), that ultimately allows to reorder the product so that
where the \(\hat{F}_i = I - \hat{f}_i e^{(i)}\) are Frobenius matrices of index \(i\), \(P\) is a permutation matrix and \(U\) is upper triangular. The details are beyond the scope of this lecture. See [Plato, 2003] for the full proof.
Growth factors#
Let us consider how pivoting influences the size of the factors in the LU decomposition. It is convenient to initially work with the norm \(\| A \|_* := \max_{ij} | A_{ij} |\).
Pivoting ensures that \(|\ell_{i, j}| \leq 1\) with \(\ell_{i,i} = 1\). Hence, \(\|L\|_* = 1\) and \(\|L\|_* \cdot \|U\|_* = \|U\|_*\). To measure how \(\|U\|_*\) grows the following definition of the growth factor \(\rho\) is useful.
Hence, we have that
Now let \(\| \cdot \|\) be our preferred matrix norm:
Then
It follows that the LU decomposition with pivoting is backward stable if \(\rho\) is small.
Counterexample to backward stability#
For most matrices, the growth factor \(\rho\) remains small when pivoting is employed. However, one counterexample suffices to demonstrate that LU decomposition with pivoting is not backward stable.
Example 9 (Counterexample to backward stability)
Define \(A_n \in \mathbb{R}^{n \times n}\) as the matrix with ones on the diagonal and in the last column, and \(-1\) on all entries below the diagonal. For instance, \(A_8\) is:
Since the diagonal and subdiagonal entries have modulus one, pivoting does not cause row interchanges. Let \(A_n = L_n U_n\). It can be shown that \(U_n\) is the identity matrix except in its last column, whose \(i\)th entry is \(2^{i-1}\):
Hence, the growth factor is \(\rho(A) = 2^{n-1}\), which increases exponentially with \(n\). In double-precision arithmetic (\(\epsilon_{\mathrm{mach}} = 2^{-52}\)), taking \(n = 53\) leads to a backward error of order one. For \(n = 100\), the magnitude of the backward error is on the order of
A system of dimension \(100\) is modest by modern computational standards, yet this example illustrates that pivoted LU decomposition can fail spectacularly in certain pathological cases.
Though rarely encountered in practice, such counterexamples highlight the importance of seeking alternative methods when numerical stability is critical.
Python skills#
Below are some Python examples that illustrate how to compute and investigate the backward error in LU decomposition.
We also wish to compare LU decomposition with and without pivoting. Since the Python library provides the decomposition with pivoting, we implement here only the LU decomposition without pivoting.
import numpy as np
def lu_no_pivot(A):
"""
Performs LU decomposition without pivoting.
Returns (L, U) where A = L @ U.
NOTE: This function may fail or produce large errors if A has zeros or very small pivots on the main diagonal.
"""
A = A.astype(float)
n = A.shape[0]
L = np.eye(n)
U = A.copy()
for k in range(n - 1):
# If U[k, k] is zero (or extremely small), the decomposition will fail numerically
if abs(U[k, k]) < 1e-15:
raise ValueError(f"Zero (or tiny) pivot encountered at index {k}. "
"Pivoting would be required here.")
for i in range(k + 1, n):
L[i, k] = U[i, k] / U[k, k]
U[i, k:] = U[i, k:] - L[i, k] * U[k, k:]
return L, U
We begin the comparison by computing the reconstruction errors. Then recall the definition of the relative backward error for the computed factors \(\tilde{L}\) and \(\tilde{U}\):
from scipy.linalg import lu
# Example usage:
A = np.array([[1e-10, 1.3],
[1.1, 1.2]], dtype=float)
P, L, U = lu(A)
L_no_pivot, U_no_pivot = lu_no_pivot(A)
print("Matrix A:")
print(A)
print("\nPermutation matrix P:")
print(P)
print("\nLower triangular matrix L:")
print(L)
print("\nUpper triangular matrix U:")
print(U)
print("\nLower triangular matrix without pivoting L:")
print(L_no_pivot)
print("\nUpper triangular matrix without pivoting U:")
print(U_no_pivot)
# Check that P @ A is approximately L @ U
reconstruction_error = np.linalg.norm(P @ A - L @ U)
reconstruction_error_no_pivot = np.linalg.norm(A - L_no_pivot @ U_no_pivot)
norm_A = np.linalg.norm(A, ord=2)
print(f"\nReconstruction error with pivoting (|| PA - LU||): {reconstruction_error:e}")
print(f"\nReconstruction error without pivoting (|| A - LU||): {reconstruction_error_no_pivot:e}")
print(f"Relative backward error with pivoting: (|| PA - LU|| / ||A||): {reconstruction_error / norm_A:e}")
print(f"Relative backward error without pivoting: (|| A - LU|| / ||A||): {reconstruction_error_no_pivot / norm_A:e}")
Finally, we investigate the counterexample for backward stability.
def build_counterexample_matrix(n):
"""
Constructs the n x n counterexample matrix A described in the notes:
1s on the diagonal and last column, and -1 below the diagonal.
"""
A = np.zeros((n, n), dtype=float)
for i in range(n):
A[i, i] = 1.0 # diagonal entries
A[i, -1] = 1.0 # last column entries
for j in range(i):
A[i, j] = -1.0
return A
def test_counterexample(n):
A = build_counterexample_matrix(n)
P, L, U = lu(A)
num = np.linalg.norm(P @ A - L @ U)
den = np.linalg.norm(A)
err = num / den
growth_factor = np.linalg.norm(U, ord=1) # Illustrative choice for growth factor
print(f"n = {n} | Relative backward error: {err:e} | Growth factor (||U||_1): {growth_factor:e}")
# Example usage:
for size in [10, 20, 30, 40, 50, 53, 100, 1000]:
test_counterexample(size)
Self-check exercises#
Question
Solve the linear systems
using Gaussian elimination with pivoting and back substitution.
Answer
Backsubstitution gives \(x=(-9/2,7/4,-4/3,1)\).
whose solution is \(x=(67,47,258)\).
Question
Let \(P\) be a permutation matrix with permutation \(\pi\).
(a) Show that
(b) Let \(A\), \(B\) be matrices of compatible size so that the matrix product \(A \cdot B\) is defined. Show that
where \(a_k\) is the \(k\)th column of \(A\) and \((b_k)^\top\) the \(k\)th row of \(B\).
(c) Show for the vectors \(b_1, \ldots, b_n \in \mathbb{R}^n\) that
Answer
(a) For all \(k = 1, \ldots, n\), we find \(P e^{(k)} = e^{(\pi(k))}\). Similarly
We now use that two matrices are the same if multiplication with the basis vectors \(e^{(k)}\) yields the same product.
(b) Let \(R_k\) be the matrix with the entries \((R_k)_{ij} = A_{ik} B_{kj}\). Observe that \(R_k\) is the outer product of the \(k\)th column of \(A\) and \(k\)th row of \(B\):
Entry \((i,j)\) of the matrix product \(A \cdot B\) is
This means that
(c) For \((*)\) use (b) with \(P = A\). Using for \((**)\) that \(\pi(k) = \ell\) visits every number \(1, \ldots, n\) exactly once as \(j = 1, \ldots, n\), we find
For \((***)\) we used part (b) with \(A = I\).
Question
As in the counter example above, define \(A_n \in \mathbb{R}^{n \times n}\) as the matrix with ones on the diagonal and in the last column, and \(-1\) on all entries below the diagonal.
Show that the LU decomposition \(A_n = L_n U_n\) exists where \(U_n\) is the identity matrix except in its last column, whose \(i\)th entry is \(2^{i-1}\).
Answer
Let \(B_n(\alpha) \in \mathbb{R}^{n \times n}\) be the matrix with ones on the diagonal and \(\alpha\) in the last column, and \(-1\) on all entries below the diagonal: For example
and \(B_n(1) = A_n\). Let
Elimination of the entries of the first column of \(B_n(\alpha)\) has no effect on the entries of columns \(2, \ldots, n-1\) because the respective entries of the first row vanish. In the last column, all entries except the first double in size. Thus we obtain the following matrix (in matrix block form):
Repeating the step a further \(n-2\) times with
in place of \(B_n(\alpha)\) shows the result, noting that the last iteration returns a triangular \(2 \times 2\) matrix with the entry \(2^{n-1}\) in the lower right.
Question
As discussed in the computer practical on Frobenius matrices, suppose you have implemented the following three functions in Python:
permute(i,j,n)
returns ann
\(\times\)n
permutation matrix that interchanges thei
th andj
th elements of a vector upon multiplication.frob(s,f)
returns the \(n \times n\) Frobenius matrix of indexs
whose subdiagonal entries are given by last \(n - \)s
entries off
. Here \(n\) is the length off
.ifrob(s,f)
returns the inverse offrob(s,f)
.
Find the LU decomposition with pivoting of A
, using the above Python functions for Frobenius and permutation matrices:
A = np.array([[3, 2, 1], [6, 4, 0], [0, 2, -4]])
Answer
The following commands define U
, P
and L
for the decomposition P @ A = L @ U
:
U = permute(1,2,3) @ frob(0,[0.0, 0.5, 0.0]) @ permute(0,1,3) @ A
P = permute(1,2,3) @ permute(0,1,3)
L = ifrob(0,permute(1,2,3) @ [0.0, 0.5, 0.0])
print(f"P A = \n{P @ A}\n")
print(f"L U = \n{L @ U}\n")
print(f"P = \n{P}\n")
print(f"L = \n{L}\n")
print(f"U = \n{U}")
Note: Some books and Python libraries instead use the decomposition A = P @ L @ U
. The P
of this decomposition is the transpose of the P
above.