The condition number of linear systems#
The condition number reflects how sensitive a function evaluation \(f(x + \Delta x)\) is to changes in \(\Delta x\). It doesn’t depend on the numerical method \(\tilde{f}\) and shows how hard it is to evaluate \(f\) numerically, no matter which method is used.
With this in mind, we now compute the condition number for linear systems of equations \(Ay = b\) with \(A\in\mathbb{R}^{n\times n}\) and \(b\in\mathbb{R}^n\). We assume that \(A\) is non-singular and \(b \neq 0\). We use \(\|\cdot\|\) to denote both the vector norm on \(\mathbb{R}^n\) and the induced matrix norm on \(\mathbb{R}^{n\times n}\).
We consider the following two problems:
Given a fixed \(A\), what is the local relative condition number \(\kappa_1(b)\) of \(f_1 : \mathbb{R}^n \to \mathbb{R}^n\), defined by \(b \mapsto A^{-1} b\)?
Given a fixed \(b\), what is the local relative condition number \(\kappa_2(A)\) of \(f_2 : \mathbb{R}^{n \times n} \to \mathbb{R}^n\), defined by \(A \mapsto A^{-1} b\)?
Conveniently, both questions can be answered with the same calculation. Let \(\Delta A\) and \(\Delta b\) be perturbations in \(A\) and \(b\), and let \(\Delta y\) satisfy
Then
To continue, we use the following lemma, whose proof is given in the optional materials. It is a direct generalisation of a well-known result for scalars (i.e. \(n = 1\) and the norm is the modulus) to the matrix setting.
Lemma 1 (Matrix-valued von Neumann series)
Let \(\|\cdot\|\) be a submultiplicative matrix norm, and let \(X\in\mathbb{R}^{n\times n}\) with \(\|X\| < 1\). Then \(I - X\) is invertible and
with
Assuming that \(\Delta A\) is sufficiently small so that \(\|A^{-1}\Delta A\| < 1\), we conclude from the lemma that
where in the last step we used that \(\|b\| = \|Ay\| \leq \|A\|\cdot\|y\|\). To answer the first question above, let \(\Delta A = 0\). Then \(A\Delta y = \Delta b\), and thus
Similarly, to answer the second question, let \(\Delta b = 0\). Then \((A + \Delta A)\Delta y = \Delta A y\), and so
Therefore, both condition numbers \(A \mapsto A^{-1}b\) and \(b \mapsto A^{-1}b\) are bounded above by \(\|A^{-1}\|\cdot \|A\|\). This bound is sharp, as recorded by the following fact.
Fact: Condition number of linear systems
Assume that \(A\in\mathbb{R}^{n\times n}\) is non-singular and \(b\in\mathbb{R}^n\) is non-zero. Then
The condition number \(\kappa_{rel}(A)\) has an important interpretation. It measures how close the linear system is to being singular. More precisely,
We do not prove this result here.
Python skills#
You can compute the condition number of a matrix using the numpy.linalg.cond
function. Here are a few examples:
import numpy as np
# Example 1: 2x2 matrix
A = np.array([[1, 1], [0, 0.01]])
cond_number = np.linalg.cond(A)
print(f"Condition number of A: {cond_number}")
# Example 2: 3x3 matrix
B = np.array([[1, 2, 3], [0, 5, 6], [7, 0, 9]])
cond_number_B = np.linalg.cond(B)
print(f"Condition number of B: {cond_number_B}")
# Example 3: Using different norms
C = np.array([[2, 3], [-1, 1]])
cond_number_C_1 = np.linalg.cond(C, 1) # 1-norm
cond_number_C_inf = np.linalg.cond(C, np.inf) # Infinity norm
cond_number_C_2 = np.linalg.cond(C, 2) # 2-norm
print(f"Condition number of C (1-norm): {cond_number_C_1}")
print(f"Condition number of C (Infinity norm): {cond_number_C_inf}")
print(f"Condition number of C (2-norm): {cond_number_C_2}")
Self-check questions#
Question
Compute the \(1-\)norm condition number of
as a function of \(\epsilon\). What happens as \(\epsilon\rightarrow 0\)?
Answer
We have
Hence,
and \(\kappa(A)\rightarrow\infty\) as \(\epsilon\rightarrow 0\). The reason is simple: For \(\epsilon=0\) the matrix is not invertible and we expect the condition number to become unbounded as we reach this limit case.
Question (Sharp condition number)
Let
Compute local relative condition number \(\kappa_1\) and find vectors \(b\), \(\Delta b\) such that \(A(x+\Delta x)=b+\Delta b\) and
Here \(\| \cdot \|_1^{}\) denotes the \(1\)-norm. Normalise your vectors so that \(\|x\|_1^{}=1\) and \(\|\Delta b\|_1^{}=0.01\).
Answer
We have \(\|A\|_1^{}=\|A\|_\textrm{col}^{}=37\) and
To satisfy \(\|b\|_1^{}=\|A\|_1^{}\|x\|_1^{}\), we could take \(x=(0,1)^\top\) which gives \(b=(-18,19)^\top\). And to satisfy \(\| \Delta x\|_1^{}=\|A^{-1}\|_1^{}\|\Delta b\|_1^{}\), we could take \(\Delta b=(0.01,0)^\top\). By construction,
Question (Sharp condition number)
Repeat for
Answer
We have \(\|A\|_1^{}=\|A\|_\textrm{col}=18\) and \(\|A^{-1}\|_1^{} = \frac{108}{234} = \frac6{13}\), so \(\kappa_1(A) = 108/13\). In the derivation of the error formula, there are only two inequalities: \(\|\Delta x\|_1^{}\le\|A^{-1}\|_1^{}\|\Delta b\|_1^{}\) and \(\|b\|_1^{}\le \|A\|_1^{}\,\|x\|_1^{}\). We therefore choose \(x\) such that \(\|b\|_1^{}=\|A\|_1^{}\,\|x\|_1^{}\), namely \(x=(0,0,1)^\top\) which gives \(b=(-7,2,-9)^\top\), and \(\Delta b\) such that \(\|\Delta x\|_1^{}=\|A^{-1}\|_1^{}\|\Delta b\|_1^{}\), e.g. \(\Delta b=(0,0,0.01)^\top\).
Question
Let
Compute \(\kappa_1(A)\), \(\kappa_\infty(A)\), \(\kappa_2(A)\), which denote here the condition numbers with respect to the \(1\)-, \(\infty\)- and \(2\)-norm. Which is largest?
Answer
First compute \(A^{-1}\):
For the \(1\)-norm and \(\infty\)-norm:
Similarly,
Thus,
For the \(2\)-norm, the eigenvalues of \(A\) are \(3\) and \(1\), so
hence
All three condition numbers coincide and equal \(3\) in this case.
Question
Consider
Its eigenvalues are \(\lambda_{\max}=1.5\) and \(\lambda_{\min}=0.5\). Suppose we add a small perturbation \(\Delta A\) and consider no perturbation in \(b\) (i.e. \(\Delta b=0\)). Estimate, using the \(2\)-norm condition number, how large the relative perturbation \(\frac{\|\Delta A\|_2}{\|A\|_2}\) can be before \(A+\Delta A\) becomes close to singular.
Answer
Since \(\|A\|_2 = 1.5\) and \(\|A^{-1}\|_2 = 1/\lambda_{\min} = 2\), the condition number is
Recall
Hence, a relative perturbation of about \(1/3\) in norm is sufficient for \(A+\Delta A\) to become close to singular. Since \(\|A\|_2=1.5\), this corresponds to \(\|\Delta A\|_2 = 0.5\).
Optional material#
Proof of the convergence lemma for the matrix-valued von Neumann series
Here is the proof of the convergence lemma for matrix-valued von Neumann series.
Proof. Let \(S_n = \sum_{i=0}^n X^i\). By norm equivalence and submultiplicativity we have for each matrix element \((S_n)_{\ell, t}\) that
for some \(C> 0\). Hence, every component of \(S_n\) is absolutely convergent and therefore the sum \(S_n\) converges with \(X^{n}\rightarrow 0\) as \(n\rightarrow\infty\). We conclude the \(S := \sum_{i=0}^{\infty}X^{i}\) exists.
We find \((I - X)S_n = \sum_{i = 0}^n X^i - \sum_{i = 1}^{n + 1} X^i = I - X^{n+1}\), cancelling common terms in the sums. Taking the limit as \(n \to \infty\) we have \((I-X)S = I\),so \((I-X)\) is non-singular with \((I-X)^{-1} = S\). Finally,