The condition number of linear systems

The condition number of linear systems#

The condition number reflects how sensitive a function evaluation \(f(x + \Delta x)\) is to changes in \(\Delta x\). It doesn’t depend on the numerical method \(\tilde{f}\) and shows how hard it is to evaluate \(f\) numerically, no matter which method is used.

With this in mind, we now compute the condition number for linear systems of equations \(Ay = b\) with \(A\in\mathbb{R}^{n\times n}\) and \(b\in\mathbb{R}^n\). We assume that \(A\) is non-singular and \(b \neq 0\). We use \(\|\cdot\|\) to denote both the vector norm on \(\mathbb{R}^n\) and the induced matrix norm on \(\mathbb{R}^{n\times n}\).

We consider the following two problems:

Given a fixed \(A\), what is the local relative condition number \(\kappa_1(b)\) of \(f_1 : \mathbb{R}^n \to \mathbb{R}^n\), defined by \(b \mapsto A^{-1} b\)?
Given a fixed \(b\), what is the local relative condition number \(\kappa_2(A)\) of \(f_2 : \mathbb{R}^{n \times n} \to \mathbb{R}^n\), defined by \(A \mapsto A^{-1} b\)?

Conveniently, both questions can be answered with the same calculation. Let \(\Delta A\) and \(\Delta b\) be perturbations in \(A\) and \(b\), and let \(\Delta y\) satisfy

\[ (A + \Delta A)(y + \Delta y) = b + \Delta b. \]

Then

\[\begin{split} \begin{aligned} \Delta y &= (A + \Delta A)^{-1}(-\Delta A y + \Delta b)\\ &= \left[A(I + A^{-1}\Delta A)\right]^{-1}(-\Delta A y + \Delta b)\\ &= \left(I + A^{-1}\Delta A\right)^{-1}A^{-1}(-\Delta Ay+\Delta b). \end{aligned}. \end{split}\]

To continue, we use the following lemma, whose proof is given in the optional materials. It is a direct generalisation of a well-known result for scalars (i.e. \(n = 1\) and the norm is the modulus) to the matrix setting.

Lemma 1 (Matrix-valued von Neumann series)

Let \(\|\cdot\|\) be a submultiplicative matrix norm, and let \(X\in\mathbb{R}^{n\times n}\) with \(\|X\| < 1\). Then \(I - X\) is invertible and

\[ (I - X)^{-1} = \sum_{i=0}^{\infty}X^i \]

with

\[ \|(I - X)^{-1}\| \leq \frac{1}{1-\|X\|}. \]

Assuming that \(\Delta A\) is sufficiently small so that \(\|A^{-1}\Delta A\| < 1\), we conclude from the lemma that

\[\begin{split} \begin{align*} \frac{\|\Delta y\|}{\|y\|} & \leq \frac{\|A^{-1}\|}{1-\|A^{-1}\|\cdot \|\Delta A\|}\left(\|\Delta A\| + \frac{\|\Delta b\|}{\|y\|}\right)\\ &\leq \frac{\|A^{-1}\|\cdot \|A\|}{1-\|A^{-1}\| \cdot \| \Delta A\|}\left(\frac{\|\Delta A\|}{\|A\|} + \frac{\|\Delta b\|}{\|b\|}\right), \end{align*} \end{split}\]

where in the last step we used that \(\|b\| = \|Ay\| \leq \|A\|\cdot\|y\|\). To answer the first question above, let \(\Delta A = 0\). Then \(A\Delta y = \Delta b\), and thus

\[\begin{split} \begin{align*} \kappa_1(b) = & \lim_{\delta\rightarrow 0} \sup_{b,b+\Delta b \in \mathbb{R}^n \atop 0 < \| \Delta b \| \leq \delta} \frac{\| f_1(b+\Delta b) - f_1(b) \|_{\mathbb{R}^n} / \| f_1(b) \|_{\mathbb{R}^n}}{\| \Delta b \|_{\mathbb{R}^n} / \| b \|_{\mathbb{R}^n}} = \lim_{\delta\rightarrow 0} \sup_{b,\Delta b \in \mathbb{R}^n \atop 0 < \| \Delta b \| \leq \delta} \frac{\| \Delta y \|_{\mathbb{R}^n} / \| y \|_{\mathbb{R}^n}}{\| \Delta b \|_{\mathbb{R}^n} / \| b \|_{\mathbb{R}^n}}\\ \leq & \lim_{\delta\rightarrow 0} \sup_{b,\Delta b \in \mathbb{R}^n \atop 0 < \| \Delta b \| \leq \delta} \frac{\|A^{-1}\|\cdot \|A\|}{1-\|A^{-1}\| \cdot \| \Delta A\|} \leq \|A^{-1}\|\cdot \|A\|. \end{align*} \end{split}\]

Similarly, to answer the second question, let \(\Delta b = 0\). Then \((A + \Delta A)\Delta y = \Delta A y\), and so

\[\begin{split} \begin{align*} \kappa_2(A) = & \lim_{\delta\rightarrow 0} \sup_{A,A+\Delta A \in \mathbb{R}^{n \times n} \atop 0 < \| \Delta A \| \leq \delta} \frac{\| f_2(A+\Delta A) - f_2(A) \|_{\mathbb{R}^{n \times n}} / \| f_2(A) \|_{\mathbb{R}^{n \times n}}}{\| \Delta A \|_{\mathbb{R}^{n \times n}} / \| A \|_{\mathbb{R}^{n \times n}}} \\ = & \lim_{\delta\rightarrow 0} \sup_{A,A+\Delta A \in \mathbb{R}^{n \times n} \atop 0 < \| \Delta A \| \leq \delta} \frac{\| \Delta y \|_{\mathbb{R}^{n \times n}} / \| y \|_{\mathbb{R}^{n \times n}}}{\| \Delta A \|_{\mathbb{R}^{n \times n}} / \| A \|_{\mathbb{R}^{n \times n}}} \leq \|A^{-1}\|\cdot \|A\|. \end{align*} \end{split}\]

Therefore, both condition numbers \(A \mapsto A^{-1}b\) and \(b \mapsto A^{-1}b\) are bounded above by \(\|A^{-1}\|\cdot \|A\|\). This bound is sharp, as recorded by the following fact.

Fact: Condition number of linear systems

Assume that \(A\in\mathbb{R}^{n\times n}\) is non-singular and \(b\in\mathbb{R}^n\) is non-zero. Then

\[ \kappa_1(b) = \kappa_2(A) = \|A^{-1}\|\cdot \|A\|. \]

The condition number \(\kappa_{rel}(A)\) has an important interpretation. It measures how close the linear system is to being singular. More precisely,

\[ \min\left\{\frac{\|\Delta A\|_2}{\|A\|_2}: A + \Delta A \text{ singular} \right\} = \frac{1}{\kappa_{rel}(A)}. \]

We do not prove this result here.

Python skills#

You can compute the condition number of a matrix using the numpy.linalg.cond function. Here are a few examples:

import numpy as np

# Example 1: 2x2 matrix
A = np.array([[1, 1], [0, 0.01]])
cond_number = np.linalg.cond(A)
print(f"Condition number of A: {cond_number}")

# Example 2: 3x3 matrix
B = np.array([[1, 2, 3], [0, 5, 6], [7, 0, 9]])
cond_number_B = np.linalg.cond(B)
print(f"Condition number of B: {cond_number_B}")

# Example 3: Using different norms
C = np.array([[2, 3], [-1, 1]])
cond_number_C_1 = np.linalg.cond(C, 1)  # 1-norm
cond_number_C_inf = np.linalg.cond(C, np.inf)  # Infinity norm
cond_number_C_2 = np.linalg.cond(C, 2)  # 2-norm
print(f"Condition number of C (1-norm): {cond_number_C_1}")
print(f"Condition number of C (Infinity norm): {cond_number_C_inf}")
print(f"Condition number of C (2-norm): {cond_number_C_2}")

Self-check questions#

Question

Compute the \(1-\)norm condition number of

\[\begin{split} A = \begin{pmatrix}1 & 1\\ 0 & \epsilon\end{pmatrix} \end{split}\]

as a function of \(\epsilon\). What happens as \(\epsilon\rightarrow 0\)?

Question (Sharp condition number)

Let

\[\begin{split} A = \left(\begin{array}{rr} 5 &-18\\ -5 &19\end{array}\right). \end{split}\]

Compute local relative condition number \(\kappa_1\) and find vectors \(b\), \(\Delta b\) such that \(A(x+\Delta x)=b+\Delta b\) and

\[ \frac{\|\Delta x\|_1^{}}{\|x\|_1^{}} = \kappa_1 \frac{\|\Delta b\|_1^{}}{\|b\|_1^{}}. \]

Here \(\| \cdot \|_1^{}\) denotes the \(1\)-norm. Normalise your vectors so that \(\|x\|_1^{}=1\) and \(\|\Delta b\|_1^{}=0.01\).

Question (Sharp condition number)

Repeat for

\[\begin{split} A = \left(\begin{array}{rrr} -4 & 8 & -7\\ -5 & 3 & 2\\ -3 & 3 & -9\end{array}\right), \qquad A^{-1} = \left( \begin{array}{rrr} \frac{11}{78} & -\frac{17}{78} & -\frac{37}{234} \\ \frac{17}{78} & -\frac{5}{78} & -\frac{43}{234} \\ \frac{1}{39} & \frac{2}{39} & -\frac{14}{117} \\ \end{array} \right). \end{split}\]

Question

Let

\[\begin{split} A = \begin{pmatrix}2 & 1 \\ 1 & 2\end{pmatrix}. \end{split}\]

Compute \(\kappa_1(A)\), \(\kappa_\infty(A)\), \(\kappa_2(A)\), which denote here the condition numbers with respect to the \(1\)-, \(\infty\)- and \(2\)-norm. Which is largest?

Question

Consider

\[\begin{split} A = \begin{pmatrix}1 & \tfrac{1}{2} \\[6pt] \tfrac{1}{2} & 1\end{pmatrix}. \end{split}\]

Its eigenvalues are \(\lambda_{\max}=1.5\) and \(\lambda_{\min}=0.5\). Suppose we add a small perturbation \(\Delta A\) and consider no perturbation in \(b\) (i.e. \(\Delta b=0\)). Estimate, using the \(2\)-norm condition number, how large the relative perturbation \(\frac{\|\Delta A\|_2}{\|A\|_2}\) can be before \(A+\Delta A\) becomes close to singular.

The condition number of linear systems

Contents

The condition number of linear systems#

Python skills#

Self-check questions#

Optional material#