The condition number of linear systems

The condition number of linear systems#

The condition number reflects how sensitive a function evaluation f(x+Δx) is to changes in Δx. It doesn’t depend on the numerical method f~ and shows how hard it is to evaluate f numerically, no matter which method is used.

With this in mind, we now compute the condition number for linear systems of equations Ay=b with ARn×n and bRn. We assume that A is non-singular and b0. We use to denote both the vector norm on Rn and the induced matrix norm on Rn×n.

We consider the following two problems:

  1. Given a fixed A, what is the local relative condition number κ1(b) of f1:RnRn, defined by bA1b?

  2. Given a fixed b, what is the local relative condition number κ2(A) of f2:Rn×nRn, defined by AA1b?

Conveniently, both questions can be answered with the same calculation. Let ΔA and Δb be perturbations in A and b, and let Δy satisfy

(A+ΔA)(y+Δy)=b+Δb.

Then

Δy=(A+ΔA)1(ΔAy+Δb)=[A(I+A1ΔA)]1(ΔAy+Δb)=(I+A1ΔA)1A1(ΔAy+Δb)..

To continue, we use the following lemma, whose proof is given in the optional materials. It is a direct generalisation of a well-known result for scalars (i.e. n=1 and the norm is the modulus) to the matrix setting.

Lemma 1 (Matrix-valued von Neumann series)

Let be a submultiplicative matrix norm, and let XRn×n with X<1. Then IX is invertible and

(IX)1=i=0Xi

with

(IX)111X.

Assuming that ΔA is sufficiently small so that A1ΔA<1, we conclude from the lemma that

ΔyyA11A1ΔA(ΔA+Δby)A1A1A1ΔA(ΔAA+Δbb),

where in the last step we used that b=AyAy. To answer the first question above, let ΔA=0. Then AΔy=Δb, and thus

κ1(b)=limδ0supb,b+ΔbRn0<Δbδf1(b+Δb)f1(b)Rn/f1(b)RnΔbRn/bRn=limδ0supb,ΔbRn0<ΔbδΔyRn/yRnΔbRn/bRnlimδ0supb,ΔbRn0<ΔbδA1A1A1ΔAA1A.

Similarly, to answer the second question, let Δb=0. Then (A+ΔA)Δy=ΔAy, and so

κ2(A)=limδ0supA,A+ΔARn×n0<ΔAδf2(A+ΔA)f2(A)Rn×n/f2(A)Rn×nΔARn×n/ARn×n=limδ0supA,A+ΔARn×n0<ΔAδΔyRn×n/yRn×nΔARn×n/ARn×nA1A.

Therefore, both condition numbers AA1b and bA1b are bounded above by A1A. This bound is sharp, as recorded by the following fact.

Fact: Condition number of linear systems

Assume that ARn×n is non-singular and bRn is non-zero. Then

κ1(b)=κ2(A)=A1A.

The condition number κrel(A) has an important interpretation. It measures how close the linear system is to being singular. More precisely,

min{ΔA2A2:A+ΔA singular}=1κrel(A).

We do not prove this result here.

Python skills#

You can compute the condition number of a matrix using the numpy.linalg.cond function. Here are a few examples:

import numpy as np

# Example 1: 2x2 matrix
A = np.array([[1, 1], [0, 0.01]])
cond_number = np.linalg.cond(A)
print(f"Condition number of A: {cond_number}")

# Example 2: 3x3 matrix
B = np.array([[1, 2, 3], [0, 5, 6], [7, 0, 9]])
cond_number_B = np.linalg.cond(B)
print(f"Condition number of B: {cond_number_B}")

# Example 3: Using different norms
C = np.array([[2, 3], [-1, 1]])
cond_number_C_1 = np.linalg.cond(C, 1)  # 1-norm
cond_number_C_inf = np.linalg.cond(C, np.inf)  # Infinity norm
cond_number_C_2 = np.linalg.cond(C, 2)  # 2-norm
print(f"Condition number of C (1-norm): {cond_number_C_1}")
print(f"Condition number of C (Infinity norm): {cond_number_C_inf}")
print(f"Condition number of C (2-norm): {cond_number_C_2}")

Self-check questions#

Question

Compute the 1norm condition number of

A=(110ϵ)

as a function of ϵ. What happens as ϵ0?

Answer

We have

A1=(1ϵ10ϵ1).

Hence,

κ(A)=A1A11=(1+ϵ)2ϵ.

and κ(A) as ϵ0. The reason is simple: For ϵ=0 the matrix is not invertible and we expect the condition number to become unbounded as we reach this limit case.

Question (Sharp condition number)

Let

A=(518519).

Compute local relative condition number κ1 and find vectors b, Δb such that A(x+Δx)=b+Δb and

Δx1x1=κ1Δb1b1.

Here 1 denotes the 1-norm. Normalise your vectors so that x1=1 and Δb1=0.01.

Answer

We have A1=Acol=37 and

A1=(19518511)A11=A1col=245andκ1=17735.

To satisfy b1=A1x1, we could take x=(0,1) which gives b=(18,19). And to satisfy Δx1=A11Δb1, we could take Δb=(0.01,0). By construction,

Δx1x1=κ1Δb1b1.

Question (Sharp condition number)

Repeat for

A=(487532339),A1=(117817783723417785784323413923914117).
Answer

We have A1=Acol=18 and A11=108234=613, so κ1(A)=108/13. In the derivation of the error formula, there are only two inequalities: Δx1A11Δb1 and b1A1x1. We therefore choose x such that b1=A1x1, namely x=(0,0,1) which gives b=(7,2,9), and Δb such that Δx1=A11Δb1, e.g. Δb=(0,0,0.01).

Question

Let

A=(2112).

Compute κ1(A), κ(A), κ2(A), which denote here the condition numbers with respect to the 1-, - and 2-norm. Which is largest?

Answer

First compute A1:

A1=13(2112).

For the 1-norm and -norm:

A1=A=maxrow or col(2+1)=3.

Similarly,

A11=A1=1.

Thus,

κ1(A)=κ(A)=3.

For the 2-norm, the eigenvalues of A are 3 and 1, so

A2=3,A12=11=1,

hence

κ2(A)=3.

All three condition numbers coincide and equal 3 in this case.

Question

Consider

A=(112121).

Its eigenvalues are λmax=1.5 and λmin=0.5. Suppose we add a small perturbation ΔA and consider no perturbation in b (i.e. Δb=0). Estimate, using the 2-norm condition number, how large the relative perturbation ΔA2A2 can be before A+ΔA becomes close to singular.

Answer

Since A2=1.5 and A12=1/λmin=2, the condition number is

κ2(A)=A2A12=1.52=3.

Recall

min{ΔA2A2:A+ΔA singular}=1κ2(A)=13.

Hence, a relative perturbation of about 1/3 in norm is sufficient for A+ΔA to become close to singular. Since A2=1.5, this corresponds to ΔA2=0.5.

Optional material#

Proof of the convergence lemma for the matrix-valued von Neumann series

Here is the proof of the convergence lemma for matrix-valued von Neumann series.

Proof. Let Sn=i=0nXi. By norm equivalence and submultiplicativity we have for each matrix element (Sn),t that

|(Sn),t|i=0n|(Xi),t|i=0nmax,t|(Xi),t|this is a normCi=0XiCi=0Xi=C1X.

for some C>0. Hence, every component of Sn is absolutely convergent and therefore the sum Sn converges with Xn0 as n. We conclude the S:=i=0Xi exists.

We find (IX)Sn=i=0nXii=1n+1Xi=IXn+1, cancelling common terms in the sums. Taking the limit as n we have (IX)S=I,so (IX) is non-singular with (IX)1=S. Finally,

(IX)1=Si=0Xi=11X.