Matrix and vector norms#

Norm axioms#

Let \(V\) be a real vector space, \(\mathbb{R}^{+} := [0, \infty)\), and \(\rho: V\rightarrow \mathbb{R}^{+}\) a mapping from \(V\) into the nonnegative real numbers. \(\rho\) is called a norm if it satisfies the following properties:

  1. \(\rho(\alpha x) = |\alpha| \rho(x)~\forall \alpha\in\mathbb{R};~x\in V\)

  2. \(\rho(x + y) \leq \rho(x) + \rho(y)~\forall x, y\in V\) (triangle inequality)

  3. \(\rho(x) = 0\) if and only if \(x = 0\).

From these axioms, it immediately follows that

\[ 0 = \rho(0) = \rho(x - x) \leq 2\rho(x) \]

and therefore \(\rho(x) \geq 0\) for all \(x\in V\) (the first equality uses axiom 3, and the inequality uses axiom 2 together with axiom 1 for \(\alpha = -1\)). Hence, a norm is always nonnegative.

The first and third norm axioms are usually easy to verify. The difficult one is typically the second, the triangle inequality.

Frequently used vector norms#

Let \(V\) be the real vector space \(\mathbb{R}^n\). Let \(x\in V\) and denote by \(x_j\) the \(j\)-th component of \(x\).

The most frequently used vector norms are:

  • The \(1\)-norm \(\|x\|_1 = \sum_{j=1}^n|x_j|\).

  • The \(2\)-norm \(\|x\|_2 = \left[\sum_{j=1}^n|x_j|^2\right]^{1/2}\).

  • The \(\infty\)-norm \(\|x\|_{\infty} = \max_j |x_j|\).

These are all special cases of the \(\ell_p\)-norm defined as

\[ \|x\|_p = \left[\sum_{j=1}^n|x_j|^p\right]^{1/p} \]

in the sense that the \(1\)-norm and \(2\)-norm correspond to \(p = 1\) and \(p = 2\), while the \(\infty\)-norm arises as the limit \(\|x\|_{\infty} = \lim_{p \to \infty} \|x\|_p\) for all \(x\).
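As a quick numerical illustration (a minimal sketch; the vector `x` and the chosen values of `p` are just example data, and `numpy.linalg.norm` is assumed available), the common norms can be evaluated directly, and one can watch \(\|x\|_p\) approach \(\|x\|_{\infty}\) as \(p\) grows:

```python
import numpy as np

x = np.array([3.0, -4.0, 1.0])  # example vector

# The three most frequently used vector norms
print(np.linalg.norm(x, 1))       # 1-norm: |3| + |-4| + |1| = 8
print(np.linalg.norm(x, 2))       # 2-norm: sqrt(9 + 16 + 1) ~ 5.099
print(np.linalg.norm(x, np.inf))  # infinity-norm: max(|3|, |-4|, |1|) = 4

# The p-norm approaches the infinity-norm as p grows
for p in [1, 2, 4, 8, 16, 64]:
    print(p, np.linalg.norm(x, p))
```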

Norm equivalence#

Which norm should we use? This usually depends on the application. A New York taxi driver will prefer the \(1\)-norm when calculating distances, since they have to follow the street grid and cannot drive in a straight line to their destination.

In Numerical Analysis, we often want to prove whether a given numerical approximation converges. We consider several computations indexed by \(n\) and have a corresponding error quantity \(e_n\). We want to show that the sequence of errors \(\left\{e_n\right\}\) converges to \(0\) as \(n \to \infty\). Can it happen that an error converges to \(0\) in one norm but not in another norm? In finite dimensional vector spaces, the answer is no. If a sequence converges in one norm, it also converges in all others. This goes back to the following norm equivalence theorem.

Norm equivalence in finite dimensional vector spaces#

Let \(V\) be a finite-dimensional vector space. Consider two norms \(\|\cdot\|_a\) and \(\|\cdot\|_b\) on \(V\). Then, there exist positive constants \(C_1\), \(C_2\) such that for all \(x\in V\)

\[ C_1\|x\|_a\leq \|x\|_b\leq C_2\|x\|_a \]

Proof: The proof requires some analysis and is not part of the lecture. A good reference for those interested in how it works is the following note by Steve Johnson at MIT.

We note that norm equivalence is a property only of finite-dimensional vector spaces. For infinite-dimensional spaces (e.g., spaces of functions), it is easy to construct examples of sequences that converge with respect to one norm but not another.
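As a small sanity check of norm equivalence (a sketch, not part of the proof; the random vectors and the seed are arbitrary example data), one can verify numerically that on \(\mathbb{R}^n\) the \(2\)-norm and \(\infty\)-norm are equivalent with the well-known constants \(C_1 = 1\) and \(C_2 = \sqrt{n}\):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10

# For this pair of norms the equivalence constants are known explicitly:
# ||x||_inf <= ||x||_2 <= sqrt(n) * ||x||_inf for all x in R^n.
for _ in range(5):
    x = rng.standard_normal(n)
    ninf = np.linalg.norm(x, np.inf)
    n2 = np.linalg.norm(x, 2)
    print(ninf <= n2 <= np.sqrt(n) * ninf)  # always True
```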

Matrix norms#

Let \(A\in\mathbb{R}^{m\times n}\). Consider vector norms \(\|x\|_a\) for \(x\in\mathbb{R}^n\) and \(\|y\|_b\) for \(y\in\mathbb{R}^{m}\). Then, an induced matrix norm can be defined as follows:

\[ \|A\|_{b, a}:= \max_{x\in\mathbb{R}^n\backslash\{0\}}\frac{\|Ax\|_b}{\|x\|_a} \]

If \(a = b\) then we abbreviate \(\|A\|_a := \|A\|_{a, a}\).
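To illustrate the definition (a rough sketch; the example matrix and the random sampling are only a crude way to probe the maximum and in general give a lower bound), one can compare the sampled ratios \(\|Ax\|_2/\|x\|_2\) with the exact induced \(2\)-norm:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4))  # example matrix

# Probe the ratio ||Ax||_2 / ||x||_2 over many random nonzero vectors x.
# Sampling only gives a lower bound on the maximum; with the 2-norm on
# both spaces, the exact induced norm is np.linalg.norm(A, 2).
best = 0.0
for _ in range(10000):
    x = rng.standard_normal(4)
    best = max(best, np.linalg.norm(A @ x) / np.linalg.norm(x))

print(best, np.linalg.norm(A, 2))  # best <= induced 2-norm
```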

The following three matrix norms derive from the corresponding vector norms.

  • \(\|A\|_1 :=\max_{1\leq j\leq n} \sum_{i=1}^m|a_{ij}|\) (maximum absolute column sum)

  • \(\|A\|_2 :=\sqrt{\lambda_{\max}(A^TA)}\), where \(\lambda_\max\) denotes the largest eigenvalue.

  • \(\|A\|_{\infty} := \max_{1\leq i\leq m}\sum_{j=1}^n|a_{ij}|\) (maximum absolute row sum)
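As a quick check (a sketch with an arbitrary example matrix; `numpy.linalg.norm` is assumed available), these formulas can be compared against NumPy's built-in matrix norms:

```python
import numpy as np

A = np.array([[1.0, -2.0, 3.0],
              [4.0,  0.0, -1.0]])  # example 2x3 matrix

# 1-norm: maximum absolute column sum
print(np.abs(A).sum(axis=0).max(), np.linalg.norm(A, 1))

# infinity-norm: maximum absolute row sum
print(np.abs(A).sum(axis=1).max(), np.linalg.norm(A, np.inf))

# 2-norm: square root of the largest eigenvalue of A^T A
lam_max = np.linalg.eigvalsh(A.T @ A).max()
print(np.sqrt(lam_max), np.linalg.norm(A, 2))
```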

The \(\|A\|_2\) matrix norm is difficult to evaluate as it requires computation of the largest eigenvalue of \(A^TA\), which is an expensive operation. In practice, the following Frobenius norm is therefore frequently used instead of the matrix \(2\)-norm:

\[ \|A\|_F:=\left(\sum_{ij}|a_{ij}|^2\right)^{1/2}. \]

We will later show that

\[ \|A\|_2\leq \|A\|_F\leq\sqrt{r}\|A\|_2, \]

where \(r\) denotes the rank of \(A\).
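A quick numerical check of this inequality (a sketch; the random matrix and the seed are just example data):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))  # example matrix, rank 3 almost surely
r = np.linalg.matrix_rank(A)

norm2 = np.linalg.norm(A, 2)
normF = np.linalg.norm(A, 'fro')

# ||A||_2 <= ||A||_F <= sqrt(r) * ||A||_2
print(norm2 <= normF <= np.sqrt(r) * norm2)  # True
```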

Submultiplicativity#

Let \(A\in\mathbb{R}^{m\times n}\) and \(B\in\mathbb{R}^{n\times k}\) with corresponding vector-induced matrix norm \(\|\cdot\|\). For any \(v\in\mathbb{R}^k\) it holds that

\[ \|ABv\|\leq \|A\|\cdot\|Bv\|\leq \|A\|\cdot \|B\| \cdot \|v\|. \]

and therefore, dividing by \(\|v\|\) and taking the maximum over all \(v\neq 0\),

\[ \|A\cdot B\|\leq \|A\|\cdot \|B\|. \]

This property is called submultiplicativity. We have just proved it for any vector-induced matrix norm. But what about the Frobenius norm?

First of all, since \(\|A\|_2\leq \|A\|_F\) we have that

\[ \|Ax\|_2 \leq \|A\|_2\|x\|_2 \leq \|A\|_F\|x\|_2. \]

We can now prove the following statement.

Submultiplicativity of the Frobenius norm#

For matrices \(A\) and \(B\) with compatible dimensions such that the product \(AB\) exists it holds that

\[ \|AB\|_F\leq \|A\|_F\cdot \|B\|_F. \]

Proof: We denote by \(B_{:, j}\) the \(j\)-th column of the matrix \(B\). We can now write

\[ \|AB\|_F^2 = \sum_{j}\|AB_{:, j}\|_2^2\leq \|A\|_F^2\sum_{j}\|B_{:, j}\|_2^2 = \|A\|_F^2\cdot \|B\|_F^2, \]

from which the result follows.
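A quick numerical sanity check of this bound (a sketch; the random matrices and the seed are arbitrary example data):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 6))
B = rng.standard_normal((6, 3))

lhs = np.linalg.norm(A @ B, 'fro')
rhs = np.linalg.norm(A, 'fro') * np.linalg.norm(B, 'fro')
print(lhs <= rhs, lhs, rhs)  # the bound holds
```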