Basic properties of eigenvalue problems#

Eigenvalues are often complex, making it essential to consider eigenvalue problems within the setting of complex vector spaces.

Definition 5

Consider a matrix \(A\) in \(\mathbb{C}^{n \times n}\). An eigenvalue problem is the task of finding a nonzero vector \(x \in \mathbb{C}^n\), called an eigenvector, and a scalar \(\lambda \in \mathbb{C}\), called an eigenvalue, such that:

\[ A x = \lambda x. \]

The following section briefly outlines key distinctions in handling complex spaces.

Complex vector spaces#

For complex spaces, the Euclidean inner product is defined as:

\[ \langle x, y \rangle = x^T \, \overline{y} = \sum_j x_j \, \overline{y}_j. \]

Note that the complex conjugate is taken of the second argument \(y\). The 2-norm of a vector is \(\|x\|_2 = \langle x, x \rangle^{1/2}\).

In complex spaces, the counterpart to transposed matrices is the conjugate transpose \(A^H := \bar{A}^T\). With this, the following relation holds:

\[ \langle Ax, y \rangle = \langle x, A^H y \rangle. \]
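A quick numerical check of this relation (a minimal sketch; the helper function inner implements the inner product convention defined above):

import numpy as np

rng = np.random.default_rng(0)

# Inner product following the convention above: <x, y> = sum_j x_j * conj(y_j)
def inner(x, y):
    return np.sum(x * np.conj(y))

# Random complex matrix and vectors
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# <Ax, y> equals <x, A^H y> up to rounding errors
print(np.isclose(inner(A @ x, y), inner(x, A.conj().T @ y)))  # True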

Definition 6 (Hermitian matrix)

A matrix \(A \in \mathbb{C}^{n \times n}\) is Hermitian (or self-adjoint) if \(A^H = A\).

Consequently, the diagonal elements of a Hermitian matrix are real-valued. Reflecting the altered inner product, the concept of an orthogonal matrix generalises in complex spaces to that of a unitary matrix.

Definition 7 (Unitary matrix)

A matrix \(Q \in \mathbb{C}^{n \times n}\) is unitary if \(Q^H Q = I\).

All unitary matrices satisfy \(|\det(Q)| = 1\).

Characteristic polynomials, algebraic and geometric multiplicities#

Rewriting the eigenvalue problem, we get:

\[ (\lambda I - A)x = 0. \]

Nontrivial solutions \(x\) exist if and only if \(\lambda I - A\) is singular, which is equivalent to:

\[ \det(\lambda I - A) = 0. \]

Definition 8

Let \(A \in \mathbb{C}^{n \times n}\). We define the characteristic polynomial of \(A\) as

\[ p(\lambda) := \det(\lambda I - A). \]

Expanding this determinant reveals that \(p\) is a polynomial of degree \(n\). Let \(\lambda_1, \dots, \lambda_k\) denote the distinct roots of \(p\). Then:

\[ p(\lambda) = \prod_j (\lambda - \lambda_j)^{\alpha_j}, \]

where \(\alpha_j\) is the algebraic multiplicity of the \(j\)th root. According to the fundamental theorem of algebra, \(n = \sum_j \alpha_j\), meaning an \(n \times n\) matrix has \(n\) eigenvalues, counting multiplicities.

The geometric multiplicity \(\beta_j\) of an eigenvalue \(\lambda_j\) is the dimension of the kernel of \(\lambda_j I - A\). If \(\alpha_j = 1\), then \(\lambda_j\) is called a simple eigenvalue. In general, \(1 \leq \beta_j \leq \alpha_j\). The relationship between algebraic and geometric multiplicities can be further explored through the concept of Jordan normal form, which is briefly described in the optional material.
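As a small illustration (a sketch using the \(3 \times 3\) matrix that reappears in the self-check questions below), the geometric multiplicity can be computed as \(n\) minus the rank of \(\lambda_j I - A\):

import numpy as np

# Upper triangular matrix with the single eigenvalue 3 (algebraic multiplicity 3)
A = np.array([[3.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 3.0]])
lam = 3.0
n = A.shape[0]

# Geometric multiplicity: dimension of the kernel of lam * I - A
beta = n - np.linalg.matrix_rank(lam * np.eye(n) - A)
print("Geometric multiplicity:", beta)  # 1, while the algebraic multiplicity is 3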

Eigenvalue decomposition#

If for all \(j\), the algebraic and geometric multiplicities are equal (\(\alpha_j = \beta_j\)), the matrix \(A\) admits an eigenvalue decomposition

\[ A = X\Lambda X^{-1}, \]

where \(X \in \mathbb{C}^{n \times n}\) is a nonsingular matrix whose columns \(x_j\) are eigenvectors of \(A\), and \(\Lambda\) is a diagonal matrix whose diagonal entries are the corresponding eigenvalues \(\lambda_j\). Hermitian matrices have real eigenvalues and a basis of orthonormal eigenvectors, so \(X\) can be chosen unitary.
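For Hermitian matrices, NumPy provides the specialised solver np.linalg.eigh, which returns real eigenvalues and an orthonormal eigenvector matrix. A minimal sketch:

import numpy as np

# A small Hermitian matrix: A equals its conjugate transpose
A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])

# eigh exploits the Hermitian structure
eigenvalues, Q = np.linalg.eigh(A)
print("Eigenvalues (real):", eigenvalues)
print("Q unitary?", np.allclose(Q.conj().T @ Q, np.eye(2)))
print("A = Q Lambda Q^H?", np.allclose(Q @ np.diag(eigenvalues) @ Q.conj().T, A))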

The companion matrix#

The companion matrix of a polynomial \(p\) is a special square matrix whose characteristic polynomial is \(p\). Thus, the companion matrix’s eigenvalues are the polynomial’s roots. This makes companion matrices useful in numerical algebra for finding polynomial roots.

Given the polynomial

\[ p(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_2x^2 + a_1x + a_0 \]

of degree \(n\) with leading coefficient \(a_n = 1\), the companion matrix \(C\) of \(p(x)\) is an \(n \times n\) matrix defined as:

\[\begin{split} C = \begin{pmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & \cdots & 0 & -a_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a_{n-1} \end{pmatrix}. \end{split}\]

One can check that \(\det(x I - C) = p(x)\).
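A short numerical sketch of this idea, assuming the example cubic \(p(x) = x^3 - 6x^2 + 11x - 6\) with roots \(1, 2, 3\):

import numpy as np

# Coefficients a_0, a_1, a_2 of p(x) = x^3 - 6x^2 + 11x - 6 (roots 1, 2, 3)
a = np.array([-6.0, 11.0, -6.0])
n = len(a)

# Companion matrix: ones on the subdiagonal, -a in the last column
C = np.zeros((n, n))
C[1:, :-1] = np.eye(n - 1)
C[:, -1] = -a

print("Eigenvalues of C:", np.sort(np.linalg.eigvals(C)))
# Cross-check with NumPy's root finder (coefficients ordered from highest degree)
print("np.roots:", np.sort(np.roots([1.0, -6.0, 11.0, -6.0])))

In fact, np.roots itself computes polynomial roots as the eigenvalues of a companion matrix.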

Python skills#

Complex numbers#

Working with complex matrices is well-supported in Python.

import numpy as np

# Step 1: Create complex matrices A and B
A = np.array([[1 + 2j, 3 + 4j], [5 + 6j, 7 + 8j]])
B = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)

# Step 2: Matrix multiplication
product = A @ B

# Step 3: Conjugate transpose (Hermitian) of A and B
A_H = np.conjugate(A.T)
B_H = np.conjugate(B.T)

# Step 4: Check whether A or B is unitary
is_unitary_A = np.allclose(A_H @ A, np.eye(A.shape[0]))
is_unitary_B = np.allclose(B_H @ B, np.eye(B.shape[0]))

# Displaying the results
print("Matrix A:\n", A)
print("Matrix B:\n", B)
print("Product of A and B:\n", product)
print("Conjugate transpose (Hermitian) of A:\n", A_H)
print("Conjugate transpose (Hermitian) of B:\n", B_H)
print("Is Matrix A unitary?:", is_unitary_A)
print("Is Matrix B unitary?:", is_unitary_B)

Eigenvalue decomposition#

Shortly, we will delve into the construction of numerical methods for computing eigenvalues and eigenvectors. Here, we look at a simple example, which uses NumPy's eigensystem solver as a black box.

import numpy as np

# Define a complex matrix A
A = np.array([[2+1j, 4-1j], [4+1j, 3-2j]])

# Solve the eigenvalue problem
eigenvalues, eigenvectors = np.linalg.eig(A)

# Display eigenvalues and eigenvectors
print("Eigenvalues of A:", eigenvalues)
print("Eigenvectors of A:\n", eigenvectors)

# Eigenvalue decomposition (if A is diagonalizable)
if np.linalg.matrix_rank(eigenvectors) == len(A):
    Lambda = np.diag(eigenvalues)
    X = eigenvectors
    X_inv = np.linalg.inv(X)
    A_reconstructed = X @ Lambda @ X_inv
    print("Reconstructed A from its eigenvalue decomposition:\n", A_reconstructed)
else:
    print("A cannot be diagonalized with a full set of eigenvectors.")

Self-check questions#

Question

Show that all unitary matrices \(Q\) satisfy \(|\det(Q)| = 1\).

Answer

From \(Q^H Q = I\), we conclude

\[ 1 = \det(I) = \det(Q^H Q) = \det(Q^H) \det(Q) = \overline{\det(Q)} \det(Q) = |\det(Q)|^2. \]

It follows that \(|\det(Q)| = 1\).

Question

Analyze the matrix \(A\) given by:

\[\begin{split} A = \begin{bmatrix} 3 & 1 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{bmatrix}. \end{split}\]

Determine the eigenvalues and eigenvectors of \(A\).

Compute the algebraic and geometric multiplicities of each eigenvalue.

Can \(A\) be decomposed as \(A = X \Lambda X^{-1}\) where \(\Lambda\) is a diagonal matrix and \(X\) non-singular? Explain why or why not.

Answer

We solve \(\det(A - \lambda I) = 0\) to find the eigenvalues. Since \(A\) is upper triangular, \(\det(A - \lambda I) = (3 - \lambda)^3\), so the only eigenvalue is \(\lambda = 3\).

For \(\lambda = 3\), the equation \((A - \lambda I)x = 0\) simplifies to

\[\begin{split} \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}x = 0. \end{split}\]

The solution space is spanned by \((1, 0, 0)^T\).

The eigenvalue \(\lambda = 3\) appears three times on the diagonal of \(A\), so its algebraic multiplicity is 3. The solution to the eigenvector equation yields only one linearly independent eigenvector, so the geometric multiplicity of \(\lambda = 3\) is \(1\).

Since the algebraic and geometric multiplicities of the eigenvalue are not equal, \(A\) cannot be decomposed as \(A = X\Lambda X^{-1}\). For such a decomposition to exist, the matrix must have a full set of linearly independent eigenvectors.

Question

Prove each eigenvalue of a unitary matrix has modulus \(1\).

Answer

An eigenvector \(v \neq 0\) of a unitary matrix \(Q\) with eigenvalue \(\lambda\) satisfies \(Q v = \lambda v\). Taking the conjugate transpose of both sides yields \(v^H Q^H = \overline{\lambda} v^H\), since the conjugate transpose of the scalar \(\lambda\) is its complex conjugate. Multiplying both identities with each other gives

\[ (v^H Q^H) (Q v) = (\overline{\lambda} v^H) (\lambda v). \]

Using \(Q^H Q = I\), we arrive at \(\| v \|_2^2 = |\lambda|^2 \| v \|_2^2\), implying \(|\lambda| = 1\).
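This can also be checked numerically (a minimal sketch; a unitary test matrix is obtained from the QR factorisation of a random complex matrix):

import numpy as np

rng = np.random.default_rng(1)

# The Q factor of a QR factorisation is unitary
Z = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Q, _ = np.linalg.qr(Z)

# All eigenvalues of a unitary matrix lie on the unit circle
print(np.abs(np.linalg.eigvals(Q)))  # all entries close to 1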

Optional material#

Eigenvalues and ordinary differential equations

Eigenvalue problems are fundamentally important in various fields, including the study of differential equations. For instance, consider solving the ordinary differential equation (ODE):

\[ y' = Ay, \]

where \(A \in \mathbb{C}^{n \times n}\) and \(y(0) = y_0\). If \(y_0\) is an eigenvector of \(A\) with eigenvalue \(\lambda\), then the solution of the ODE is given by \(y(t) = y_0 e^{\lambda t}\):

\[ y'(t) = \lambda y_0 e^{\lambda t} = A y_0 e^{\lambda t} = A y(t). \]

This perspective emphasises eigenvalues not merely as properties of matrices but as fundamental characteristics of dynamical systems governed by the matrix \(A\). This view contrasts with singular values, which describe the mapping properties of a matrix \(A\).
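A quick numerical check of this formula (a sketch; scipy.linalg.expm computes the matrix exponential, so expm(t A) applied to \(y_0\) gives the exact solution of the ODE):

import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])

# Pick an eigenpair (lambda, y0) of A
eigenvalues, eigenvectors = np.linalg.eig(A)
lam, y0 = eigenvalues[0], eigenvectors[:, 0]

# Compare y(t) = expm(t A) y0 with the closed form y0 * exp(lambda t)
t = 0.7
print(np.allclose(expm(t * A) @ y0, y0 * np.exp(lam * t)))  # True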

Jordan normal form

The Jordan normal form (or Jordan canonical form) is a way of simplifying a square matrix into a specific upper triangular form through a similarity transformation. This form is useful because it provides a simplified representation of the matrix that retains its essential features, especially its eigenvalues and the geometric structure associated with them.

Here are the key aspects of the Jordan normal form (a short symbolic example follows the list):

  1. Structure: A matrix in Jordan normal form is a block diagonal matrix where each block is a Jordan block. A Jordan block is an upper triangular matrix of the form:

    \[\begin{split} J = \begin{bmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ 0 & 0 & \lambda & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 1 \\ 0 & 0 & \cdots & 0 & \lambda \end{bmatrix}. \end{split}\]

    Here, \(\lambda\) is an eigenvalue of the original matrix.

  2. Eigenvalues: The diagonal entries of each Jordan block are the eigenvalues of the matrix. If an eigenvalue has a higher algebraic multiplicity, it appears in multiple blocks or in a larger block.

  3. Geometric Multiplicity: The number of Jordan blocks corresponding to each eigenvalue is the geometric multiplicity of that eigenvalue.

  4. Existence: Every square matrix over an algebraically closed field (like the complex numbers) can be put into Jordan normal form. However, finding the Jordan normal form of a matrix can be computationally difficult and is sensitive to perturbations in the matrix entries.

  5. Limitation: Due to instability, the Jordan normal form is often avoided for numerical computations. Instead, other forms like the Schur decomposition are used.
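As a small symbolic illustration (a sketch; SymPy's jordan_form returns a transformation matrix \(P\) and the block matrix \(J\) with \(A = P J P^{-1}\)):

import sympy as sp

# A matrix with the double eigenvalue 3 but only one eigenvector
A = sp.Matrix([[5, 4],
               [-1, 1]])

# P collects the (generalised) eigenvectors, J is the Jordan normal form
P, J = A.jordan_form()
sp.pprint(J)  # a single 2x2 Jordan block with eigenvalue 3
print(A == P * J * P.inv())  # True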

Perturbation results for eigenvalue problems

Consider a matrix \(A\in\mathbb{C}^{n\times n}\) and an eigenpair \((\lambda, x)\) of \(A\). Let \(y\) be the corresponding left eigenvector, satisfying \(y^H A = \lambda y^H\).

We wish to investigate the effect of a perturbation \(\delta A\) of \(A\) on the eigenvalues and eigenvectors of the matrix. Suppose that \(\lambda\) is a simple eigenvalue. Because eigenvalues are roots of the characteristic polynomial, one can show that perturbing \(A\) by \(\delta A\) creates a perturbation \(\delta \lambda\) of the eigenvalue and \(\delta x\) of the eigenvector, which smoothly depends on \(\delta A\) for small perturbations. The perturbed eigenvalue equation is

\[ (A + \delta A)(x+ \delta x) = (\lambda + \delta \lambda)(x + \delta x). \]

Expanding the equation, we obtain \(A \, \delta x + \delta A \, x + \delta A \, \delta x = \lambda \, \delta x + \delta \lambda \, x + \delta \lambda \, \delta x\). We now multiply from the left with \(y^H\) and use that \(y^HA = \lambda y^H\), resulting in \(y^H \, \delta A \, x + y^H \, \delta A \, \delta x = \delta \lambda \, y^H \, x + \delta \lambda \, y^H \, \delta x\). Solving for \(\delta \lambda\), we arrive at

\[ \delta\lambda = \frac{y^H \, \delta A \, x + y^H \, \delta A \, \delta x - \delta \lambda \, y^H \, \delta x}{y^H \, x}. \]

The absolute condition number with respect to the 2-norm is, for \(\lambda \neq 0\) and \(A \neq 0\),

\[\begin{split} \begin{align*} \kappa_{abs}(\lambda) = & \lim_{\delta \to 0} \sup_{\| \delta A \|_2 \leq \delta} \Bigl( \frac{|\delta \lambda|}{\| \delta A \|_2} \Bigr) = \lim_{\delta \to 0} \sup_{\| \delta A \|_2 \leq \delta} \Bigl( \frac{|y^H \, \delta A \, x / (y^H \, x)|}{\| \delta A \|_2} \Bigr)\\ = & \sup_{\| \delta A \|_2 \leq 1} \Bigl( \frac{|y^H \, \delta A \, x / (y^H \, x)|}{\| \delta A \|_2} \Bigr) = \sup_{\| \delta A \|_2 \leq 1} \bigl( |y^H \, \delta A \, x / (y^H \, x)| \bigr) = \frac{\|y\|_2 \|x\|_2}{|y^Hx|}. \end{align*} \end{split}\]

Here, we used that \(y^H \, \delta A \, \delta x - \delta \lambda \, y^H \, \delta x\) consists of higher-order terms and, for the last equality, that \(\delta A\) may be chosen as a rank-one matrix that maps \(x\) to a multiple of \(y\), so that \(\delta A \, x\) is collinear with \(y\). We find that \(\kappa_{abs}(\lambda)\) approaches infinity as \(x\) and \(y\) approach orthogonality.
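This condition number is straightforward to evaluate numerically (a sketch; scipy.linalg.eig with left=True returns left eigenvectors alongside the right ones):

import numpy as np
from scipy.linalg import eig

# A non-normal matrix: left and right eigenvectors differ substantially
A = np.array([[1.0, 100.0],
              [0.0, 2.0]])

# Columns vl[:, j] and vr[:, j] are unit-norm left/right eigenvectors for w[j]
w, vl, vr = eig(A, left=True, right=True)

for j in range(len(w)):
    y, x = vl[:, j], vr[:, j]
    kappa = np.linalg.norm(y) * np.linalg.norm(x) / abs(y.conj() @ x)
    # The eigenvalues are real here, so we print only the real part
    print(f"lambda = {w[j].real:.1f}, condition number = {kappa:.1f}")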

We now turn our attention to a multiple eigenvalue. Let's assume for simplicity that its algebraic and geometric multiplicities are both equal to two. This gives rise to two linearly independent right eigenvectors, \(x_1\) and \(x_2\), as well as two left eigenvectors, \(y_1\) and \(y_2\). Furthermore, any non-zero linear combination \(\alpha_1 x_1 + \alpha_2 x_2\) is again a right eigenvector, and a similar combination can be made with \(y_1\) and \(y_2\) for left eigenvectors.

Assuming \(y_1^H x_2 \neq 0\), we can find a right eigenvector orthogonal to \(y_1\) by solving:

\[ y_1^H(x_1 - \alpha x_2) = 0, \]

leading to \(y_1 \bot (x_1 - \frac{y_1^H x_1}{y_1^H x_2}x_2)\). Therefore, in the presence of multiple eigenvalues, an orthogonal pair of left and right eigenvectors exists; by the condition number derived above, this means that multiple eigenvalues are unstable under perturbations. This is also reflected in the observation that if a polynomial has a multiple root, then a small arbitrary perturbation of the coefficients turns the multiple root into several simple roots, qualitatively changing the nature of the eigensystem.
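This instability is easy to observe numerically (a sketch): perturbing a single corner entry of an \(n \times n\) Jordan block by \(\epsilon\) splits the \(n\)-fold eigenvalue \(0\) into \(n\) simple eigenvalues of magnitude roughly \(\epsilon^{1/n}\).

import numpy as np

n, eps = 8, 1e-8

# n x n Jordan block with eigenvalue 0, perturbed by eps in the corner
J = np.diag(np.ones(n - 1), k=1)
J[-1, 0] = eps

# The n-fold eigenvalue 0 splits into n eigenvalues of modulus about eps**(1/n)
print("eps**(1/n):", eps ** (1 / n))
print("moduli of eigenvalues:", np.abs(np.linalg.eigvals(J)))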

For a more comprehensive and formal discussion on the perturbations of eigensystems, particularly in cases of multiple eigenvalues and their projections onto corresponding eigenvector subspaces, see Chapter 2 of [Kato, 1966].

We conclude with a statement of the Bauer-Fike Theorem.

Fact: Bauer-Fike Theorem

Consider a diagonalizable matrix \(A\in\mathbb{C}^{n\times n}\) with the similarity transformation \(A = X \, \Lambda \, X^{-1}\), where \(X\) is invertible and \(\Lambda\) diagonal. For every perturbation \(\delta A\) and every eigenvalue \(z\) of \(A+\delta A\) there exists an eigenvalue \(\lambda\) of \(A\) such that:

\[ |z-\lambda| \leq \|X\|_2 \|X^{-1}\|_2 \|\delta A\|_2. \]

In the above notation, we may write this more succinctly as

\[ |\delta \lambda| \leq \kappa_{rel}(X) \| \delta A \|_2, \]

where \(\kappa_{rel}(X) = \|X\|_2 \|X^{-1}\|_2\). This theorem relates the perturbation of eigenvalues to the condition number of the eigenvector matrix \(X\), offering an upper bound for eigenvalue perturbations. For a Hermitian matrix \(A\), the eigenvector matrix can be chosen unitary, so the condition number is \(1\) and the Bauer-Fike bound simplifies to \(|\delta \lambda| \leq \|\delta A\|_2\).
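A numerical check of the Bauer-Fike bound (a minimal sketch with an example matrix and a small random perturbation):

import numpy as np

rng = np.random.default_rng(2)

A = np.array([[1.0, 10.0],
              [0.0, 3.0]])
lam, X = np.linalg.eig(A)

# Small random perturbation
dA = 1e-6 * rng.standard_normal((2, 2))
z = np.linalg.eigvals(A + dA)

# Each perturbed eigenvalue lies within kappa(X) * ||dA||_2 of an eigenvalue of A
bound = np.linalg.cond(X, 2) * np.linalg.norm(dA, 2)
distances = [np.min(np.abs(lam - zj)) for zj in z]
print("distances:", distances)
print("bound:    ", bound)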