The spectral theorem for Hermitian matrices

A spectral theorem is a theorem about the diagonalization of a matrix or linear operator. A matrix is diagonalizable if it can be written in the form $MDM^{-1}$, where $M$ is an invertible matrix and $D$ is a diagonal matrix. In this article, I will explain what a Hermitian matrix is, derive some properties, and use them to prove a spectral theorem for Hermitian matrices.

In the rest of the article, I will use the usual inner product on the complex vector space $\mathbb{C}^n$:

$$\left< u, v \right> = u^\top \overline{v} = \sum_{k = 1}^n u_k \overline{v_k}$$

and the corresponding norm:
$$|| z || = \sqrt{ \left< z, z \right> } = \sqrt{ \sum_{k = 1}^n z_k \overline{z_k} }$$

We will often use that the inner product is linear in its first argument, and conjugate linear in its second:
$$\left< \lambda u, v \right> = \lambda \left< u, v \right>$$

$$\left< u, \lambda v \right> = \overline{\lambda} \left< u, v \right>$$

Here, $\overline{z}$ denotes the complex conjugate, which is defined by $\overline{x + iy} = x - iy$ for real $x, y \in \mathbb{R}$. These are straightforward generalizations of the normal Euclidean inner product and norm on real vector spaces. In particular, this inner product equals the ‘normal’ inner product for real vectors.
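As a quick sanity check, here is a minimal numpy sketch of this inner product and norm. The helper names `inner` and `norm` are my own, not library functions; note that numpy's built-in `np.vdot` conjugates its *first* argument, so it follows the opposite convention from this article.

```python
import numpy as np

def inner(u, v):
    # <u, v> = sum_k u_k * conj(v_k): linear in u, conjugate linear in v,
    # matching the convention used in this article.
    return np.sum(u * np.conj(v))

def norm(z):
    # ||z|| = sqrt(<z, z>); the imaginary part is zero up to rounding.
    return np.sqrt(inner(z, z).real)

u = np.array([1 + 2j, 3 - 1j])
v = np.array([2 - 1j, 1j])
lam = 2 + 3j

# Linearity in the first argument, conjugate linearity in the second:
print(np.isclose(inner(lam * u, v), lam * inner(u, v)))           # True
print(np.isclose(inner(u, lam * v), np.conj(lam) * inner(u, v)))  # True
print(np.isclose(norm(u), np.linalg.norm(u)))                     # True
```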

Hermitian operators

Now, we are ready to define Hermitian operators:

Definition: A Hermitian or self-adjoint operator $A$ on a space $X$ with an inner product $\left< \cdot, \cdot \right> : X \times X \rightarrow \mathbb{C}$ is an operator for which

$$\left< Ax, y \right> = \left< x, Ay \right>$$

for all $x, y \in X$.

By this definition, symmetric matrices with real elements are Hermitian. However, for matrices with complex elements, the condition is slightly different due to the complex conjugation in the second argument of the inner product.

The conjugate transpose $A^*$ of a complex matrix $A$ is defined by $A^* = \overline{A^\top}$.

Theorem: A matrix $A$ with complex elements is Hermitian if and only if

$$A = A^*$$

Proof: We have $\left< Ax, y \right> = x^\top A^\top \overline{y}$ and $\left< x, Ay \right> = x^\top \overline{A} \overline{y}$, so $\left< Ax, y \right> = \left< x, Ay \right> \iff x^\top A^\top \overline{y} = x^\top \overline{A} \overline{y}$. This equality can only hold for all $x, y \in X$ if $A^\top = \overline{A}$. Taking the transpose of both sides, we see that this holds if and only if $A = A^*$. $\square$

I want to emphasize that Hermiticity can be seen as a generalization of symmetry: we have $\overline{A} = A$ if $A$ is a matrix with real elements, so every symmetric matrix with real elements is Hermitian.
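To make this concrete, here is a small numpy sketch (my own construction, not part of the proof) that builds a Hermitian matrix as $B + B^*$ and checks both the matrix condition $A = A^*$ and the defining inner-product condition numerically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a Hermitian matrix from an arbitrary complex matrix B: A = B + B*.
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = B + B.conj().T

# The matrix condition A = A*:
print(np.allclose(A, A.conj().T))  # True

# The defining inner-product condition <Ax, y> = <x, Ay>:
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
lhs = np.sum((A @ x) * np.conj(y))
rhs = np.sum(x * np.conj(A @ y))
print(np.isclose(lhs, rhs))        # True
```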

The spectral theorem for Hermitian matrices

Hermitian matrices have some pleasing properties, which can be used to prove a spectral theorem.

Lemma: The eigenvalues of a Hermitian matrix $A \in \mathbb{C}^{n \times n}$ are real.

Proof: Let $v$ be an eigenvector with eigenvalue $\lambda$. Then $\lambda \left< v, v \right> = \left< \lambda v, v \right> = \left< Av, v \right> = \left< v, Av \right> = \left< v, \lambda v \right> = \overline{\lambda} \left< v, v \right>$. Since $v \neq 0$, we have $\left< v, v \right> = ||v||^2 > 0$, so it follows that $\lambda = \overline{\lambda}$, which means $\lambda$ must be real. $\square$
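This is easy to observe numerically. The following sketch (assuming numpy) checks that the eigenvalues of a randomly generated Hermitian matrix have vanishing imaginary parts:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T  # Hermitian by construction

# A generic eigenvalue solver returns complex eigenvalues, but their
# imaginary parts vanish up to rounding error.
eigvals = np.linalg.eigvals(A)
print(np.allclose(eigvals.imag, 0.0))  # True

# numpy even ships a dedicated solver for Hermitian matrices that
# returns real eigenvalues directly:
print(np.linalg.eigvalsh(A))
```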

Recall that two vectors $x$ and $y$ are orthogonal if their inner product is zero, that is, $\left< x, y \right> = 0$; that a set of vectors $V$ is orthogonal if every pair $v_1, v_2$ with $v_1 \neq v_2$ is orthogonal; and that it is orthonormal if it is orthogonal and every vector $v \in V$ has unit norm, that is, $||v|| = 1$.

We will need some lemmas to prove the main result later on. The first is a simple result that states that vectors orthogonal to eigenvectors stay orthogonal when multiplied by $A$.

Lemma: If $x$ is orthogonal to an eigenvector $v$ of a Hermitian matrix $A$, then $Ax$ is orthogonal to $v$ as well.

Proof: Suppose that $\lambda$ is the eigenvalue associated to $v$. Then $\left< Ax, v \right> = \left< x, Av \right> = \left< x, \lambda v \right> = \overline{\lambda} \left< x, v \right> = 0$. So $\left< Ax, v \right> = 0$, which means that $Ax$ and $v$ are orthogonal. $\square$
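The following sketch illustrates the lemma: it takes an eigenvector $v$ of a Hermitian matrix, constructs an $x$ orthogonal to $v$ by projecting a random vector off $v$ (the projection setup is mine, chosen for the demonstration), and checks that $Ax$ stays orthogonal to $v$:

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T  # Hermitian

# Take one eigenvector v of A.
_, vecs = np.linalg.eigh(A)
v = vecs[:, 0]

# Build an x orthogonal to v by projecting a random vector off v.
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
x = x - np.sum(x * np.conj(v)) / np.sum(v * np.conj(v)) * v

print(np.isclose(np.sum(x * np.conj(v)), 0.0))        # x is orthogonal to v
print(np.isclose(np.sum((A @ x) * np.conj(v)), 0.0))  # Ax is orthogonal to v as well
```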

The second lemma is about the behavior of matrices with orthonormal columns.

Lemma: Let $U \in \mathbb{C}^{n \times m}$ be a matrix with $m \leq n$ orthonormal columns $u_1, u_2, ..., u_m$, and let $S$ be the space spanned by these vectors. Then

  1. $U^* U = I_m$
  2. $U U^* v = v$ for all $v \in S$

Proof: The element at position $i, j$ of $U^* U$ is $\overline{u_i}^\top u_j = \left< u_j, u_i \right>$. By the orthonormality of $u_1, u_2, ..., u_m$ it follows that this expression is $1$ when $i = j$ (that is, the element is on the diagonal), and $0$ otherwise. So $U^* U$ equals $I_m$, the identity matrix of size $m \times m$.

For the second result, assume that $v \in S$. Then $v$ is a linear combination of the columns of $U$, so we can write $v = U w$ for some $w \in \mathbb{C}^m$. Then, using the first part of the lemma, we have:

$$U U^* v = U U^* U w = U I_m w = U w = v$$

$\square$
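Here is a small numerical illustration of both claims, using a QR factorization to produce orthonormal columns (the setup is my own). Note that $U U^*$ fixes vectors in $S$ but is not the identity on all of $\mathbb{C}^n$; it is the projection onto $S$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 4, 2

# Orthonormalize m random complex vectors; the columns of U span S.
B = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
U, _ = np.linalg.qr(B)  # shape (n, m), orthonormal columns

# 1. U* U = I_m
print(np.allclose(U.conj().T @ U, np.eye(m)))  # True

# 2. U U* v = v for v in S (a combination of the columns of U)
v = U @ np.array([1 - 2j, 3 + 1j])
print(np.allclose(U @ U.conj().T @ v, v))      # True

# A generic vector outside S is generally not fixed by U U*.
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
print(np.allclose(U @ U.conj().T @ x, x))      # False (generically)
```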

With these results we are finally ready to prove the existence of an orthogonal basis of eigenvectors.

Theorem: A Hermitian matrix $A \in \mathbb{C}^{n \times n}$ has $n$ pairwise orthogonal eigenvectors.

Proof: We use induction on the number of orthogonal eigenvectors of $A \in \mathbb{C}^{n \times n}$. The characteristic equation $\det(A - \lambda I) = 0$ is a complex polynomial equation of degree $n$, so by the fundamental theorem of algebra it has a solution in $\lambda$. For this $\lambda$, the matrix $A - \lambda I$ is singular, so there exists a nonzero $v$ such that $(A - \lambda I)v = 0$. This implies that $Av = \lambda v$, so we have a set of one eigenvector $v$, which is trivially orthogonal. This proves the base case.

For the induction step, assume the existence of $n - m$ (with $m < n$) orthogonal eigenvectors $v_1, v_2, ..., v_{n - m}$. We then need to prove the existence of another eigenvector $v$ that is orthogonal to $v_1, v_2, ..., v_{n - m}$. Let $u_1, u_2, ..., u_m$ be an orthonormal basis of the space $S$ of vectors orthogonal to all the eigenvectors $v_1, v_2, ..., v_{n - m}$ (this space has dimension $m$, since the eigenvectors are orthogonal and hence linearly independent), and let $U \in \mathbb{C}^{n \times m}$ be the matrix with $u_1, u_2, ..., u_m$ as its columns. Now, $U^* A U$ is Hermitian, since $(U^* A U)^* = U^* A^* U = U^* A U$. So, as we just proved for the base case, it must have at least one eigenvector $w \neq 0$ with eigenvalue $\lambda$. So we have

$$U^* A U w = \lambda w$$

Define $v := U w$, and note that $v \neq 0$ because $U^* v = U^* U w = w \neq 0$. Multiplying both sides by $U$ on the left and substituting $v = U w$ gives

$$U U^* A v = \lambda v$$

Now, since $v = U w$ is a linear combination of the columns of $U$, it is orthogonal to all the eigenvectors $v_1, v_2, ..., v_{n - m}$. So, by the first lemma, $A v$ is also orthogonal to all these eigenvectors. This means that $A v$ lies in $S$, that is, $A v$ is a linear combination of $u_1, u_2, ..., u_m$ as well. By the second lemma, it follows that $U U^* Av = Av$. So we are left with

$$Av = \lambda v$$

So $v$ is an eigenvector of $A$. Moreover, since $v = U w$ is a linear combination of $u_1, u_2, ..., u_m$, it is orthogonal to the eigenvectors $v_1, v_2, ..., v_{n - m}$. This completes the induction step. $\square$

Of course, it is now easy to make this basis orthonormal by scaling the vectors in the basis.

Corollary: A Hermitian matrix $A$ has a basis of orthonormal eigenvectors.

Proof: By the preceding theorem, there exists a basis of $n$ orthogonal eigenvectors of $A$. Denote this basis by $x_1, x_2, .., x_n$, and define $y_k = \frac{x_k}{||x_k||}$. Now, $\left< y_i, y_j \right> = \frac{\left< x_i, x_j \right>}{|| x_i || \ || x_j ||}$, which is $0$ when $i \neq j$ and $1$ when $i = j$. So this basis is orthonormal. $\square$
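In practice, numpy's `np.linalg.eigh` returns exactly such a basis: the columns of its eigenvector matrix are orthonormal eigenvectors. A quick check (my own sketch):

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T  # Hermitian

# eigh is numpy's solver for Hermitian matrices; the columns of V
# are eigenvectors, already normalized to unit length.
vals, V = np.linalg.eigh(A)

# Orthonormality: V* V = I
print(np.allclose(V.conj().T @ V, np.eye(4)))  # True

# Each column is an eigenvector: A v_k = lambda_k v_k
print(np.allclose(A @ V, V * vals))            # True
```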

Definition: A unitary matrix $U$ is a matrix for which $U^{-1} = U^*$.

Theorem (Spectral theorem for Hermitian matrices): A Hermitian matrix $A \in \mathbb{C}^{n \times n}$ can be written as

$$A = U \Lambda U^*$$

where $U$ is a unitary matrix, and $\Lambda$ is a diagonal matrix with real elements.

Proof: Let $u_1, u_2, ..., u_n$ be an orthonormal basis of eigenvectors, and $\lambda_1, \lambda_2, ..., \lambda_n$ be the corresponding eigenvalues, which are real by the first lemma. Now, take $U$ to be the matrix with $u_k$ as the $k$th column, and $\Lambda$ to be the diagonal matrix with $\lambda_k$ as the $k$th element on the diagonal.

To prove that $U$ is unitary, consider the element at position $i, j$ of the matrix $U^* U$. It is given by $\overline{u_i}^\top u_j = \left< u_j, u_i \right>$, which is $1$ when $i = j$ and $0$ otherwise. So the elements on the diagonal of $U^* U$ are one and the others zero, which means that $U^* U = I$. Since $U$ is square, a left inverse is also a right inverse, so $U U^* = I$ as well. So $U^{-1} = U^*$, and $U$ is unitary.

To prove that $A = U \Lambda U^*$, consider the effect of multiplying a basis eigenvector by this expression. Since $U^* u_k = U^* U e_k = e_k$, we get

$$U \Lambda U^* u_k = U \Lambda e_k = U \lambda_k e_k = \lambda_k u_k = A u_k$$

Since $u_1, u_2, ..., u_n$ is a basis of $\mathbb{C}^n$, every vector $x \in \mathbb{C}^n$ can be written as a linear combination of the vectors $u_1, u_2, ..., u_n$. So we have $U \Lambda U^* x = Ax$ for every $x \in \mathbb{C}^n$, and it follows that $A = U \Lambda U^*$. $\square$
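Putting everything together, here is a short numpy sketch that reconstructs a Hermitian matrix from its spectral decomposition, again using `np.linalg.eigh`, which returns the eigenvalues together with an orthonormal eigenbasis:

```python
import numpy as np

rng = np.random.default_rng(5)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T  # Hermitian

vals, U = np.linalg.eigh(A)
Lam = np.diag(vals)  # real diagonal matrix of eigenvalues

# U is unitary ...
print(np.allclose(U.conj().T @ U, np.eye(4)))  # True
# ... and A = U Lambda U*
print(np.allclose(U @ Lam @ U.conj().T, A))    # True
```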

With this, we have finally proved the spectral theorem for Hermitian matrices. While the theorem itself is certainly interesting enough to prove, the proof has other benefits as well. First, there is a spectral theorem for unitary matrices as well, and its proof is analogous to this one. Second, the spectral theorem for Hermitian matrices can be used to easily prove the existence of the singular value decomposition.