The spectral theorem for Hermitian matrices

A spectral theorem is a theorem about the diagonalization of a matrix or linear operator. A matrix is diagonalizable if it can be written in the form $MDM^{-1}$, where $M$ is an invertible matrix and $D$ is a diagonal matrix. In this article, I will explain what a Hermitian matrix is, derive some properties, and use them to prove a spectral theorem for Hermitian matrices.

In the rest of the article, I will use the usual inner product on the complex vector space $\mathbb{C}^n$:

$$\left< u, v \right> = u^\top \overline{v} = \sum_{k = 1}^n u_k \overline{v_k}$$

and the corresponding norm:
$$|| z || = \sqrt{ \left< z, z \right> } = \sqrt{ \sum_{k = 1}^n z_k \overline{z_k} }$$

We will often use that the inner product is linear in its first argument, and conjugate linear in its second:
$$\left< \lambda u, v \right> = \lambda \left< u, v \right>$$

$$\left< u, \lambda v \right> = \overline{\lambda} \left< u, v \right>$$

Here, $\overline{z}$ denotes the complex conjugate, which is defined by $\overline{x + iy} = x - iy$ for real $x, y \in \mathbb{R}$. These are straightforward generalizations of the normal Euclidean inner product and norm on real vector spaces. In particular, this inner product equals the ‘normal’ inner product for real vectors.
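As a quick sanity check, here is a minimal numpy sketch of this inner product and norm. The helper names `inner` and `norm` are my own, not library functions; note that numpy's built-in `np.vdot` conjugates its *first* argument, so it follows the opposite convention from this article.

```python
import numpy as np

def inner(u, v):
    # <u, v> = sum_k u_k * conj(v_k): linear in u, conjugate linear in v,
    # matching the convention used in this article.
    return np.sum(u * np.conj(v))

def norm(z):
    # ||z|| = sqrt(<z, z>); the imaginary part is zero up to rounding.
    return np.sqrt(inner(z, z).real)

u = np.array([1 + 2j, 3 - 1j])
v = np.array([2 - 1j, 1j])
lam = 2 + 3j

# Linearity in the first argument, conjugate linearity in the second:
print(np.isclose(inner(lam * u, v), lam * inner(u, v)))           # True
print(np.isclose(inner(u, lam * v), np.conj(lam) * inner(u, v)))  # True
print(np.isclose(norm(u), np.linalg.norm(u)))                     # True
```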

Hermitian operators

Now, we are ready to define Hermitian operators:

Definition: A Hermitian or self-adjoint operator $A$ on a space $X$ with an inner product $\left< \cdot, \cdot \right> : X \times X \rightarrow \mathbb{C}$ is an operator for which

$$\left< Ax, y \right> = \left< x, Ay \right>$$

for all $x, y \in X$.

By this definition, symmetric matrices with real elements are Hermitian. However, for matrices with complex elements, the condition is slightly different due to the complex conjugation in the second argument of the inner product.

The conjugate transpose $A^*$ of a complex matrix $A$ is defined by $A^* = \overline{A^\top}$.

Theorem: A matrix $A$ with complex elements is Hermitian if and only if

$$A = A^*$$

Proof: We have $\left< Ax, y \right> = x^\top A^\top \overline{y}$ and $\left< x, Ay \right> = x^\top \overline{A} \overline{y}$, so $\left< Ax, y \right> = \left< x, Ay \right> \iff x^\top A^\top \overline{y} = x^\top \overline{A} \overline{y}$. This equality can only hold for all $x, y \in X$ if $A^\top = \overline{A}$. Taking the transpose of both sides, we see that this holds if and only if $A = A^*$. $\square$

I want to emphasize that Hermiticity can be seen as a generalization of symmetry: we have $\overline{A} = A$ if $A$ is a matrix with real elements, so every symmetric matrix with real elements is Hermitian.
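To make this concrete, here is a small numpy sketch (my own construction, not part of the proof) that builds a Hermitian matrix as $B + B^*$ and checks both the matrix condition $A = A^*$ and the defining inner-product condition numerically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a Hermitian matrix from an arbitrary complex matrix B: A = B + B*.
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = B + B.conj().T

# The matrix condition A = A*:
print(np.allclose(A, A.conj().T))  # True

# The defining inner-product condition <Ax, y> = <x, Ay>:
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
lhs = np.sum((A @ x) * np.conj(y))
rhs = np.sum(x * np.conj(A @ y))
print(np.isclose(lhs, rhs))        # True
```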

The spectral theorem for Hermitian matrices

Hermitian matrices have some pleasing properties, which can be used to prove a spectral theorem.

Lemma: The eigenvalues of a Hermitian matrix $A \in \mathbb{C}^{n \times n}$ are real.

Proof: Let $v$ be an eigenvector with eigenvalue $\lambda$. Then $\lambda \left< v, v \right> = \left< \lambda v, v \right> = \left< Av, v \right> = \left< v, Av \right> = \left< v, \lambda v \right> = \overline{\lambda} \left< v, v \right>$. Since $v \neq 0$, we have $\left< v, v \right> = ||v||^2 > 0$, so it follows that $\lambda = \overline{\lambda}$, which means $\lambda$ must be real. $\square$
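This is easy to observe numerically. The following sketch (assuming numpy) checks that the eigenvalues of a randomly generated Hermitian matrix have vanishing imaginary parts:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T  # Hermitian by construction

# A generic eigenvalue solver returns complex eigenvalues, but their
# imaginary parts vanish up to rounding error.
eigvals = np.linalg.eigvals(A)
print(np.allclose(eigvals.imag, 0.0))  # True

# numpy even ships a dedicated solver for Hermitian matrices that
# returns real eigenvalues directly:
print(np.linalg.eigvalsh(A))
```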

Recall that two vectors $x$ and $y$ are orthogonal if their inner product is zero, that is, $\left< x, y \right> = 0$; that a set of vectors $V$ is orthogonal if every pair $v_1, v_2$ with $v_1 \neq v_2$ is orthogonal; and that it is orthonormal if it is orthogonal and every vector $v \in V$ has unit norm, that is, $||v|| = 1$.

We will need some lemmas to prove the main result later on. The first is a simple result that states that vectors orthogonal to eigenvectors stay orthogonal when multiplied by $A$.

Lemma: If $x$ is orthogonal to an eigenvector $v$ of a Hermitian matrix $A$, then $Ax$ is orthogonal to $v$ as well.

Proof: Suppose that $\lambda$ is the eigenvalue associated to $v$. Then $\left< Ax, v \right> = \left< x, Av \right> = \left< x, \lambda v \right> = \overline{\lambda} \left< x, v \right> = 0$. So $\left< Ax, v \right> = 0$, which means that $Ax$ and $v$ are orthogonal. $\square$
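The following sketch illustrates the lemma: it takes an eigenvector $v$ of a Hermitian matrix, constructs an $x$ orthogonal to $v$ by projecting a random vector off $v$ (the projection setup is mine, chosen for the demonstration), and checks that $Ax$ stays orthogonal to $v$:

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T  # Hermitian

# Take one eigenvector v of A.
_, vecs = np.linalg.eigh(A)
v = vecs[:, 0]

# Build an x orthogonal to v by projecting a random vector off v.
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
x = x - np.sum(x * np.conj(v)) / np.sum(v * np.conj(v)) * v

print(np.isclose(np.sum(x * np.conj(v)), 0.0))        # x is orthogonal to v
print(np.isclose(np.sum((A @ x) * np.conj(v)), 0.0))  # Ax is orthogonal to v as well
```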

The second lemma is about the behavior of matrices with orthonormal columns.

Lemma: Let $U \in \mathbb{C}^{n \times m}$ be a matrix with $m \leq n$ orthonormal columns $u_1, u_2, ..., u_m$, and let $S$ be the space spanned by these vectors. Then

  1. $U^* U = I_m$
  2. $U U^* v = v$ for all $v \in S$

Proof: The element at position $i, j$ of $U^* U$ is $\overline{u_i}^\top u_j = \left< u_j, u_i \right>$. By the orthonormality of $u_1, u_2, ..., u_m$ it follows that this expression is $1$ when $i = j$ (that is, the element is on the diagonal), and $0$ otherwise. So $U^* U$ equals $I_m$, the identity matrix of size $m \times m$.

For the second result, assume that $v \in S$. Then $v$ is a linear combination of the columns of $U$, so we can write $v = U w$ for some $w \in \mathbb{C}^m$. Then, using the first part of the lemma, we have:

$$U U^* v = U U^* U w = U I_m w = U w = v$$

$\square$
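Here is a small numerical illustration of both claims, using a QR factorization to produce orthonormal columns (the setup is my own). Note that $U U^*$ fixes vectors in $S$ but is not the identity on all of $\mathbb{C}^n$; it is the projection onto $S$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 4, 2

# Orthonormalize m random complex vectors; the columns of U span S.
B = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
U, _ = np.linalg.qr(B)  # shape (n, m), orthonormal columns

# 1. U* U = I_m
print(np.allclose(U.conj().T @ U, np.eye(m)))  # True

# 2. U U* v = v for v in S (a combination of the columns of U)
v = U @ np.array([1 - 2j, 3 + 1j])
print(np.allclose(U @ U.conj().T @ v, v))      # True

# A generic vector outside S is generally not fixed by U U*.
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
print(np.allclose(U @ U.conj().T @ x, x))      # False (generically)
```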

With these results we are finally ready to prove the existence of an orthogonal basis of eigenvectors.

Theorem: A Hermitian matrix $A \in \mathbb{C}^{n \times n}$ has $n$ pairwise orthogonal eigenvectors.

Proof: We use induction on the number of orthogonal eigenvectors of $A \in \mathbb{C}^{n \times n}$. The characteristic equation $\det(A - \lambda I) = 0$ is a complex polynomial equation of degree $n$, so by the fundamental theorem of algebra it has a solution in $\lambda$. For this $\lambda$, the matrix $A - \lambda I$ is singular, so there exists a nonzero $v$ such that $(A - \lambda I)v = 0$. This implies that $Av = \lambda v$, so we have a set of one eigenvector $v$, which is trivially orthogonal. This proves the base case.

For the induction step, assume the existence of $n - m$ (with $m < n$) orthogonal eigenvectors $v_1, v_2, ..., v_{n - m}$. We then need to prove the existence of another eigenvector $v$ that is orthogonal to $v_1, v_2, ..., v_{n - m}$. Let $u_1, u_2, ..., u_m$ be an orthonormal basis of the space $S$ of vectors orthogonal to all the eigenvectors $v_1, v_2, ..., v_{n - m}$ (this space has dimension $m$, since the eigenvectors are orthogonal and hence linearly independent), and let $U \in \mathbb{C}^{n \times m}$ be the matrix with $u_1, u_2, ..., u_m$ as its columns. Now, $U^* A U$ is Hermitian, since $(U^* A U)^* = U^* A^* U = U^* A U$. So, as we just proved for the base case, it must have at least one eigenvector $w \neq 0$ with eigenvalue $\lambda$. So we have

$$U^* A U w = \lambda w$$

Define $v := U w$, and note that $v \neq 0$ because $U^* v = U^* U w = w \neq 0$. Multiplying both sides by $U$ on the left and substituting $v = U w$ gives

$$U U^* A v = \lambda v$$

Now, since $v = U w$ is a linear combination of the columns of $U$, it is orthogonal to all the eigenvectors $v_1, v_2, ..., v_{n - m}$. So, by the first lemma, $A v$ is also orthogonal to all these eigenvectors. This means that $A v$ lies in $S$, that is, $A v$ is a linear combination of $u_1, u_2, ..., u_m$ as well. By the second lemma, it follows that $U U^* Av = Av$. So we are left with

$$Av = \lambda v$$

So $v$ is an eigenvector of $A$. Moreover, since $v = U w$ is a linear combination of $u_1, u_2, ..., u_m$, it is orthogonal to the eigenvectors $v_1, v_2, ..., v_{n - m}$. This completes the induction step. $\square$

Of course, it is now easy to make this basis orthonormal by scaling the vectors in the basis.

Corollary: A Hermitian matrix $A$ has a basis of orthonormal eigenvectors.

Proof: By the preceding theorem, there exists a basis of $n$ orthogonal eigenvectors of $A$. Denote this basis by $x_1, x_2, .., x_n$, and define $y_k = \frac{x_k}{||x_k||}$. Now, $\left< y_i, y_j \right> = \frac{\left< x_i, x_j \right>}{|| x_i || \ || x_j ||}$, which is $0$ when $i \neq j$ and $1$ when $i = j$. So this basis is orthonormal. $\square$
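In practice, numpy's `np.linalg.eigh` returns exactly such a basis: the columns of its eigenvector matrix are orthonormal eigenvectors. A quick check (my own sketch):

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T  # Hermitian

# eigh is numpy's solver for Hermitian matrices; the columns of V
# are eigenvectors, already normalized to unit length.
vals, V = np.linalg.eigh(A)

# Orthonormality: V* V = I
print(np.allclose(V.conj().T @ V, np.eye(4)))  # True

# Each column is an eigenvector: A v_k = lambda_k v_k
print(np.allclose(A @ V, V * vals))            # True
```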

Definition: A unitary matrix $U$ is a matrix for which $U^{-1} = U^*$.

Theorem (Spectral theorem for Hermitian matrices): A Hermitian matrix $A \in \mathbb{C}^{n \times n}$ can be written as

$$A = U \Lambda U^*$$

where $U$ is a unitary matrix, and $\Lambda$ is a diagonal matrix with real elements.

Proof: Let $u_1, u_2, ..., u_n$ be an orthonormal basis of eigenvectors, and $\lambda_1, \lambda_2, ..., \lambda_n$ be the corresponding eigenvalues, which are real by the first lemma. Now, take $U$ to be the matrix with $u_k$ as the $k$th column, and $\Lambda$ to be the diagonal matrix with $\lambda_k$ as the $k$th element on the diagonal.

To prove that $U$ is unitary, consider the element at position $i, j$ of the matrix $U^* U$. It is given by $\overline{u_i}^\top u_j = \left< u_j, u_i \right>$, which is $1$ when $i = j$ and $0$ otherwise. So the elements on the diagonal of $U^* U$ are one and the others zero, which means that $U^* U = I$. Since $U$ is square, a left inverse is also a right inverse, so $U U^* = I$ as well. So $U^{-1} = U^*$, and $U$ is unitary.

To prove that $A = U \Lambda U^*$, consider the effect of multiplying a basis eigenvector by this expression. Since $U^* u_k = U^* U e_k = e_k$, we get

$$U \Lambda U^* u_k = U \Lambda e_k = U \lambda_k e_k = \lambda_k u_k = A u_k$$

Since $u_1, u_2, ..., u_n$ is a basis of $\mathbb{C}^n$, every vector $x \in \mathbb{C}^n$ can be written as a linear combination of the vectors $u_1, u_2, ..., u_n$. So we have $U \Lambda U^* x = Ax$ for every $x \in \mathbb{C}^n$, and it follows that $A = U \Lambda U^*$. $\square$
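Putting everything together, here is a short numpy sketch that reconstructs a Hermitian matrix from its spectral decomposition, again using `np.linalg.eigh`, which returns the eigenvalues together with an orthonormal eigenbasis:

```python
import numpy as np

rng = np.random.default_rng(5)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T  # Hermitian

vals, U = np.linalg.eigh(A)
Lam = np.diag(vals)  # real diagonal matrix of eigenvalues

# U is unitary ...
print(np.allclose(U.conj().T @ U, np.eye(4)))  # True
# ... and A = U Lambda U*
print(np.allclose(U @ Lam @ U.conj().T, A))    # True
```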

With this, we have finally proved the spectral theorem for Hermitian matrices. While the theorem itself is certainly interesting enough to prove, the proof has other benefits as well. First, there is a spectral theorem for unitary matrices as well, and its proof is analogous to this one. Second, the spectral theorem for Hermitian matrices can be used to easily prove the existence of the singular value decomposition.