In this article, I will prove that every matrix has a singular value decomposition. The singular value decomposition has numerous applications: it can be used to compute the pseudoinverse of a matrix, to perform principal component analysis, and to approximate a matrix M by a low-rank matrix M~.
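As a quick numerical illustration of these applications (a sketch using NumPy; the random test matrix and variable names are my own), the factors returned by np.linalg.svd directly yield the pseudoinverse and a best low-rank approximation:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 3))

# Compact SVD: U is 5x3, s holds the singular values, Vh is V*.
U, s, Vh = np.linalg.svd(M, full_matrices=False)

# Pseudoinverse: V diag(1/sigma) U*, which matches np.linalg.pinv.
M_pinv = Vh.conj().T @ np.diag(1.0 / s) @ U.conj().T
assert np.allclose(M_pinv, np.linalg.pinv(M))

# Best rank-1 approximation: keep only the largest singular value.
k = 1
M_approx = U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]
# The spectral-norm error equals the first discarded singular value.
assert np.isclose(np.linalg.norm(M - M_approx, 2), s[k])
```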
Theorem: Any matrix M∈Cm×n of rank r has a decomposition M=UΣV∗
where U∈Cm×r and V∈Cn×r are semi-unitary matrices, and Σ∈Rr×r is a diagonal matrix with positive elements on the diagonal.
Proof: If m>n, we can look at the singular value decomposition M∗=UΣV∗ of the conjugate transpose M∗ of M. Using M=(M∗)∗=(UΣV∗)∗=VΣ∗U∗=VΣU∗ (note that Σ∗=Σ since Σ is real and diagonal), we obtain a singular value decomposition of M. So we can assume m≤n without loss of generality.
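This reduction is easy to check numerically; here is a minimal NumPy sketch, assuming an arbitrary complex test matrix with m>n:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((6, 3)) + 1j * rng.standard_normal((6, 3))  # m > n

# Compute a compact SVD of the conjugate transpose M* instead.
U, s, Vh = np.linalg.svd(M.conj().T, full_matrices=False)

# M = (M*)* = (U Sigma V*)* = V Sigma U*.
assert np.allclose(M, Vh.conj().T @ np.diag(s) @ U.conj().T)
```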
Now consider the n×n matrix M∗M, which is Hermitian. Using the spectral theorem for Hermitian matrices, we have M∗M=WΛW∗
where W is a unitary matrix whose columns are n orthonormal eigenvectors v1,v2,...,vn of M∗M, and Λ is a diagonal matrix whose kth diagonal element λk is the eigenvalue corresponding to vk. According to the first lemma below, all the λk are real and nonnegative.
We can assume that λ1≥λ2≥...≥λn; if this is not the case, we can simply apply the same permutation to the columns of W and to the diagonal elements of Λ so that this condition holds. Since M∗M=WΛW∗ with W unitary, the rank of M∗M equals the number of nonzero eigenvalues, and this rank is r by the second lemma. So the eigenvalues λ1,λ2,...,λr are positive, and λr+1,λr+2,...,λn are zero.
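In code, these steps amount to an eigendecomposition of M∗M followed by a descending sort of the eigenpairs. A minimal NumPy sketch (the test matrix is an arbitrary choice for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((3, 5))  # m <= n; rank r = 3 for a generic matrix

# eigh handles Hermitian matrices and returns eigenvalues in ascending order.
lam, W = np.linalg.eigh(M.conj().T @ M)
order = np.argsort(lam)[::-1]  # permute so that lambda_1 >= ... >= lambda_n
lam, W = lam[order], W[:, order]

r = np.linalg.matrix_rank(M)
print(lam[:r])  # positive
print(lam[r:])  # zero, up to rounding error
```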
Now define
σk = √λk
uk = (1/σk) Mvk
for k=1,2,...,r.
Let U∈Cm×r be the matrix with u1,u2,...,ur as columns, let V∈Cn×r be the matrix with v1,v2,...,vr as columns, and let Σ∈Rr×r be the diagonal matrix with σ1,σ2,...,σr on the diagonal.
For 1≤k≤r, we have V∗vk=ek, so UΣV∗vk=UΣek=σkUek=σkuk=Mvk. On the other hand, if r<k≤n, we have V∗vk=0, so UΣV∗vk=0; and since M∗Mvk=λkvk=0, the second lemma gives Mvk=0 as well. It follows that Mvk=UΣV∗vk for k=1,2,...,n. Since the vectors v1,v2,...,vn are orthonormal, they form a basis of Cn. By linearity, it follows that Mv=UΣV∗v for any v∈Cn. So we must have M=UΣV∗
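The construction can be carried out numerically as well. The following NumPy sketch (again with an arbitrary test matrix) builds σk, uk, and the factors U, Σ, V exactly as above, then checks that M=UΣV∗ and that U is semi-unitary:

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((3, 5))

# Eigendecomposition of M*M, sorted so the eigenvalues are descending.
lam, W = np.linalg.eigh(M.conj().T @ M)
order = np.argsort(lam)[::-1]
lam, W = lam[order], W[:, order]

r = np.linalg.matrix_rank(M)
sigma = np.sqrt(lam[:r])   # sigma_k = sqrt(lambda_k)
V = W[:, :r]               # columns v_1, ..., v_r
U = M @ V / sigma          # column k is u_k = (1/sigma_k) M v_k
Sigma = np.diag(sigma)

assert np.allclose(M, U @ Sigma @ V.conj().T)  # M = U Sigma V*
assert np.allclose(U.conj().T @ U, np.eye(r))  # U*U = I_r
```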
It remains to show that Σ is a diagonal matrix with positive diagonal elements, and that U and V are semi-unitary. The first fact is obvious, since we have defined Σ to have the values σ1,σ2,...,σr on the diagonal, and these are positive because λ1,λ2,...,λr are positive.
To show that U is semi-unitary we need to show that U∗U=Ir. Consider that the elements of U∗U are given by ui∗uj = (1/(σiσj))(Mvi)∗(Mvj) = (1/(σiσj))vi∗(M∗Mvj) = (λj/(σiσj))⟨vi,vj⟩ for 1≤i,j≤r. This expression reduces to 1 if i=j and to 0 otherwise, by the orthonormality of the vectors v1,v2,...,vr. So U∗U=Ir, which means that U is semi-unitary.
V is an n×r matrix with orthonormal columns. It follows that V∗V=Ir, so V is semi-unitary. □
Lemma: M∗M has only real, nonnegative eigenvalues.
Proof: Let v be any eigenvector of M∗M and λ be the corresponding eigenvalue, so that M∗Mv=λv. We have ∣∣Mv∣∣2=v∗M∗Mv=λv∗v=λ∣∣v∣∣2. Rearranging for λ, we get λ=∣∣Mv∣∣2/∣∣v∣∣2, which is obviously real and nonnegative. □
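A quick numerical sanity check of this lemma (NumPy; the complex test matrix is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))

lam, W = np.linalg.eigh(M.conj().T @ M)  # eigh returns real eigenvalues
assert np.all(lam >= -1e-12)             # nonnegative, up to rounding error

# For each eigenvector v, lambda = ||Mv||^2 / ||v||^2.
v = W[:, -1]
assert np.isclose(np.linalg.norm(M @ v) ** 2 / np.linalg.norm(v) ** 2, lam[-1])
```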
Lemma: Let r be the rank of a complex matrix M. Then M∗M has rank r as well.
Proof: For a matrix M∈Cm×n we have rank(M)=n−dim(null(M)), so it suffices to show that Mv=0⟺M∗Mv=0, since this implies that the null spaces of M and M∗M coincide.
From Mv=0 it follows that M∗Mv=M∗0=0. Conversely, if M∗Mv=0, then ∣∣Mv∣∣2=v∗M∗Mv=0. It is a property of the norm that ∣∣w∣∣2=0 implies w=0, so Mv=0. □
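This lemma can be checked numerically too; the sketch below builds a rank-2 matrix as a product of random factors (an arbitrary construction for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))  # rank 2

assert np.linalg.matrix_rank(M) == np.linalg.matrix_rank(M.conj().T @ M) == 2
```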
The theorem proves the existence of a variant of the singular value decomposition known as the compact singular value decomposition. The existence of the 'full' singular value decomposition and other variants follows from the existence of the compact one, since we can add extra zeros to the diagonal of Σ and extra orthonormal columns to U and V without changing the product UΣV∗.
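The padding step can be made concrete in code. In the NumPy sketch below, the helper complete_to_unitary is my own construction (not a library function): it appends an orthonormal basis of the orthogonal complement of the column space, obtained from a complete QR factorization:

```python
import numpy as np

def complete_to_unitary(Q):
    # Q has orthonormal columns; append orthonormal columns spanning the
    # orthogonal complement of range(Q) to make Q square and unitary.
    full = np.linalg.qr(Q, mode='complete')[0]
    return np.hstack([Q, full[:, Q.shape[1]:]])

rng = np.random.default_rng(6)
m, n, r = 4, 3, 2
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank r

# Compact factors: U is m x r, V is n x r, Sigma is r x r.
U, s, Vh = np.linalg.svd(M, full_matrices=False)
U, s, V = U[:, :r], s[:r], Vh[:r, :].conj().T

# Pad Sigma with zeros and complete U and V to unitary matrices.
U_full, V_full = complete_to_unitary(U), complete_to_unitary(V)
Sigma_full = np.zeros((m, n))
Sigma_full[:r, :r] = np.diag(s)

assert np.allclose(M, U_full @ Sigma_full @ V_full.conj().T)
```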