The singular value decomposition

In this article, I will prove that every matrix has a singular value decomposition. The singular value decomposition has numerous applications. It can be used to compute the pseudoinverse of a matrix, to perform principal component analysis, and it can be used to approximate a matrix MM by a low-rank approximation M~\tilde{M}.

Theorem: Any matrix MCm×nM \in \mathbb{C}^{m \times n} of rank rr has a decomposition
M=UΣV M = U \Sigma V^*

where UCm×rU \in \mathbb{C}^{m \times r} and VCn×rV \in \mathbb{C}^{n \times r} are semi-unitary matrices, and ΣRr×r\Sigma \in \mathbb{R}^{r \times r} is a diagonal matrix with positive elements on the diagonal.

Proof: If m>nm > n, we can look at the singular value decomposition M=UΣVM^* = U \Sigma V^* of the conjugate transpose MM^* of MM. Using M=(M)=(UΣV)=VΣUM = (M^*)^* = (U \Sigma V^*)^* = V \Sigma U^*, we find the singular value decomposition of MM. So we can assume mnm \leq n without loss of generality.

Now consider the n×nn \times n matrix MMM^* M, which is Hermitian. Using the spectral theorem for Hermitian matrices, we have
MM=WΛW M^* M = W \Lambda W^*

where WW is a unitary matrix with nn orthonormal eigenvectors v1,v2,...,vnv_1, v_2, ..., v_n of MMM^* M, and Λ\Lambda is a diagonal matrix, where the kkth element λk\lambda_k on the diagonal is the eigenvalue of vkv_k. According to the first lemma, all the λk\lambda_k are nonnegative.

We can assume that λ1λ2...λn\lambda_1 \geq \lambda_2 \geq ... \geq \lambda_n. If this is not the case, we can simply permute the columns of VV and Σ\Sigma so that this condition holds. Since all the eigenvectors are orthogonal, and the rank of MMM^* M is rr by the second lemma, the eigenvalues λ1,λ2,...,λr\lambda_1, \lambda_2, ..., \lambda_r are positive, and λr+1,λr+2,λn\lambda_{r + 1}, \lambda_{r + 2}, \lambda_n are zero.

Now define
σk=λk\sigma_k = \sqrt{\lambda_k}

uk=Avkσk u_k = \frac{A v_k}{\sigma_k}

for
k=1,2,...,r k = 1, 2, ..., r

Let UCm×rU \in \mathbb{C}^{m \times r} be the matrix with u1,u2,...,uru_1, u_2, ..., u_r, let VCn×rV \in \mathbb{C}^{n \times r} be the matrix with v1,v2,...,vrv_1, v_2, ..., v_r as columns, and let ΣRr×r\Sigma \in \mathbb{R}^{r \times r} be the diagonal matrix with σ1,σ2,...,σr\sigma_1, \sigma_2, ..., \sigma_r on the diagonal.

For 1kr1 \leq k \leq r, we have that Vvk=ekV^* v_k = e_k. So Mvk=UΣVvk=Uσkek=σkUek=σkuk=AvkMv_k = U \Sigma V^* v_k = U \sigma_k e_k =\sigma_k U e_k = \sigma_k u_k = A v_k. On the other hand, if r<knr < k \leq n, we have Vvk=0V^* v_k = 0, so UΣVvk=0U \Sigma V^* v_k = 0. It follows that Mvk=UΣVvkM v_k = U \Sigma V^* v_k for k=1,2,...,nk = 1, 2, ..., n. Since the vectors v1,v2,...,vnv_1, v_2, ..., v_n are orthonormal, they form a basis basis of Cn\mathbb{C}^n. By linearity, it follows that Mv=UΣVvMv = U \Sigma V v for any vCnv \in \mathbb{C}^n. So we must have
M=UΣV M = U \Sigma V^*

It remains to show that Σ\Sigma is a diagonal matrix with positive diagonal elements, and that UU and VV are semi-unitary. The first fact is obvious since we have defined Σ\Sigma to have the values λ1,λ2,...λr\sqrt{\lambda_1}, \sqrt{\lambda_2}, ... \sqrt{\lambda_r} on the diagonal.

To show that UU is semi-unitary we need to show that UU=IU^* U = I. Consider that the elements UUU^* U are given by uiuj=(Avi)Avjσiσj=vi(AAvj)σiσj=λj<vi,vj>σiσju_i^* u_j = \frac{(Av_i)^* Av_j}{\sigma_i \sigma_j} = \frac{v_i^* (A^* A v_j)}{\sigma_i \sigma_j} = \lambda_j \frac{\left< v_i, v_j \right>}{\sigma_i \sigma_j} for 1i,jr1 \leq i, j \leq r. This expression reduces to 1 if i=ji = j and to 0 otherwise (by the orthogonality of the vectors v1,v2,...,vrv_1, v_2, ..., v_r. So UU=IrU^* U = I_r, which means that UU is semi-unitary.

VV is an n×rn \times r matrix with orthonormal columns. It follows that VV=IrV^* V = I_r, so VV is semi-unitary. \square

Lemma: MMM^* M has only real, nonnegative eigenvalues.

Proof: Let vv be any eigenvector of MMM^* M and λ\lambda be the corresponding eigenvalue, so that MMv=λvM^* M v = \lambda v. We have Mv2=vMMv=λvv=λv2|| M v ||^2 = v^* M^* M v = \lambda v^* v = \lambda || v ||^2. Rearranging for λ\lambda, we get λ=Mv2v2\lambda = \frac{|| M v ||^2}{||v||^2}, which is obviously real and nonnegative. \square

Lemma: Let rr be the rank of a complex matrix MM. Then MMM^* M has rank rr as well.

Proof: For a matrix MCm×nM \in \mathbb{C}^{m \times n} we have rank(M)=ndim(null(M))\text{rank}(M) = n - \text{dim}(\text{null}(M)), so it suffices to show that Mv=0    MMv=0Mv = 0 \iff M^* M v = 0, since this implies that the null spaces are the same.

From Mv=0Mv = 0 it follows that MMv=M0=0M^* M v = M^* 0 = 0. It is a property of the norm that v2=0||v||^2 = 0 implies v=0v = 0. So MMv=Mv2=0M^* M v = ||Mv||^2 = 0 implies v=0v = 0. \square

The theorem proves the existence of a variant of the singular value decomposition that is also known as the compact singular value decomposition. The existence of the ‘full’ singular value decomposition and other variants follows from the existence of the compact singular value decomposition, since we can just add extra zeroes to the diagonal, and add more orthonormal columns to UU and VV, without changing the product UΣVU \Sigma V^*.