The singular value decomposition

In this article, I will prove that every matrix has a singular value decomposition. The singular value decomposition has numerous applications: it can be used to compute the pseudoinverse of a matrix, to perform principal component analysis, and to approximate a matrix $M$ by a low-rank matrix $\tilde{M}$.

Theorem: Any matrix $M \in C^{m \times n}$ of rank $r$ has a decomposition

M = U Σ V^{*}

where $U \in C^{m \times r}$ and $V \in C^{n \times r}$ are semi-unitary matrices, and $Σ \in R^{r \times r}$ is a diagonal matrix with positive elements on the diagonal.
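As a quick numerical illustration of the statement (not part of the proof), the sketch below builds a random complex matrix of rank 2 and reads a compact decomposition off NumPy's thin SVD. The names `left`, `right` and the relative tolerance `1e-10` are choices made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 5, 4, 2

# A product of an m x r and an r x n factor has rank at most r
# (and rank exactly r for generic random factors).
left = rng.standard_normal((m, r)) + 1j * rng.standard_normal((m, r))
right = rng.standard_normal((r, n)) + 1j * rng.standard_normal((r, n))
M = left @ right

# NumPy returns min(m, n) singular values; keep only those that are
# numerically nonzero (the tolerance is an assumption of this sketch).
U_thin, s, Vh_thin = np.linalg.svd(M, full_matrices=False)
rank = int(np.sum(s > 1e-10 * s.max()))

U = U_thin[:, :rank]               # m x r, orthonormal columns
Sigma = np.diag(s[:rank])          # r x r, positive diagonal
V = Vh_thin[:rank, :].conj().T     # n x r, orthonormal columns

print(rank)
print(np.allclose(M, U @ Sigma @ V.conj().T))
print(np.allclose(U.conj().T @ U, np.eye(rank)))
print(np.allclose(V.conj().T @ V, np.eye(rank)))
```

Here `U.conj().T @ U` being the identity is exactly the semi-unitary condition $U^{*} U = I_{r}$.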

Proof: If $m > n$, we can apply the argument below to the conjugate transpose $M^{*} \in C^{n \times m}$ of $M$, which gives a singular value decomposition $M^{*} = U Σ V^{*}$. Using $M = (M^{*})^{*} = (U Σ V^{*})^{*} = V Σ U^{*}$ (note that $Σ^{*} = Σ$, since $Σ$ is real and diagonal), we find the singular value decomposition of $M$. So we can assume $m \leq n$ without loss of generality.

Now consider the $n \times n$ matrix $M^{*} M$ , which is Hermitian. Using the spectral theorem for Hermitian matrices, we have

M^{*} M = W Λ W^{*}

where $W$ is a unitary matrix whose columns are $n$ orthonormal eigenvectors $v_{1}, v_{2}, ..., v_{n}$ of $M^{*} M$, and $Λ$ is a diagonal matrix whose $k$th diagonal element $λ_{k}$ is the eigenvalue corresponding to $v_{k}$. According to the first lemma, all the $λ_{k}$ are nonnegative.

We can assume that $λ_{1} \geq λ_{2} \geq ... \geq λ_{n}$. If this is not the case, we can simply permute the columns of $W$ and the diagonal elements of $Λ$ so that this condition holds. Since $W$ is invertible, the rank of $M^{*} M = W Λ W^{*}$ equals the number of nonzero $λ_{k}$, and this rank is $r$ by the second lemma. So the eigenvalues $λ_{1}, λ_{2}, ..., λ_{r}$ are positive, and $λ_{r + 1}, λ_{r + 2}, ..., λ_{n}$ are zero.

Now define

σ_{k} = \sqrt{λ_{k}}

u_{k} = \frac{M v_{k}}{σ_{k}}

for

k = 1, 2, ..., r

Let $U \in C^{m \times r}$ be the matrix with $u_{1}, u_{2}, ..., u_{r}$ as columns, let $V \in C^{n \times r}$ be the matrix with $v_{1}, v_{2}, ..., v_{r}$ as columns, and let $Σ \in R^{r \times r}$ be the diagonal matrix with $σ_{1}, σ_{2}, ..., σ_{r}$ on the diagonal.
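The construction can be mirrored numerically. The following is a minimal sketch, taking $σ_{k}$ to be the square root of $λ_{k}$ and using `numpy.linalg.eigh` for the spectral decomposition of $M^{*} M$. The helper name `compact_svd_via_eigh` and the tolerance used to decide which eigenvalues count as positive are assumptions of this example, and forming $M^{*} M$ explicitly is done to follow the proof, not because it is the numerically preferred route.

```python
import numpy as np

def compact_svd_via_eigh(M, tol=1e-10):
    """Build U, Sigma, V from the eigendecomposition of M^* M (illustrative helper)."""
    # eigh is meant for Hermitian matrices; it returns real eigenvalues
    # in ascending order together with orthonormal eigenvectors.
    lam, W = np.linalg.eigh(M.conj().T @ M)

    # Reorder so that lambda_1 >= lambda_2 >= ... >= lambda_n.
    order = np.argsort(lam)[::-1]
    lam, W = lam[order], W[:, order]

    # Numerical rank: eigenvalues above a relative tolerance count as positive.
    r = int(np.sum(lam > tol * lam[0]))

    sigma = np.sqrt(lam[:r])      # sigma_k = sqrt(lambda_k)
    V = W[:, :r]                  # columns v_1, ..., v_r
    U = (M @ V) / sigma           # column k is M v_k / sigma_k
    return U, np.diag(sigma), V

# A 2 x 3 complex matrix of rank 1 (the second row is twice the first).
M = np.array([[1, 2j, 0],
              [2, 4j, 0]], dtype=complex)
U, Sigma, V = compact_svd_via_eigh(M)

print(Sigma.shape)
print(np.allclose(M, U @ Sigma @ V.conj().T))
print(np.allclose(U.conj().T @ U, np.eye(Sigma.shape[0])))
```

The division `(M @ V) / sigma` relies on broadcasting: each column of `M @ V` is divided by the matching entry of `sigma`.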

For $1 \leq k \leq r$, we have $V^{*} v_{k} = e_{k}$, where $e_{k}$ is the $k$th standard basis vector of $C^{r}$. So $U Σ V^{*} v_{k} = U Σ e_{k} = σ_{k} U e_{k} = σ_{k} u_{k} = M v_{k}$. On the other hand, if $r < k \leq n$, we have $V^{*} v_{k} = 0$, so $U Σ V^{*} v_{k} = 0$. In this case we also have $M v_{k} = 0$, because $\| M v_{k} \|^{2} = v_{k}^{*} M^{*} M v_{k} = λ_{k} v_{k}^{*} v_{k} = 0$. It follows that $M v_{k} = U Σ V^{*} v_{k}$ for $k = 1, 2, ..., n$. Since the vectors $v_{1}, v_{2}, ..., v_{n}$ are orthonormal, they form a basis of $C^{n}$. By linearity, it follows that $M v = U Σ V^{*} v$ for any $v \in C^{n}$. So we must have

M = U Σ V^{*}

It remains to show that $Σ$ is a diagonal matrix with positive diagonal elements, and that $U$ and $V$ are semi-unitary. The first fact follows from the definition: $Σ$ has the values $σ_{1}, σ_{2}, ..., σ_{r}$ on the diagonal, and these are positive because $λ_{1}, λ_{2}, ..., λ_{r}$ are positive.

To show that $U$ is semi-unitary we need to show that $U^{*} U = I_{r}$. The elements of $U^{*} U$ are given by $u_{i}^{*} u_{j} = \frac{(M v_{i})^{*} M v_{j}}{σ_{i} σ_{j}} = \frac{v_{i}^{*} (M^{*} M v_{j})}{σ_{i} σ_{j}} = λ_{j} \frac{⟨ v_{i}, v_{j} ⟩}{σ_{i} σ_{j}}$ for $1 \leq i, j \leq r$. This expression reduces to 1 if $i = j$ (since $λ_{j} = σ_{j}^{2}$) and to 0 otherwise (by the orthogonality of the vectors $v_{1}, v_{2}, ..., v_{r}$). So $U^{*} U = I_{r}$, which means that $U$ is semi-unitary.

$V$ is an $n \times r$ matrix with orthonormal columns. It follows that $V^{*} V = I_{r}$ , so $V$ is semi-unitary. $□$

Lemma: $M^{*} M$ has only real, nonnegative eigenvalues.

Proof: Let $v$ be any eigenvector of $M^{*} M$ and $λ$ be the corresponding eigenvalue, so that $M^{*} M v = λ v$. We have $\| M v \|^{2} = v^{*} M^{*} M v = λ v^{*} v = λ \| v \|^{2}$. Since $v \neq 0$, we can rearrange for $λ$ to get $λ = \frac{\| M v \|^{2}}{\| v \|^{2}}$, which is real and nonnegative. $□$
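The identity $λ = \| M v \|^{2} / \| v \|^{2}$ is easy to check numerically. The sketch below is only an illustration on one random matrix. It deliberately uses the general solver `numpy.linalg.eig`, which does not assume a Hermitian input, so the realness of the eigenvalues is observed, not built in. The names `G` and `ratios` are choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
G = M.conj().T @ M                      # the 4 x 4 matrix M^* M

# General eigenvalue solver: eigenvalues come back as complex numbers.
lam, W = np.linalg.eig(G)

# For each eigenvector v (a column of W), compute ||M v||^2 / ||v||^2.
ratios = np.array([
    np.linalg.norm(M @ W[:, k]) ** 2 / np.linalg.norm(W[:, k]) ** 2
    for k in range(W.shape[1])
])

print(np.allclose(lam, ratios))         # eigenvalues match the quotients
print(np.allclose(lam.imag, 0.0))       # imaginary parts vanish up to rounding
print(np.all(lam.real > -1e-10))        # real parts are nonnegative up to rounding
```

Because $M$ here is $3 \times 4$, one of the four eigenvalues is zero up to rounding error, which is why the last check allows a small negative tolerance.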

Lemma: Let $r$ be the rank of a complex matrix $M$. Then $M^{*} M$ has rank $r$ as well.

Proof: For a matrix $M \in C^{m \times n}$ we have $\operatorname{rank}(M) = n - \dim(\operatorname{null}(M))$, and $M^{*} M$ also has $n$ columns. So it suffices to show that $M v = 0 ⟺ M^{*} M v = 0$, since this implies that the null spaces are the same.

From $M v = 0$ it follows that $M^{*} M v = M^{*} 0 = 0$. Conversely, if $M^{*} M v = 0$, then $\| M v \|^{2} = v^{*} M^{*} M v = 0$. It is a property of the norm that $\| w \|^{2} = 0$ implies $w = 0$, so $M v = 0$. $□$
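A small numerical check of this lemma, using `numpy.linalg.matrix_rank` on a hand-picked complex matrix of rank 2. The matrix and the explicit tolerance `1e-8` are choices made for this sketch. The tolerance is there because $M^{*} M$ has squared singular values, so the default cutoff is not something to rely on in general.

```python
import numpy as np

# Row 2 is 1j times row 1, and row 3 is independent of row 1, so rank(M) = 2.
M = np.array([[1,  1j, 0, 2],
              [1j, -1, 0, 2j],
              [0,  1,  1, 0]], dtype=complex)

G = M.conj().T @ M                      # the 4 x 4 matrix M^* M

rank_M = np.linalg.matrix_rank(M, tol=1e-8)
rank_G = np.linalg.matrix_rank(G, tol=1e-8)
print(rank_M, rank_G)

# The null spaces agree as well: a vector killed by M is killed by M^* M.
v = np.array([0, 0, 0, 1], dtype=complex) - np.array([2, 0, 0, 0], dtype=complex)
print(np.allclose(M @ v, 0), np.allclose(G @ v, 0))
```

The vector `v = (-2, 0, 0, 1)` lies in the null space of `M` because the fourth column of `M` is twice the first.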

The theorem proves the existence of a variant of the singular value decomposition that is also known as the compact singular value decomposition. The existence of the 'full' singular value decomposition and other variants follows from the existence of the compact singular value decomposition, since we can pad $Σ$ with zero rows and columns and append more orthonormal columns to $U$ and $V$, without changing the product $U Σ V^{*}$.
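The padding step can be sketched as follows. One way to find the extra orthonormal columns, assumed here for illustration, is to take the trailing columns of a complete QR factorization. The helper name `extend_to_unitary` is hypothetical, and the compact factors are obtained from NumPy's thin SVD only to keep the example short.

```python
import numpy as np

def extend_to_unitary(Q_thin):
    """Append orthonormal columns to a k x r matrix with orthonormal columns."""
    k, r = Q_thin.shape
    # In a complete QR factorization, the last k - r columns of Q are
    # orthonormal and orthogonal to the column space of Q_thin.
    Q, _ = np.linalg.qr(Q_thin, mode="complete")
    return np.hstack([Q_thin, Q[:, r:]])

# Rank-1 example: an outer product of two complex vectors (2 x 3).
M = np.outer([1, 1j], [1, 2, 2j])
m, n = M.shape

# Compact factors: drop the numerically zero singular values.
U_thin, s, Vh_thin = np.linalg.svd(M, full_matrices=False)
r = int(np.sum(s > 1e-10 * s.max()))
U, V = U_thin[:, :r], Vh_thin[:r, :].conj().T

U_full = extend_to_unitary(U)           # m x m unitary
V_full = extend_to_unitary(V)           # n x n unitary
Sigma_full = np.zeros((m, n))
Sigma_full[:r, :r] = np.diag(s[:r])     # added rows and columns stay zero

print(np.allclose(M, U_full @ Sigma_full @ V_full.conj().T))
print(np.allclose(U_full.conj().T @ U_full, np.eye(m)))
print(np.allclose(V_full.conj().T @ V_full, np.eye(n)))
```

The added columns are multiplied only by the zero blocks of `Sigma_full`, which is why the product is unchanged.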