The Singular Value Decomposition
The Action of a Symmetric Matrix
First of all, it is useful to understand the action of a symmetric matrix in a linear transformation, . We know that a symmetric matrix can be represented as
where is an orthogonal matrix (that is, a matrix whose columns are orthogonal to each other, and of unit length). The columns of are (orthogonal) eigenvectors of , and is the diagonal matrix of eigenvalues. Thus, when a vector is multiplied by ,
the geometric result can be visualized using the unit ball (of unit radius) in . It is transformed into an ellipsoid, with the eigenvectors representing the axes of the ellipsoid, and the eigenvalues providing the scaling of each axis of the ellipsoid. We can also think of how this transformation comes about as a succession of three steps:
- Multiplication by , by the transpose of the orthogonal matrix , corresponds to a rotation of the space . It rotates each of the eigenvectors into the standard vector .
- Multiplication by represents a scaling of each of the now "standardized" eigenvectors: each is stretched by the appropriate factor, the eigenvalue;
- Multiplication of the result by rotates the scaled eigenvectors back into their original positions.
In conclusion, you can think of the "action" of a symmetric matrix as a product of these three simple actions on : a rotation, a scaling, and a rotation.
The SVD Theorem
It turns out that this aspect of symmetric matrices is true in general: every matrix can be thought of as the product of an orthogonal matrix, a diagonal matrix, and the transpose of an orthogonal matrix.
Every matrix is related to two important symmetric matrices: and . Since each of these two matrices is symmetric, each can be represented as a product
Furthermore, the diagonal matrices and have positive entries on the diagonal. We can see that from the spectral decomposition of and . For example,
, we know that .
Theorem: Let be an matrix with real components. Then where and are as defined above, and is the matrix with positive () entries such that and .
We can easily show that, with , the formulas for and work out:
Notice that the orthogonal diagonalization of a symmetric matrix is the singular value decomposition in that case.
What follows is an illustration of the SVD, as diagrammed by Cliff Long and Tom Hern in the case of . In this example, the notation is and .
You see that the action of a general matrix can also be viewed as a rotation, followed by a scaling, followed by a rotation. If is not square, or not full rank, then there will be a null space, and some of the dimensions will be annihilated.
The SVD for image processing, image analysis, and statistical analysis
If is rank , then can be written as a spectral decomposition, too, in terms of the eigenvectors and the eigenvalues of and . The eigenvectors of and are called the singular vectors of :
and where, by convention, for . The values are called the singular values of the matrix. This is crucial for image processing: often an image is not full rank, so that there are few products; furthermore, it may be that an image contains noise, and the noise tends to be high frequency and of little total weight. It may be that it "hangs out" on the last few singular vector pairs. We simply drop those from the recomposition of the matrix , and we will have de-noised the image.
On a similar note, the information contained in the last singular vector pairs may not be very important to the recomposition of the image, so that we can drop them without much loss of information. This represents compression of the image.
- The Col() is given by the columns of corresponding to non-zero singular values.
- The Row() is given by the columns of corresponding to non-zero singular values.
- The Nul() is given by the columns of corresponding to zero singular values; and of course
- The Nul() is given by the columns of corresponding to zero singular values.
- We can use the SVD to interpolate a matrix. See Long, A. and C. Long. Surface Approximation and Interpolation Via Matrix SVD. The College Mathematics Journal, Vol. 32, #1, January, 2001:20-25.
- When we can't invert a matrix, we can define a "pseudo-inverse" via the SVD. This is precisely what we do when we solve the linear regression/least squares problem in general: we can't solve because the system is inconsistent; but we can find a least-squares solution, by inverting (when possible). When it's not possible, we use a pseudo-inverse solution.
- The SVD is used for image compression, as you can see by visiting this page.