An elementary method for the problem of column subset selection in a rectangular matrix
Abstract.
The problem of extracting a well conditioned submatrix from any rectangular matrix (with e.g. normalized columns) has been a subject of extensive research with applications to rank revealing factorization, low stretch spanning trees, sparse solutions to least squares regression problems, and is also connected with problems in functional and harmonic analysis. Here, we provide a deterministic algorithm which extracts a submatrix from any matrix with guaranteed individual lower and upper bounds on each singular value of . The proof of our main result is short and elementary.
keywords: Column subset selection, Restricted Invertibility,
1. Introduction
Let be a matrix such that all columns of have unit euclidean -norm. We denote by the -norm of a vector and by (resp. ) the associated operator norm (resp. the Hilbert-Schmidt norm). Let denote the submatrix of obtained by extracting the columns of indexed by . For any real symmetric matrix , let denote the -th eigenvalue of , and we order the eigenvalues as . We also write (resp. ) for the smallest (resp. largest) eigenvalue of . We finally write for the size of a set .
The problem of well conditioned column selection that we condider here consists in finding the largest subset of columns of such that the corresponding submatrix has all singular values in a prescribed interval . The one-sided problem of finding the largest possible such that is called the Restricted Invertibility Problem and has a long history starting with the seminal work of Bourgain and Tzafriri [1]. Applications of such results are well known in the domain of harmonic analysis [1]. The study of the condition number is also a subject of extensive study in statistics and signal processing [5].
Here, we propose an elementary approach to this problem based on two simple ingredients:
-
(1)
Choosing recursively , the set of remaining columns of , verifying
where is a relevant quantity depending on the previous chosen vectors;
-
(2)
a well-known equation (sometimes called secular equation) whose roots are the eigenvalues of a square matrix after appending a row and a line.
1.1. Historical background
Concerning the Restricted Invertibility problem, Bourgain and Tzafriri [1] obtained the following result for square matrices:
Theorem 1.1 ([1]).
Given a matrix whose columns have unit -norm, there exists with such that , where and are absolute constants.
See also [4] for a simpler proof. Vershynin [6] generalized Bourgain and Tzafriri’s result to the case of rectangular matrices and the estimate of was improved as follows.
Theorem 1.2 ([6]).
Given a matrix and letting be the matrix obtained from by -normalizing its columns. Then, for any , there exists with
such that .
Recently, Spielman and Srivastava proposed in [3] a deterministic construction of which allows them to obtain the following result.
Theorem 1.3 ([3]).
Let be a matrix and . Then there exists with such that .
The technique of proof relies on new constructions and inequalities which are thoroughly explained in the Bourbaki seminar of Naor [2]. Using these techniques, Youssef [7] improved Vershynin’s result as:
Theorem 1.4 ([7]).
Given a matrix and letting be the matrix obtained from by -normalizing its columns. Then, for any , there exists with such that .
1.2. Our contribution
We provide a deterministic algorithm that extracts a submatrix from the matrix with guaranteed individual lower and upper bounds on each singular value of .
Consider the set of vectors , where the are the columns of . At step , choose . By induction, let us be given at step . Let denote the matrix whose columns are and let be an unit eigenvector of associated to .
We say that satisfies the hypothesis (H) if verifies for :
(1.1) | |||||
(1.2) |
We now introduce the "potential" associated to satisfying (H):
We then choose so that
(1.3) |
The following result, for which we propose a short and elementary proof, gives a control on all singular values in the column selection problem.
Theorem 1.5.
Let satisfies Hypothesis (H). Set . Then, we can extract from some submatrices such that for all and with , we have
(1.4) |
where
(1.5) |
In particular,
2. Proof of Theorem 1.5
2.1. Suitable choice of the extracted vectors
Consider the set of vectors . At step , choose . By induction, let us be given at step . Let denote the matrix whose columns are and let be an unit eigenvector of associated to . Let us choose so that
(2.6) |
Lemma 2.1.
For all , verifies
Proof.
Let be the matrix whose columns are the , i.e. . Then
which yields the conclusion by plugging in into (2.6) since . ∎
2.2. Controlling the individual eigenvalues
It is clear that (1.4) holds for since then, 1 is the only singular value because the columns are supposed to be normalized.
Assume the induction hypothesis : for all with , (1.4) holds.
Let us then show that holds. By Cauchy interlacing theorem, we have
We then deduce, due to the induction hypothesis and Assumption (H),
(2.7) | |||||
(2.8) | |||||
It remains to obtain the upper estimate for and the lower one for . We write
(2.14) |
and it is well known that the eigenvalues of are the zeros of the secular equation:
(2.15) |
We first estimate which is the greatest zero of , and assume for contradiction that
(2.16) |
From , we then obtain that for ,
Let be the zero of . We have . But is decreasing, so
Thus, using Lemma 2.1, the equality (1.5) and noting that , we can write:
(2.17) |
which yields a contradiction with the inequality (2.16). Thus, we have
(2.18) |
This shows that the upper bound in holds.
Finally, to estimate which is the smallest zero of , we write
By means of the same reasonning as above, we show that the lower bound in holds.
2.3. Controlling the greatest eigenvalue
Set .
Since , we can write
Hence, using that implies , we reach the upper estimate for .
This concludes the proof of Theorem 1.5.
3. Two simple examples and an open question
Let us choose . Using and , we thus deduce that verifies Hypothesis (H). Applying Theorem 1.5, we obtain that we can extract a submatrix with columns and , provided that
which is a slightly weaker bound than the one known from [1].
One can also verify that satisfies Hypothesis (H) and yields a similar bound.
An open question is then to know whether there exists a function satisfying Hypothesis (H) and allowing to reach the optimal bound known in the Bourgain Tzafriri theorem [1] via our new algorithm.
References
- [1] Bourgain, J. and Tzafriri, L., Invertibility of "large” submatrices with applications to the geometry of Banach spaces and harmonic analysis. Israel J. Math. 57 (1987), no. 2, 137–224.
- [2] Naor, A., Sparse quadratic forms and their geometric applications [following Batson, Spielman and Srivastava]. Séminaire Bourbaki: Vol. 2010/2011. Exposés 1027–1042. Astérisque No. 348 (2012), Exp. No. 1033, viii, 189–217.
- [3] Spielman, D. A. and Srivastava, N., An elementary proof of the restricted invertibility theorem. Israel J. Math. 190 (2012), 83–91.
- [4] Tropp, J., The random paving property for uniformly bounded matrices, Studia Math., vol. 185, no. 1, pp. 67–82, 2008.
- [5] Tropp, J., Norms of random submatrices and sparse approximation. C. R. Acad. Sci. Paris, Ser. I (2008), Vol. 346, pp. 1271-1274.
- [6] Vershynin, R., John’s decompositions: selecting a large part. Israel J. Math. 122 (2001), 253–277.
- [7] Youssef, P. A note on column subset selection. Int. Math. Res. Not. IMRN 2014, no. 23, 6431–6447.