Provable Subspace Clustering: When LRR meets SSCProvable Subspace Clustering: When LRR meets SSCWang, Yu-Xiang and Xu, Huan and Leng, Chenlei2013

Paper summarynipsreviewsThis paper proposes a new subspace clustering algorithm called Low Rank Sparse Subspace Clustering (LRSSC) and aims to study the conditions under which it is guaranteed to produce a correct clustering. The correctness is defined in terms of two properties. The self-expressiveness property (SEP) captures whether a data point is expressed as a linear combination of other points in the same subspace. The graph connectivity property (GCP) captures whether the points in one subspace form a connected component of the graph formed by all the data points. The LRSSC algorithm builds on two existing subspace clustering algorithms, SSC and LRR, which have complementary properties. The solution of LRR is guaranteed to satisfy the SEP under the strong assumption of independent subspaces and the GCP under weak assumptions (shown in this paper). On the other hand, the solution of SSC is guaranteed to satisfy the SEP under milder conditions, even with noisy data or data corrupted with outliers, but the solution of SSC need not satisfy the GCP. This paper combines the objective functions of both methods with the hope of obtaining a method that satisfies both SEP and GEP for some range of values of the relative weight between the two objective functions. Theorem 1 derives conditions under which LRSSC satisfies SEP in the deterministic case. These conditions are natural generalizations of existing conditions for SSC. But they are actually weaker than existing conditions. Theorem 2 derives conditions under which LRSSC satisfies SEP in the random case (data drawn at random from randomly drawn subspaces). Overall, it is shown that when the weight of the SSC term is large enough and the ratio of the data dimension to the subspace dimension grows with the log of the number of points, then LRSSC is guaranteed to satisfy SEP with high probability. I say high, because it does not tend to 1. Finally, Proposition 1 and Lemma 4 show that LRR satisfies GCP (presumably almost surely). Experiments support that for a range of the SSC weight, LRSCC works. Additional experiments on model selection show the usefulness of the analysis.

This paper proposes a new subspace clustering algorithm called Low Rank Sparse Subspace Clustering (LRSSC) and aims to study the conditions under which it is guaranteed to produce a correct clustering. The correctness is defined in terms of two properties. The self-expressiveness property (SEP) captures whether a data point is expressed as a linear combination of other points in the same subspace. The graph connectivity property (GCP) captures whether the points in one subspace form a connected component of the graph formed by all the data points. The LRSSC algorithm builds on two existing subspace clustering algorithms, SSC and LRR, which have complementary properties. The solution of LRR is guaranteed to satisfy the SEP under the strong assumption of independent subspaces and the GCP under weak assumptions (shown in this paper). On the other hand, the solution of SSC is guaranteed to satisfy the SEP under milder conditions, even with noisy data or data corrupted with outliers, but the solution of SSC need not satisfy the GCP. This paper combines the objective functions of both methods with the hope of obtaining a method that satisfies both SEP and GEP for some range of values of the relative weight between the two objective functions. Theorem 1 derives conditions under which LRSSC satisfies SEP in the deterministic case. These conditions are natural generalizations of existing conditions for SSC. But they are actually weaker than existing conditions. Theorem 2 derives conditions under which LRSSC satisfies SEP in the random case (data drawn at random from randomly drawn subspaces). Overall, it is shown that when the weight of the SSC term is large enough and the ratio of the data dimension to the subspace dimension grows with the log of the number of points, then LRSSC is guaranteed to satisfy SEP with high probability. I say high, because it does not tend to 1. Finally, Proposition 1 and Lemma 4 show that LRR satisfies GCP (presumably almost surely). Experiments support that for a range of the SSC weight, LRSCC works. Additional experiments on model selection show the usefulness of the analysis.