The Randomized Dependence Coefficient on ShortScience.org

The Randomized Dependence Coefficient
López-Paz, David and Hennig, Philipp and Schölkopf, Bernhard
Neural Information Processing Systems Conference - 2013 via Local Bibsonomy

Summary by NIPS Conference Reviews 8 years ago

The authors propose a non-linear measure of dependence between two random variables. It turns out to be the largest canonical correlation between random, nonlinear projections of the variables, computed after a copula transformation that makes the measure invariant to the marginal distributions, i.e. to any strictly increasing transformation of either variable, not just linear ones.
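The effect of the copula step can be checked numerically. The sketch below is only an illustration, not the authors' code: it assumes the empirical copula is computed from normalized ranks, and the helper name `empirical_copula` is hypothetical. Ranks are unchanged by any strictly increasing map, linear rescaling included.

```python
import numpy as np

def empirical_copula(x):
    """Map a 1-D sample to (0, 1] via normalized ranks (assumes no ties)."""
    x = np.asarray(x, dtype=float)
    # argsort of argsort yields the 0-based rank of every observation
    ranks = np.argsort(np.argsort(x)) + 1
    return ranks / len(x)

rng = np.random.default_rng(0)
x = rng.normal(size=200)

u = empirical_copula(x)
u_linear = empirical_copula(3.0 * x + 2.0)   # increasing linear map
u_monotone = empirical_copula(np.exp(x))     # increasing non-linear map

print(np.allclose(u, u_linear), np.allclose(u, u_monotone))
```

Both comparisons should agree because only the ordering of the sample enters the transform.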

The paper introduces a new method called RDC to measure the statistical dependence between random variables. It combines a copula transform with a variant of kernel CCA based on random projections, resulting in $O(n \log n)$ complexity. Experiments on synthetic and real benchmark data show promising results for feature selection.

The RDC is a non-linear dependence estimator that satisfies Rényi's criteria and exploits the very recent FastFood speedup trick (ICML13) \cite{journals/corr/LeSS14}. The recipe is straightforward: 1) copularize the data, which preserves the dependence structure while discarding the marginals, 2) sample k non-linear features of each datum (inspired by Bochner's theorem) and 3) solve the regular CCA eigenvalue problem on the resulting paired datasets. Ultimately, RDC feels like a copularised variation of kCCA (misleading as this may sound). Its efficiency is illustrated successfully on a set of classical non-linear bivariate dependency scenarios and on 12 real datasets via a forward feature selection procedure.
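The three-step recipe can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the values of `k` and `s`, the sine non-linearity with Gaussian weights, and the small ridge term `reg` in the CCA step are all choices made here for the example, and no FastFood acceleration is used.

```python
import numpy as np

def empirical_copula(x):
    """Step 1: normalized ranks in (0, 1]; discards the marginal (assumes no ties)."""
    x = np.asarray(x, dtype=float)
    return (np.argsort(np.argsort(x)) + 1) / len(x)

def random_sine_features(u, k, s, rng):
    """Step 2: k random non-linear features sin(w * u + b) per datum.

    The Gaussian weights and the scale s are assumptions of this sketch.
    """
    aug = np.column_stack([u, np.ones(len(u))])  # constant column acts as a bias
    w = rng.normal(scale=s, size=(2, k))
    return np.sin(aug @ w)

def largest_canonical_correlation(a, b, reg=1e-6):
    """Step 3: top canonical correlation from the CCA eigenvalue problem.

    reg is a small ridge term (an assumption) that keeps the solves stable
    when the random features are nearly collinear.
    """
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    n = a.shape[0]
    caa = a.T @ a / n + reg * np.eye(a.shape[1])
    cbb = b.T @ b / n + reg * np.eye(b.shape[1])
    cab = a.T @ b / n
    # eigenvalues of Caa^-1 Cab Cbb^-1 Cba are the squared canonical correlations
    m = np.linalg.solve(caa, cab) @ np.linalg.solve(cbb, cab.T)
    rho_sq = np.clip(np.linalg.eigvals(m).real.max(), 0.0, 1.0)
    return float(np.sqrt(rho_sq))

def rdc(x, y, k=10, s=2.0, seed=0):
    """Dependence score in [0, 1] for two 1-D samples of equal length."""
    rng = np.random.default_rng(seed)
    fx = random_sine_features(empirical_copula(x), k, s, rng)
    fy = random_sine_features(empirical_copula(y), k, s, rng)
    return largest_canonical_correlation(fx, fy)

# Usage: a non-linear (quadratic) relation versus an independent pair.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=2000)
rdc_dependent = rdc(x, x**2 + 0.05 * rng.normal(size=2000))
rdc_independent = rdc(x, rng.normal(size=2000))
print(rdc_dependent, rdc_independent)
```

In this sketch the quadratic pair should score much higher than the independent pair, even though their linear correlation is close to zero. The rank computation is the only step that sorts the data; for fixed k the remaining steps are linear in n.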
