Network-regularized Sparse Logistic Regression Models for Clinical Risk Prediction and Biomarker Discovery
Paper summary

In this paper the authors place a prior on a logistic regression model using known protein-protein interactions. They do so by regularizing the weights of the model using the Laplacian encoding of a graph. The regularization term has this form:

$$\lambda ||w||_1 + \eta w^T L w,$$

where $L$ is the graph Laplacian and $\lambda$ and $\eta$ set the strength of the two penalties.

#### A small example:

Given a small graph of three nodes A, B, and C with one edge: {A-B} we have the following Laplacian:

$$ L = D - A = \left[\array{ 1 & 0 & 0 \\ 0 & 1 & 0\\ 0 & 0 & 0}\right] - \left[\array{ 0 & 1 & 0 \\ 1 & 0 & 0\\ 0 & 0 & 0}\right]$$

$$L = \left[\array{ 1 & -1 & 0 \\ -1 & 1 & 0\\ 0 & 0 & 0}\right] $$

If we have a small linear regression of the form:

$$y = x_Aw_A + x_Bw_B + x_Cw_C$$

Then we can look at how $w^TLw$ will impact the weights to gain insight:

$$w^TLw $$

$$= \left[\array{ w_A & w_B & w_C}\right] \left[\array{ 1 & -1 & 0 \\ -1 & 1 & 0\\ 0 & 0 & 0}\right] \left[\array{ w_A \\ w_B \\ w_C}\right] $$

$$= \left[\array{ w_A & w_B & w_C}\right] \left[\array{ w_A -w_B \\ -w_A + w_B \\ 0}\right] $$

$$ = (w_A^2 -w_Aw_B ) + (-w_Aw_B + w_B^2) $$

The squared terms are never negative and act like an $L2$ penalty on the connected nodes, so we can set them aside to see what the edge itself contributes. The remaining cross terms are:

$$ (-w_Aw_B ) + (-w_Aw_B) = -2w_Aw_B$$

The cross term lowers the penalty when the weights of connected nodes share a sign, so the Laplacian regularization favors connected nodes having similar weights. Putting the squared terms back gives $w^TLw = (w_A - w_B)^2$, which is zero exactly when $w_A = w_B$. Because of the squared terms and the $L1$ penalty that is also used, the weights cannot grow without bound.

#### A few more experiments:

If we perform the same computation for a graph with two edges: {A-B, B-C} we get the following cross terms, which couple the weights of both pairwise interactions:

$$ -2w_Aw_B -2w_Bw_C$$

If we perform the same computation for a graph with two edges: {A-B, A-C} we have no surprises:

$$ -2w_Aw_B -2w_Aw_C$$

Another thing to think about is what happens if there are no edges. If self-loops are counted in the degree matrix by default (and not in the adjacency matrix), then the degree matrix has 1 on the diagonal and $L$ is the identity, which gives an $L2$ term.
If no self-loops are defined then the result is a 0 matrix, yielding no regularization at all.

#### Contribution:

A contribution of this paper is to use the absolute value of the weights to make training easier.

$$|w|^T L |w|$$

TODO: Add more about how this impacts learning.

#### Overview

Here a high-level figure shows the data and targets together with a graph prior. It looks nice so I wanted to include it.

https://i.imgur.com/rnGtHqe.png
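
#### A quick numerical check

The toy computations in this summary can be checked with a few lines of plain Python. This is only a sketch: the helper names `laplacian`, `quad_form` and `network_penalty`, and the weight values, are illustrative assumptions and do not come from the paper. It builds the Laplacian of the three-node graph with the single edge {A-B}, then evaluates both $w^TLw$ and the absolute-value variant $|w|^TL|w|$.

```python
def laplacian(n_nodes, edges):
    """Unnormalized Laplacian L = D - A of an undirected graph without self-loops."""
    L = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i, j in edges:
        L[i][i] += 1.0  # degree of node i
        L[j][j] += 1.0  # degree of node j
        L[i][j] -= 1.0  # minus the adjacency entry
        L[j][i] -= 1.0
    return L


def quad_form(v, L):
    """Compute the quadratic form v^T L v."""
    n = len(v)
    return sum(v[i] * L[i][j] * v[j] for i in range(n) for j in range(n))


def network_penalty(w, L, lam, eta, use_abs=False):
    """lam * ||w||_1 + eta * v^T L v, where v = |w| if use_abs else w."""
    v = [abs(x) for x in w] if use_abs else w
    l1 = sum(abs(x) for x in w)
    return lam * l1 + eta * quad_form(v, L)


# Nodes A, B, C are indexed 0, 1, 2; the only edge is A-B.
L = laplacian(3, [(0, 1)])

# Arbitrary illustrative weights with opposite signs on the connected pair.
w = [2.0, -1.0, 0.5]

plain = quad_form(w, L)                        # equals (w_A - w_B)^2
absolute = quad_form([abs(x) for x in w], L)   # equals (|w_A| - |w_B|)^2

print(L)
print(plain, absolute)
```

With these weights the plain penalty is $(2 - (-1))^2 = 9$, while the absolute-value version is $(2 - 1)^2 = 1$. In this toy case the absolute-value form only compares the magnitudes of connected weights, so a connected pair with opposite signs is penalized less than under $w^TLw$. Calling `laplacian(3, [])` returns the zero matrix, matching the no-edges case without self-loops.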
Network-regularized Sparse Logistic Regression Models for Clinical Risk Prediction and Biomarker Discovery
Wenwen Min and Juan Liu and Shihua Zhang
arXiv e-Print archive - 2016 via Local arXiv
Keywords: q-bio.GN, cs.LG, stat.ML, J.3; H.2.8; G.1.6; I.5
