In this paper they prior the representation a logistic regression model using known proteinprotein interactions. They do so by regularizing the weights of the model using the Laplacian encoding of a graph. Here is a regularization term of this form: $$\lambda w_1 + \eta w^T L w,$$ #### A small example: Given a small graph of three nodes A, B, and C with one edge: {AB} we have the following Laplacian: $$ L = D  A = \left[\array{ 1 & 0 & 0 \\ 0 & 1 & 0\\ 0 & 0 & 0}\right]  \left[\array{ 0 & 1 & 0 \\ 1 & 0 & 0\\ 0 & 0 & 0}\right]$$ $$L = \left[\array{ 1 & 1 & 0 \\ 1 & 1 & 0\\ 0 & 0 & 0}\right] $$ If we have a small linear regression of the form: $$y = x_Aw_A + x_Bw_B + x_Cw_C$$ Then we can look at how $w^TLw$ will impact the weights to gain insight: $$w^TLw $$ $$= \left[\array{ w_A & w_B & w_C}\right] \left[\array{ 1 & 1 & 0 \\ 1 & 1 & 0\\ 0 & 0 & 0}\right] \left[\array{ w_A \\ w_B \\ w_C}\right] $$ $$= \left[\array{ w_A & w_B & w_C}\right] \left[\array{ w_A w_B \\ w_A + w_B \\ 0}\right] $$ $$ = (w_A^2 w_Aw_B ) + (w_Aw_B + w_B^2) $$ So because all terms are squared we can remove them from consideration to look at what is the real impact of regularization. $$ = (w_Aw_B ) + (w_Aw_B) $$ $$ = 2w_Aw_B$$ The Laplacian regularization seems to increase the weight values of edges which are connected. Along with the squared terms and the $L1$ penalty that is also used the weights cannot grow without bound. #### A few more experiments: If we perform the same computation for a graph with two edges: {AB, BC} we have the following term which increases the weights of both pairwise interactions: $$ = 2w_Aw_B 2w_Bw_C$$ If we perform the same computation for a graph with two edges: {AB, AC} we have no surprises: $$ = 2w_Aw_B 2w_Aw_C$$ Another thing to think about is if there are no edges. If by default there are selfloops then the degree matrix will have 1 on the diagonal and it will be the identity which will be an $L2$ term. If no self loops are defined then the result is a 0 matrix yielding no regularization at all. #### Contribution: A contribution of this paper is to use the absolute value of the weights to make training easier. $$w^T L w$$ TODO: Add more about how this impacts learning. #### Overview Here a high level figure shows the data and targets together with a graph prior. It looks nice so I wanted to include it. https://i.imgur.com/rnGtHqe.png
Your comment:
