Learning Important Features Through Propagating Activation Differences
Paper summary

### Summary

The motivation of this paper is to make neural networks interpretable so that they can be adopted in fields where interpretability is essential (e.g., the medical field). To that end, the paper presents _DeepLIFT_, a method that interprets a neural network by **decomposing** its output prediction for a specific input: the _contribution_ of every neuron is backpropagated to each input feature. The _contribution_ of a neuron is determined by comparing its activation to a _reference activation_, that is, the neuron's activation on a reference input chosen by a domain expert. The authors also argue that, in some cases, giving separate consideration to positive and negative contributions can reveal dependencies that other approaches miss, and they show that their approach captures some dependencies that a gradient-based method cannot.

### Computing the contribution of a neuron

Given the following notation:

* $t$: Target output neuron
* $t^0$: Reference activation of $t$
* $x_1, x_2, ..., x_n$: Set of neurons
* $\Delta t$: The difference-from-reference of a target
* $\Delta x$: The difference-from-reference of an input
* $C_{\Delta x_i \Delta t}$: Contribution score of neuron $x_i$

the contribution scores are required to sum to the target's difference-from-reference:

$$\Delta t = t - t^0$$

$$\Delta t = \sum_{i=1}^n C_{\Delta x_i \Delta t}$$

The advantage of the _difference-from-reference_ over a purely gradient-based method is that it avoids the discontinuities that gradients exhibit, as seen in the following figure:

https://i.imgur.com/vLZytJT.png

### "Backpropagating" the contribution to the input

To compute the contribution to the input, the authors use a concept similar to the chain rule. The _multiplier_ $m_{\Delta x \Delta t}$ is defined as follows:

$$m_{\Delta x \Delta t} = \frac{C_{\Delta x \Delta t}}{\Delta x}$$

Let $z$ be the output of a neuron, $y_j$ a neuron in the hidden layer before $z$, and $x_i$ a neuron at the input, before $y_j$.
We can compute $m_{\Delta x_i \Delta z}$ as follows:

$$m_{\Delta x_i \Delta z}=\sum_j m_{\Delta x_i \Delta y_j} m_{\Delta y_j \Delta z}$$

### Computing the contribution score

The authors argue that it can be beneficial in some cases to separate the positive and negative contributions, i.e.:

$$\Delta x_i = \Delta x_i^+ + \Delta x_i^-$$

$$C_{\Delta x_i \Delta t} = C_{\Delta x_i^+ \Delta t} + C_{\Delta x_i^- \Delta t}$$

The authors propose three similar techniques to compute the contribution score:

1. The _linear rule_, which ignores the nonlinearity, so that $C_{\Delta x_i \Delta t} = w_i \Delta x_i$.
2. The _rescale rule_, applied to a nonlinear function (i.e. $y=f(x)$), where the multiplier is $m_{\Delta x \Delta y} = \Delta y / \Delta x$. If $\Delta x$ is zero or very close to it (magnitude below $10^{-7}$), the denominator makes this ratio numerically unstable, so the authors use the gradient instead of the multiplier.
3. The _RevealCancel rule_, which is similar to the _rescale rule_ but treats the positive and negative contributions separately. This captures dependencies (e.g. min/AND) that cannot be captured by the _rescale rule_ or other methods. The difference-from-reference is computed as follows:

   $$\Delta y^+ = \frac{1}{2}\left(f(x^0 + \Delta x^+) - f(x^0)\right) + \frac{1}{2}\left(f(x^0 + \Delta x^+ + \Delta x^-) - f(x^0 + \Delta x^-)\right)$$

   $$\Delta y^- = \frac{1}{2}\left(f(x^0 + \Delta x^-) - f(x^0)\right) + \frac{1}{2}\left(f(x^0 + \Delta x^+ + \Delta x^-) - f(x^0 + \Delta x^+)\right)$$
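To make the difference between the rescale and RevealCancel rules concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes a tiny hand-built network that computes $o = \min(i_1, i_2)$ as $i_1 - \mathrm{ReLU}(i_1 - i_2)$ with an all-zero reference. All function names are hypothetical, and the finite-difference gradient fallback is only a stand-in for an analytic gradient.

```python
def relu(x):
    return max(0.0, x)


def split_pos_neg(terms):
    """Split the terms w_i * dx_i into their positive and negative sums."""
    pos = sum(t for t in terms if t > 0)
    neg = sum(t for t in terms if t < 0)
    return pos, neg


def rescale_multiplier(f, x0, dx, eps=1e-7, h=1e-6):
    """Rescale rule: m = dy / dx, with a gradient fallback when dx is tiny."""
    if abs(dx) < eps:
        # Assumption: a central finite difference stands in for the gradient.
        return (f(x0 + h) - f(x0 - h)) / (2 * h)
    return (f(x0 + dx) - f(x0)) / dx


def reveal_cancel_multipliers(f, x0, dx_pos, dx_neg, eps=1e-7):
    """RevealCancel rule: separate multipliers for the + and - parts of dx."""
    dy_pos = 0.5 * (f(x0 + dx_pos) - f(x0)) \
        + 0.5 * (f(x0 + dx_pos + dx_neg) - f(x0 + dx_neg))
    dy_neg = 0.5 * (f(x0 + dx_neg) - f(x0)) \
        + 0.5 * (f(x0 + dx_pos + dx_neg) - f(x0 + dx_pos))
    m_pos = dy_pos / dx_pos if abs(dx_pos) > eps else 0.0
    m_neg = dy_neg / dx_neg if abs(dx_neg) > eps else 0.0
    return m_pos, m_neg


def min_contributions(i1, i2, rule):
    """Contributions of i1 and i2 to o = i1 - relu(i1 - i2), reference (0, 0)."""
    # Linear rule for h1 = i1 - i2: each input contributes w_i * dx_i.
    c1_h1, c2_h1 = 1.0 * i1, -1.0 * i2
    if rule == "rescale":
        m = rescale_multiplier(relu, 0.0, c1_h1 + c2_h1)
        c1_h2, c2_h2 = m * c1_h1, m * c2_h1
    elif rule == "reveal_cancel":
        pos, neg = split_pos_neg([c1_h1, c2_h1])
        m_pos, m_neg = reveal_cancel_multipliers(relu, 0.0, pos, neg)
        c1_h2 = (m_pos if c1_h1 > 0 else m_neg) * c1_h1
        c2_h2 = (m_pos if c2_h1 > 0 else m_neg) * c2_h1
    else:
        raise ValueError(f"unknown rule: {rule}")
    # Linear rule for o = i1 - h2: direct path plus the path through h2.
    return i1 - c1_h2, -c2_h2


print(min_contributions(3.0, 1.0, "rescale"))        # all credit goes to i2
print(min_contributions(3.0, 1.0, "reveal_cancel"))  # roughly (0.5, 0.5)
```

In both cases the two contributions sum to $\Delta o = \min(3, 1) = 1$. The rescale rule assigns everything to the smaller input, while RevealCancel splits the credit evenly between the two inputs, which is the kind of min/AND dependency mentioned above.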
Learning Important Features Through Propagating Activation Differences
Shrikumar, Avanti and Greenside, Peyton and Kundaje, Anshul
International Conference on Machine Learning - 2017 via Local Bibsonomy

How does a domain expert determine a reference activation?
