Summary by sam 8 months ago
### Summary
The motivation of this paper is to make neural networks interpretable so that they can be adopted in fields where interpretability is essential (e.g., the medical field). To that end, the paper presents _DeepLIFT_, a method that interprets a neural network by **decomposing** its output prediction for a specific input: the _contribution_ of every neuron is backpropagated to each input feature. The _contribution_ of a neuron is determined by comparing the activation of this neuron to a _reference activation_. This _reference activation_ is obtained from a reference input chosen by a domain expert. Moreover, the authors argue that in some cases, giving separate consideration to positive and negative contributions can reveal dependencies that are missed by other approaches. The authors show that their approach can capture some dependencies that a gradient-based method cannot.
### Computing the contribution of a neuron
Given the following notation:
* $t$: Target output neuron
* $t^0$: Reference activation of $t$
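* $x_1, x_2, ..., x_n$: A set of neurons (input or intermediate) from which $t$ is computed
* $\Delta t$: The difference-from-reference of the target
* $\Delta x_i$: The difference-from-reference of neuron $x_i$
* $C_{\Delta x_i \Delta t}$: The contribution score of $\Delta x_i$ to $\Delta t$

The contribution scores are required to sum to the difference-from-reference of the target:
$$\Delta t = t - t^0$$
$$\Delta t = \sum_{i=1}^n C_{\Delta x_i \Delta t}$$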
The advantage of the _difference-from-reference_ over a purely gradient-based method is that the _difference-from-reference_ varies continuously with the input, so it avoids the artifacts caused by discontinuities in the gradient, as seen in the following figure:
https://i.imgur.com/vLZytJT.png
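
The contrast can be sketched on a toy example. The snippet below assumes a single ReLU unit $y = \max(0, x - 10)$ with reference $x^0 = 0$; the unit, the bias value and the function names are illustrative choices, not taken from the figure.

```python
def unit(x, bias=-10.0):
    """Toy ReLU unit y = max(0, x + bias) (hypothetical example)."""
    return max(0.0, x + bias)


def grad_times_input(x, bias=-10.0):
    """Gradient * input: the gradient is 0 below the kink and 1 above it."""
    grad = 1.0 if x + bias > 0 else 0.0
    return grad * x


def diff_from_reference(x, x_ref=0.0, bias=-10.0):
    """Delta y = y(x) - y(x_ref), the quantity DeepLIFT decomposes."""
    return unit(x, bias) - unit(x_ref, bias)


# Compare both scores just below and just above the kink at x = 10.
for x in (9.9, 10.1, 12.0):
    print(x, grad_times_input(x), round(diff_from_reference(x), 6))
```

Across $x = 10$, gradient times input jumps from $0$ to about $10$, while the difference-from-reference only moves from $0$ to about $0.1$.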
### "Backpropagating" the contribution to the input
To compute the contribution to the input, the authors use a concept similar to the chain rule. They define a _multiplier_ $m_{\Delta x \Delta t}$ as follows:
$$m_{\Delta x \Delta t} = \frac{C_{\Delta x \Delta t}} {\Delta x}$$
Let $z$ be the output of a neuron, $y_j$ a neuron in the hidden layer before $z$, and $x_i$ an input neuron before $y_j$. We can then compute $m_{\Delta x_i \Delta z}$ as follows:
$$m_{\Delta x_i \Delta z}=\sum_j m_{\Delta x_i \Delta y_j} m_{\Delta y_j \Delta z}$$
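
A minimal sketch of this chain rule on a hypothetical 2-2-1 network is given below. It assumes that the multiplier of a linear connection is its weight and that the multiplier of a ReLU is $\Delta y / \Delta s$ (with $s$ the pre-activation); all weights, inputs and variable names are made up for illustration.

```python
def relu(v):
    return max(0.0, v)


# Hypothetical toy network: 2 inputs -> 2 ReLU hidden units -> 1 linear output.
W = [[1.0, -2.0], [0.5, 1.0]]  # W[j][i]: weight from x_i to hidden unit j
b = [0.0, -1.0]                # hidden biases
v = [2.0, -1.0]                # v[j]: weight from y_j to z
c = 0.5                        # output bias


def forward(x):
    s = [sum(W[j][i] * x[i] for i in range(2)) + b[j] for j in range(2)]
    y = [relu(sj) for sj in s]
    z = sum(v[j] * y[j] for j in range(2)) + c
    return s, y, z


x_ref = [0.0, 0.0]  # reference input (an arbitrary choice for this sketch)
x = [3.0, 1.0]
s0, y0, z0 = forward(x_ref)
s1, y1, z1 = forward(x)

# Multiplier of each ReLU: Delta y / Delta s, falling back to the gradient
# when Delta s is numerically zero.
m_relu = []
for j in range(2):
    ds, dy = s1[j] - s0[j], y1[j] - y0[j]
    m_relu.append(dy / ds if abs(ds) > 1e-7 else (1.0 if s1[j] > 0 else 0.0))

# m_{x_i -> y_j}: linear multiplier (the weight) times the ReLU multiplier.
m_xy = [[W[j][i] * m_relu[j] for i in range(2)] for j in range(2)]
# m_{y_j -> z}: the output layer is linear, so the multiplier is the weight.
m_yz = v

# Chain rule for multipliers: sum over the hidden units j.
m_xz = [sum(m_xy[j][i] * m_yz[j] for j in range(2)) for i in range(2)]

# Contributions C = m * Delta x; they should add up to Delta z.
contribs = [m_xz[i] * (x[i] - x_ref[i]) for i in range(2)]
print(contribs, sum(contribs), z1 - z0)
```

In this sketch the contributions of the two inputs add up to $\Delta z$, which is the decomposition property stated earlier.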
### Computing the contribution score
The authors argue that it can be beneficial in some cases to separate the positive and negative contributions, i.e.:
$$\Delta x_i = \Delta x_i^+ + \Delta x_i^-$$
$$C_{\Delta x_i \Delta t} = C_{\Delta x_i^+ \Delta t} + C_{\Delta x_i^- \Delta t}$$
The authors propose three rules to compute the contribution score:
1. The _linear rule_, applied to linear layers (i.e., without taking the nonlinearity into consideration), such that $C_{\Delta x_i \Delta t} = w_i \Delta x_i$, where $w_i$ is the weight applied to $x_i$.
2. The _rescale rule_, applied to a nonlinear function (e.g., $y=f(x)$), where the multiplier is $m_{\Delta x \Delta y} = \frac{\Delta y}{\Delta x}$. If $\Delta x$ is $0$ or very close to it (less than $10^{-7}$ in absolute value), the authors use the gradient instead of the multiplier to avoid numerical instability.
3. The _RevealCancel rule_ is similar to the _rescale rule_, but treats the positive and negative contributions separately. This makes it possible to capture dependencies (e.g., min/AND) that cannot be captured by the _rescale rule_ or other methods. The positive and negative parts of the difference-from-reference of the output are computed as follows:
$$\Delta y^+ = \frac{1}{2}\left(f(x^0 + \Delta x^+) - f(x^0)\right) + \frac{1}{2}\left(f(x^0 + \Delta x^+ + \Delta x^-) - f(x^0 + \Delta x^-)\right)$$
$$\Delta y^- = \frac{1}{2}\left(f(x^0 + \Delta x^-) - f(x^0)\right) + \frac{1}{2}\left(f(x^0 + \Delta x^+ + \Delta x^-) - f(x^0 + \Delta x^+)\right)$$
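
To illustrate the min dependency, the sketch below compares the two nonlinear rules on $o = i_1 - \max(0, i_1 - i_2) = \min(i_1, i_2)$ with an all-zero reference. The network construction, the helper names and the input values are assumptions made for this example; the rescale helper simply applies the single multiplier $\Delta y / \Delta x$ to both parts of $\Delta x$.

```python
def relu(v):
    return max(0.0, v)


def rescale(f, x0, dx_pos, dx_neg):
    """Rescale rule: one multiplier dy/dx shared by both parts of dx.

    Assumes dx_pos + dx_neg is not near zero (no gradient fallback here).
    """
    dx = dx_pos + dx_neg
    m = (f(x0 + dx) - f(x0)) / dx
    return m * dx_pos, m * dx_neg


def reveal_cancel(f, x0, dx_pos, dx_neg):
    """RevealCancel rule: average the effect of each part of dx, computed
    once without and once with the other part already applied."""
    dy_pos = 0.5 * (f(x0 + dx_pos) - f(x0)) \
        + 0.5 * (f(x0 + dx_pos + dx_neg) - f(x0 + dx_neg))
    dy_neg = 0.5 * (f(x0 + dx_neg) - f(x0)) \
        + 0.5 * (f(x0 + dx_pos + dx_neg) - f(x0 + dx_pos))
    return dy_pos, dy_neg


def min_contributions(rule, i1, i2):
    """Contributions of i1 and i2 to o = i1 - relu(i1 - i2), reference 0.

    Assumes i1 > 0 and i2 > 0, so that in h = i1 - i2 the positive part
    of Delta h comes from i1 and the negative part from i2.
    """
    via_i1, via_i2 = rule(relu, 0.0, i1, -i2)
    # i1 also feeds o directly with weight +1; relu(h) enters with weight -1.
    return i1 - via_i1, -via_i2


print(min_contributions(rescale, 3.0, 1.0))
print(min_contributions(reveal_cancel, 3.0, 1.0))
```

With $i_1 = 3$ and $i_2 = 1$, the rescale helper assigns the whole output to $i_2$, whereas the RevealCancel helper splits it evenly between the two inputs.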
