Summary by David Stutz 6 months ago
Tanay and Griffin introduce the boundary tilting perspective as alternative to the “linear explanation” for adversarial examples. Specifically, they argue that it is not reasonable to assume that the linearity in deep neural networks causes the existence of adversarial examples. Originally, Goodfellow et al. [1] explained the impact of adversarial examples by considering a linear classifier:
$w^T x' = w^Tx + w^T\eta$
where $\eta$ is the adversarial perturbations. In large dimensions, the second term might result in a significant shift of the neuron's activation. Tanay and Griffin, in contrast, argue that the dimensionality does not have an impact; althought he impact of $w^T\eta$ grows with the dimensionality, so does $w^Tx$, such that the ratio should be preserved. Additionally, they showed (by giving a counter-example) that linearity is not sufficient for the existence of adversarial examples.
Instead, they offer a different perspective on the existence of adversarial examples that is, in the course of the paper, formalized. Their main idea is that the training samples live on a manifold in the actual input space. The claim is, that on the manifold there are no adversarial examples (meaning that the classes are well separated on the manifold and it is hard to find adversarial examples for most training samples). However, the decision boundary extends beyond the manifold and might lie close to the manifold such that adversarial examples leaving the manifold can be found easily. This idea is illustrated in Figure 1.
https://i.imgur.com/SrviKgm.png
Figure 1: Illustration of the underlying idea of the boundary tilting perspective, see the text for details.
[1] Ian J. Goodfellow, Jonathon Shlens, Christian Szegedy:
Explaining and Harnessing Adversarial Examples. CoRR abs/1412.6572 (2014)
Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).

more
less