[link]
Summary by David Stutz 2 months ago
Schott et al. propose an analysis-by-synthetis approach for adversarially robust MNIST classification. In particular, as illustrated in Figure 1, class-conditional variational auto-encoders (i.e., one variational auto-encoder per class) are learned. The respective recognition models, i.e., encoders, are discarded. For classification, the optimization problem
$l_y^*(x) = \max_z \log p(x|z) - \text{KL}(\mathcal{N}(z, \sigma I)|\mathcal{N}(0,1))$
is solved for each class $z$. Here, $p(x|z)$ represents the learned generative model. The optimization problem leads a latent code $z$ corresponding to the best reconstruction of the input. The corresponding likelihood can be used for classificaiton using Bayes’ theorem. The obtained posteriors $p(y|x)$ are then scaled using a modified softmax (see paper) to obtain the final decision. (Additionally, input binarization is used as defense.)
https://i.imgur.com/ignvoHQ.png
Figure 1: The proposed analysis by synthesis approach to MNIST classification. The depicted generators are taken from class-specific variational auto-encoders.
In addition to the proposed defense, Schott et al. also derive lower and upper bounds on the robustness of the classification procedure. These bounds can be derived from the optimization problem above, see the paper for details.
In experiments, they show that their defense outperforms state-of-the-art adversarial training and allows to estimate tight bounds. In addition, the method is robust against distal adversarial examples and the adversarial examples look more meaningful, see Figure 2.
https://i.imgur.com/uxGzzg1.png
Figure 2: Adversarial examples for the proposed “ABS” method, its binary variant and related work.
Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).

more
less