Detecting Adversarial Samples from Artifacts
Paper summary

Feinman et al. use dropout to compute an uncertainty measure that helps to identify adversarial examples. Their so-called Bayesian Neural Network Uncertainty is computed as $\frac{1}{T} \sum_{i=1}^T \hat{y}_i^T \hat{y}_i - \left(\frac{1}{T}\sum_{i=1}^T \hat{y}_i\right)^T\left(\frac{1}{T}\sum_{i=1}^T \hat{y}_i\right)$ where $\{\hat{y}_1,\ldots,\hat{y}_T\}$ is a set of stochastic predictions, i.e. predictions obtained with different noise patterns in the dropout layers. It can easily be seen that this measure corresponds to a variance computation, where the first term is the (non-central) second moment and the second term is the product of expectations. In Figure 1, the authors illustrate the distributions of this uncertainty measure for regular training samples, adversarial samples and noisy samples for two attacks (BIM and JSMA, see the paper for details).

https://i.imgur.com/kTWTHb5.png

Figure 1: Uncertainty distributions for two attacks (BIM and JSMA, see the paper for details) on normal samples, adversarial samples and noisy samples.

Also see this summary at [davidstutz.de](https://davidstutz.de/category/reading/).
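The uncertainty measure above is straightforward to compute once the stochastic predictions are collected. Below is a minimal NumPy sketch, not the authors' code; it assumes the $T$ softmax vectors come from forward passes with dropout left enabled at test time, and the helper `stochastic_forward` in the usage comment is hypothetical.

```python
import numpy as np

def bnn_uncertainty(stochastic_preds):
    """Dropout-based uncertainty from T stochastic softmax predictions.

    stochastic_preds: array of shape (T, num_classes), one softmax vector
    per forward pass with dropout kept active at test time.
    Returns U(x) = 1/T * sum_i y_i^T y_i - y_bar^T y_bar.
    """
    preds = np.asarray(stochastic_preds, dtype=np.float64)
    # First term: average squared norm of the stochastic predictions.
    second_moment = np.mean(np.sum(preds ** 2, axis=1))
    # Second term: squared norm of the mean prediction.
    mean_pred = preds.mean(axis=0)
    return second_moment - np.dot(mean_pred, mean_pred)

# Hypothetical usage: `stochastic_forward(model, x)` is assumed to run one
# forward pass with a fresh dropout mask and return a softmax vector.
# preds = np.stack([stochastic_forward(model, x) for _ in range(50)])
# print(bnn_uncertainty(preds))  # larger values suggest an adversarial input
```

As in the summary, this is just the variance identity $\mathbb{E}[\hat{y}^T\hat{y}] - \mathbb{E}[\hat{y}]^T\mathbb{E}[\hat{y}]$ estimated over the $T$ dropout samples.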
Reuben Feinman and Ryan R. Curtin and Saurabh Shintre and Andrew B. Gardner
arXiv e-Print archive, 2017
Keywords: stat.ML, cs.LG

Summary by David Stutz