Simple Black-Box Adversarial Perturbations for Deep Networks on ShortScience.org


Simple Black-Box Adversarial Perturbations for Deep Networks
Nina Narodytska and Shiva Prasad Kasiviswanathan
arXiv e-Print archive - 2016 via Local arXiv
Keywords: cs.LG, cs.CR, stat.ML

Summaries/Notes 2

Summary by David Stutz, 5 years ago

Narodytska and Kasiviswanathan propose a local search-based black-box adversarial attack against deep networks. In particular, they address the problem of k-misclassification, defined as follows:

Definition (k-misclassification). A neural network k-misclassifies an image if the true label is not among the k likeliest labels.
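
To make the definition concrete, here is a minimal Python sketch of the check. The function name `is_k_misclassified` and the example scores are hypothetical and only serve as illustration; they are not taken from the paper.

```python
def is_k_misclassified(scores, true_label, k):
    """Return True if true_label is not among the k highest-scoring labels."""
    # Rank label indices by score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return true_label not in ranked[:k]


scores = [0.05, 0.60, 0.25, 0.10]  # hypothetical class probabilities
print(is_k_misclassified(scores, true_label=2, k=1))  # True: label 2 is not the top-1 label
print(is_k_misclassified(scores, true_label=2, k=2))  # False: label 2 is the second likeliest
```

For k = 1 this reduces to ordinary misclassification; larger k makes the condition stricter for the attacker.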

To this end, they propose a local search algorithm which, in each round, randomly perturbs individual pixels in a local search area around the last perturbation. If a perturbed image satisfies the k-misclassification condition, it is returned as an adversarial example. While the approach is very simple, it is applicable to black-box models where gradients and/or internal representations are not accessible and only the final scores/probabilities are available. Still, the approach seems to be quite inefficient, taking one or more seconds to generate an adversarial example. Unfortunately, the authors do not discuss qualitative results and show only a few adversarial examples (the four in Figure 1).

https://i.imgur.com/RAjYlaQ.png
Figure 1: Examples of adversarial attacks. Top: original image, bottom: perturbed image.
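
The local search described above can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not the authors' exact procedure: the names `local_search_attack` and `brightness_model`, the parameters `rounds`, `pixels_per_round` and `radius`, the choice of pushing pixels to 0 or 1, and the rule of keeping the candidate with the lowest true-label score are all hypothetical. The only model access is a `predict` callable returning class scores, which matches the black-box setting.

```python
import random


def is_k_misclassified(scores, true_label, k):
    """Return True if true_label is not among the k highest-scoring labels."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return true_label not in ranked[:k]


def local_search_attack(predict, image, true_label, k=1, rounds=100,
                        pixels_per_round=5, radius=2, seed=0):
    """Simplified local search sketch (assumed details, not the paper's algorithm).

    predict: black-box callable mapping an image to a list of class scores.
    image:   2D list of floats in [0, 1]; it is not modified.
    Returns a perturbed copy, or None if no adversarial image was found.
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    current = [row[:] for row in image]
    # Start at a random pixel; later rounds search around the last change.
    cy, cx = rng.randrange(height), rng.randrange(width)
    for _ in range(rounds):
        best = None  # (true-label score, y, x, value)
        for _ in range(pixels_per_round):
            # Sample a pixel within `radius` of the last perturbed pixel.
            y = min(max(cy + rng.randint(-radius, radius), 0), height - 1)
            x = min(max(cx + rng.randint(-radius, radius), 0), width - 1)
            value = rng.choice([0.0, 1.0])  # assumption: push to an extreme
            old = current[y][x]
            current[y][x] = value
            scores = predict(current)  # only scores are queried, no gradients
            current[y][x] = old
            if best is None or scores[true_label] < best[0]:
                best = (scores[true_label], y, x, value)
        # Keep the candidate with the lowest true-label score in this round.
        _, cy, cx, value = best
        current[cy][cx] = value
        if is_k_misclassified(predict(current), true_label, k):
            return current
    return None


def brightness_model(img):
    """Toy stand-in for a network: class 0 = 'bright', class 1 = 'dark'."""
    mean = sum(sum(row) for row in img) / (len(img) * len(img[0]))
    return [mean, 1.0 - mean]


image = [[0.8] * 4 for _ in range(4)]
adversarial = local_search_attack(brightness_model, image, true_label=0)
print("adversarial image found" if adversarial is not None
      else "no adversarial image found")
```

Each round costs `pixels_per_round + 1` model queries, which hints at why such an attack can be slow on a real network.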
