Universal adversarial perturbations on ShortScience.org

Universal adversarial perturbations
Seyed-Mohsen Moosavi-Dezfooli and Alhussein Fawzi and Omar Fawzi and Pascal Frossard
arXiv e-Print archive - 2016 via Local arXiv
Keywords: cs.CV, cs.AI, cs.LG, stat.ML

Summaries/Notes 1

Summary by David Stutz 5 years ago

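Moosavi-Dezfooli et al. propose universal adversarial perturbations, i.e., perturbations that are image-agnostic: a single perturbation is meant to fool the network on many different images. Specifically, they extend the usual framework for crafting adversarial examples, in which a perturbation for one image is found by iteratively solving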

$\arg\min_r \|r \|_2$ s.t. $f(x + r) \neq f(x)$.

Here, $r$ denotes the adversarial perturbation, $x$ a training sample and $f$ the neural network. Instead of solving this problem for a specific $x$, the authors propose to solve it over the full training set: in each iteration, a different sample $x$ is chosen, one step in the direction of the gradient is taken, and the shared perturbation is updated accordingly. In experiments, they show that these universal perturbations are indeed able to fool networks on several images; in addition, the perturbations are sometimes transferable to other networks.
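The loop described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the linear classifier, the toy data, the normalized gradient step, the $L_2$ projection radius and all function names (`predict`, `project_l2`, `universal_perturbation`, `fooling_rate`) are assumptions made for illustration, and the inner step is a plain gradient step rather than whatever per-sample solver the paper uses.

```python
import numpy as np

def predict(W, b, x):
    """Label of a toy linear classifier f(x) = argmax(Wx + b)."""
    return int(np.argmax(W @ x + b))

def project_l2(v, radius):
    """Project v onto the L2 ball of the given radius (assumed constraint)."""
    norm = np.linalg.norm(v)
    return v if norm <= radius else v * (radius / norm)

def universal_perturbation(W, b, X, radius=1.0, step=0.25, epochs=5):
    """Accumulate one shared perturbation v over all samples in X."""
    v = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x in X:
            label = predict(W, b, x)
            if predict(W, b, x + v) != label:
                continue  # x + v is already misclassified, nothing to do
            # Pick the runner-up class at x + v as the class to push towards.
            scores = W @ (x + v) + b
            scores[label] = -np.inf
            target = int(np.argmax(scores))
            # For a linear model, the input gradient of
            # (score_target - score_label) is W[target] - W[label].
            grad = W[target] - W[label]
            v = project_l2(v + step * grad / np.linalg.norm(grad), radius)
    return v

def fooling_rate(W, b, X, v):
    """Fraction of samples whose predicted label changes under v."""
    return float(np.mean([predict(W, b, x + v) != predict(W, b, x) for x in X]))

# Hypothetical toy problem: two classes separated by the line x[0] = 0.
W = np.array([[1.0, 0.0], [-1.0, 0.0]])
b = np.zeros(2)
X = np.array([[0.3, 0.0], [0.6, 1.0], [0.4, -1.0], [-3.0, 0.0]])

v = universal_perturbation(W, b, X)
print("perturbation:", v, "fooling rate:", fooling_rate(W, b, X, v))
```

On this toy problem the samples from the two classes pull the shared perturbation in opposite directions, so only some of them end up fooled. This matches the summary's point that a universal perturbation fools the network on several images rather than on all of them.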

Also view this summary on [davidstutz.de](https://davidstutz.de/category/reading/).
