A Theoretical Framework for Robustness of (Deep) Classifiers against Adversarial Examples
Paper summary: Wang et al. discuss an alternative definition of adversarial examples that takes an oracle classifier into account. Adversarial perturbations are usually constrained in their norm (e.g., the $L_\infty$ norm for images); however, the main purpose of this constraint is to ensure label invariance: if the image did not change notably, the label did not change either. As an alternative formulation, the authors consider an oracle for the task, e.g., humans for image classification. An adversarial example is then defined as a slightly perturbed input whose predicted label changes while its true label (i.e., the oracle's label) does not. Additionally, the perturbation can be constrained in some norm; specifically, it can be constrained to the true data manifold, as represented by the oracle classifier. Based on this notion of adversarial examples, Wang et al. argue that deep neural networks are not robust because they use over-complete feature representations. Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).
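To make the oracle-based definition concrete, here is a minimal Python sketch. It is not code from the paper: the function names, the toy `oracle` and `classifier`, and the choice of the $L_\infty$ norm as the perturbation budget are all assumptions made for illustration. The toy classifier deliberately depends on a second feature that the oracle ignores, which is one loose way to read the over-completeness argument.

```python
from typing import Callable, Sequence

Label = int
Model = Callable[[Sequence[float]], Label]


def linf_distance(x: Sequence[float], x_perturbed: Sequence[float]) -> float:
    """Largest absolute coordinate-wise difference between two inputs."""
    if len(x) != len(x_perturbed):
        raise ValueError("inputs must have the same length")
    return max((abs(a - b) for a, b in zip(x, x_perturbed)), default=0.0)


def is_adversarial(
    x: Sequence[float],
    x_perturbed: Sequence[float],
    classifier: Model,
    oracle: Model,
    epsilon: float,
) -> bool:
    """Check the oracle-based definition described in the summary.

    The perturbed input counts as adversarial if it stays within the
    (assumed) L_inf budget, the classifier's prediction changes, and
    the oracle's label stays the same.
    """
    within_budget = linf_distance(x, x_perturbed) <= epsilon
    prediction_changed = classifier(x) != classifier(x_perturbed)
    true_label_unchanged = oracle(x) == oracle(x_perturbed)
    return within_budget and prediction_changed and true_label_unchanged


# Hypothetical toy models: the oracle only looks at the first feature,
# while the classifier also relies on a second, task-irrelevant feature.
def oracle(x: Sequence[float]) -> Label:
    return int(x[0] > 0.0)


def classifier(x: Sequence[float]) -> Label:
    return int(x[0] + x[1] > 0.0)


x = [1.0, -0.9]
x_perturbed = [1.0, -1.1]  # only the feature the oracle ignores is changed

print(is_adversarial(x, x_perturbed, classifier, oracle, epsilon=0.25))
```

In this toy setting, a small change to the second feature flips the classifier's prediction while the oracle's label stays fixed, so the check succeeds. A perturbation that also changes the oracle's label would not count as adversarial under this definition, regardless of its norm.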
A Theoretical Framework for Robustness of (Deep) Classifiers against Adversarial Examples
Beilun Wang and Ji Gao and Yanjun Qi
arXiv e-Print archive - 2016 via Local arXiv
Keywords: cs.LG, cs.CR, cs.CV

Summary by David Stutz 1 month ago