Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
Paper summary
Athalye et al. propose methods to circumvent defenses against adversarial examples that rely on obfuscated gradients. In particular, they identify three types of obfuscated gradients: shattered gradients (e.g., caused by non-differentiable parts of a network or by numerical instability), stochastic gradients, and exploding or vanishing gradients. All of these phenomena reduce the effectiveness of gradient-based attacks. Athalye et al. give several indicators for detecting when obfuscated gradients occur. Personally, I find most of these points straightforward, but it is still beneficial to write these “debug strategies” down. The main contribution, however, is a comprehensive evaluation of all eight ICLR’18 defenses against state-of-the-art attacks. As all of them (except adversarial training) cause obfuscated gradients, Athalye et al. discuss several strategies to “un-obfuscate” the gradients in order to successfully compute adversarial examples. Overall, they show that seven out of eight defenses are not reliable; only adversarial training with projected gradient descent can withstand attacks limited to $\epsilon\approx 0.3$. Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).
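To make the idea of shattered gradients and their "un-obfuscation" more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' code. It assumes a hypothetical quantization defense (`quantize`) in front of a toy linear classifier, with weights, input, step size, and iteration count chosen purely for illustration. The attack runs the defense on the forward pass but treats it as the identity on the backward pass, which is the approach the paper calls Backward Pass Differentiable Approximation (BPDA).

```python
import numpy as np

def quantize(x, levels=4):
    """Hypothetical input-transformation defense: snap each value to a coarse grid."""
    return np.round(x * levels) / levels

def defended_logit(x, w, b):
    """Toy linear classifier behind the defense; a positive logit means class 1."""
    return float(w @ quantize(x) + b)

def numerical_grad(x, w, b, h=1e-6):
    """Central finite differences through the defense (what a naive attacker sees)."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        grad[i] = (defended_logit(x + e, w, b) - defended_logit(x - e, w, b)) / (2 * h)
    return grad

def bpda_grad(x, w):
    """Backward pass with quantize replaced by the identity.

    For the linear toy model, d logit / d x is then simply w.
    """
    return w

def pgd_bpda(x0, w, eps=0.3, step=0.05, iters=20):
    """Projected gradient descent on the logit using the approximated gradient."""
    x = x0.copy()
    for _ in range(iters):
        x = x - step * np.sign(bpda_grad(x, w))  # push the logit down
        x = np.clip(x, x0 - eps, x0 + eps)       # stay inside the L-infinity ball
        x = np.clip(x, 0.0, 1.0)                 # stay inside the valid input range
    return x

# Illustrative values only (assumptions, not taken from the paper).
w = np.array([1.0, -1.0, 0.5, 2.0])
b = -1.0
x0 = np.array([0.6, 0.2, 0.5, 0.7])

x_adv = pgd_bpda(x0, w)
print("finite-difference gradient:", numerical_grad(x0, w, b))
print("clean logit:", defended_logit(x0, w, b))
print("adversarial logit:", defended_logit(x_adv, w, b))
```

In this toy setting the finite-difference gradient through the defense is zero in every coordinate, so a naive gradient-based attack has nothing to follow, while the identity approximation still finds a perturbation within $\epsilon = 0.3$ that flips the sign of the logit.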
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
Athalye, Anish and Carlini, Nicholas and Wagner, David A.
arXiv e-Print archive - 2018 via Local Bibsonomy
Keywords: dblp


Summary by David Stutz 5 months ago