Defensive Distillation is Not Robust to Adversarial Examples
Paper summary

Carlini and Wagner show that defensive distillation as a defense against adversarial examples does not work. Specifically, they show that the attack by Papernot et al. [1] can easily be modified to attack distilled networks. Interestingly, the main change is to introduce a temperature in the last softmax layer. When this temperature is chosen high enough, it aligns the gradients taken through the softmax layer with those taken at the logit layer; otherwise, the two differ significantly in magnitude (see the sketch after the references). Personally, I found that this also aligns with the observations in [2], where Carlini and Wagner find that attack objectives defined on the logits work considerably better.

[1] N. Papernot, P. McDaniel, X. Wu, S. Jha, A. Swami. Distillation as a defense to adversarial perturbations against deep neural networks. SP, 2016.

[2] N. Carlini, D. Wagner. Towards Evaluating the Robustness of Neural Networks. arXiv, 2016.

Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).
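To make the gradient-alignment point concrete, here is a minimal PyTorch sketch (not from the paper; the toy linear "network", the logit scale, and the temperature T = 100 are assumptions chosen purely for illustration). It compares the input gradient of the target-class softmax probability at temperature 1, the same quantity with the logits divided by T, and the gradient of the raw target logit.

```python
import torch

torch.manual_seed(0)

# Toy stand-in for a distilled network: a linear "logit layer" whose outputs
# are large in magnitude, as happens when a network is trained with
# distillation at high temperature and then evaluated at temperature 1.
# (The weight scale is an assumption chosen to make the softmax saturate.)
weights = torch.randn(10, 784) * 2.0

def logits_fn(z):
    return z @ weights.t()

x = torch.rand(1, 784)
target = 3        # class whose score the attacker wants to increase
T = 100.0         # assumed distillation temperature

def input_grad_norm(objective_fn):
    """Mean absolute gradient of objective_fn(x) with respect to the input."""
    xg = x.clone().requires_grad_(True)
    objective_fn(xg).backward()
    return xg.grad.abs().mean().item()

# 1) Softmax at temperature 1 (the deployed, distilled network): the softmax
#    is saturated, so the gradient of the target probability is ~0.
p_t1 = lambda z: torch.softmax(logits_fn(z), dim=1)[0, target]
print("softmax, T=1    :", input_grad_norm(p_t1))

# 2) Softmax at the training temperature T: dividing the logits by T removes
#    the saturation and the gradient magnitude becomes usable again.
p_tT = lambda z: torch.softmax(logits_fn(z) / T, dim=1)[0, target]
print("softmax, T=100  :", input_grad_norm(p_tT))

# 3) Objective defined directly on the logit (as in [2]): no softmax at all.
logit_obj = lambda z: logits_fn(z)[0, target]
print("logit objective :", input_grad_norm(logit_obj))
```

In this toy setup the temperature-1 softmax is saturated, so its input gradient is numerically close to zero, while dividing the logits by T (or optimizing the logit directly, as in [2]) restores gradients of useful magnitude, which is why the modified attack goes through.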
Carlini, Nicholas and Wagner, David A.
arXiv e-Print archive - 2016 via Local Bibsonomy