ZOO: Zeroth Order Optimization based Black-box Attacks to Deep Neural Networks without Training Substitute Models ZOO: Zeroth Order Optimization based Black-box Attacks to Deep Neural Networks without Training Substitute Models
Paper summary Chen et al. propose a gradient-based black-box attack to compute adversarial examples. Specifically, they follow the general idea of [1] where the following objective is optimized: $\min_x \|x – x_0\|_2 + c \max\{\max_{i\neq t}\{z_i\} – z_t, - \kappa\}$. Here, $x$ is the adversarial example based on training sample $x_0$. The second part expresses that $x$ is supposed to be misclassified, i.e. the logit $z_i$ for some $i \neq t$ distinct form the true label $t$ is supposed to be larger that the logit $z_t$ corresponding to the true label. This is optimized subject to the constraint that $x$ is a valid image. The attack proposed in [1] assumes a white-box setting were we have access to the logits and the gradients (basically requiring access to the full model). Chen et al., in contrast want to design a black-box attacks. Therefore, they make the following changes: - Instead of using logits $z_i$, the probability distribution $f_i$ (i.e. the actual output of the network) is used. - Gradients are approximated by finite differences. Personally, I find that the first point does violate a strict black-box setting. As company, for example, I would prefer not to give away the full probability distribution but just the final decision (or the decision plus a confidence score). Then, however, the proposed method is not applicable anymore. Anyway, the changed objective looks as follows: $\min_x \|x – x_0\|_2 + c \max\{\max_{i\neq t}\{\log f_i\} – \log f_t, - \kappa\}$ where, according to the authors, the logarithm is essential for optimization. One remaining problem is efficient optimization with finite differences. To this end, they propose a randomized/stochastic coordinate descent algorithm. In particular, in each step, a ranodm pixel is chosen and a local update is performed by calculating the gradient on this pixel using finite differences and performing an ADAM step. [1] N. Carlini, D. Wagner. Towards evaluating the robustness of neural networks. IEEE Symposium of Security and Privacy, 2017. Also view this summary at [davidstutz.de](https://davidstutz.de/category/reading/).
arxiv.org
arxiv-sanity.com
scholar.google.com
ZOO: Zeroth Order Optimization based Black-box Attacks to Deep Neural Networks without Training Substitute Models
Pin-Yu Chen and Huan Zhang and Yash Sharma and Jinfeng Yi and Cho-Jui Hsieh
arXiv e-Print archive - 2017 via Local arXiv
Keywords: stat.ML, cs.CR, cs.LG

more

Summary by David Stutz 1 month ago
Loading...
Your comment:


ShortScience.org allows researchers to publish paper summaries that are voted on and ranked!
About

Sponsored by: and