Brendel et al. propose a decision-based black-box attack against (deep convolutional) neural networks. Specifically, the so-called Boundary Attack starts with a random adversarial example (i.e., random noise that is not classified as the image to be attacked) and randomly perturbs this initialization to move closer to the target image while remaining misclassified. The algorithm is given as pseudo code in Algorithm 1. The key component is the proposal distribution $P$ used to guide the adversarial perturbation in each step. In practice, they use a maximum-entropy distribution (e.g., uniform) subject to a few constraints, where $\tilde{o}^{k-1}$ denotes the adversarial example after step $k-1$: the perturbed sample is a valid image; the perturbation has a specified relative size, i.e., $\|\eta^k\|_2 = \delta \cdot d(o, \tilde{o}^{k-1})$; and the perturbation reduces the distance to the target image $o$ by a relative amount $\epsilon$: $d(o, \tilde{o}^{k-1}) - d(o, \tilde{o}^{k-1} + \eta^k) = \epsilon \cdot d(o, \tilde{o}^{k-1})$. This is approximated by sampling from a standard Gaussian, clipping and rescaling, and projecting the perturbation onto the $\epsilon$-sphere around the image. In experiments, they show that this attack is competitive with white-box attacks and can attack real-world systems.

https://i.imgur.com/BmzhiFP.png

Algorithm 1: Minimal pseudo code version of the boundary attack.

Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).
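To make the three constraints more concrete, the following is a minimal sketch of one possible reading of the procedure. It is not the authors' implementation. The names `boundary_attack`, `is_adversarial`, `l2` and `clip01` are hypothetical, images are assumed to be flat lists of floats in $[0, 1]$, $d$ is assumed to be the Euclidean distance, and $\delta$ and $\epsilon$ are kept fixed for simplicity. The model is only queried for its decision, through `is_adversarial`.

```python
import math
import random


def l2(a, b):
    """Euclidean distance d(a, b) between two flat images."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def clip01(v):
    """Clip every pixel to [0, 1] so the sample stays a valid image."""
    return [min(1.0, max(0.0, x)) for x in v]


def boundary_attack(is_adversarial, original, steps=500,
                    delta=0.1, epsilon=0.05, rng=None):
    """Sketch of a decision-based attack; delta and epsilon are fixed here."""
    rng = rng or random.Random(0)
    n = len(original)

    # Initialization: uniform noise that the model already misclassifies.
    for _ in range(1000):
        adv = [rng.random() for _ in range(n)]
        if is_adversarial(adv):
            break
    else:
        raise RuntimeError("no adversarial starting point found")

    for _ in range(steps):
        dist = l2(original, adv)

        # Gaussian direction, rescaled so that ||eta||_2 = delta * d(o, adv).
        eta = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(e * e for e in eta))
        cand = [a + delta * dist * e / norm for a, e in zip(adv, eta)]

        # Project back onto the sphere of radius d(o, adv) around the original.
        cand_dist = l2(original, cand)
        cand = [o + (c - o) * dist / cand_dist for o, c in zip(original, cand)]

        # Move towards the original: the distance shrinks by a factor (1 - epsilon).
        cand = [o + (c - o) * (1.0 - epsilon) for o, c in zip(original, cand)]

        # Keep the candidate a valid image.
        cand = clip01(cand)

        # Accept the step only if the model's decision is still "misclassified".
        if is_adversarial(cand):
            adv = cand
    return adv


# Toy usage with a hypothetical stand-in "model": an input counts as
# adversarial if its mean intensity exceeds 0.5 (the original has mean 0.2).
original = [0.2] * 16


def toy_is_adversarial(x):
    return sum(x) / len(x) > 0.5


adversarial = boundary_attack(toy_is_adversarial, original)
```

Because rejected candidates are discarded and the original itself lies in $[0, 1]$, clipping cannot move a candidate away from it, so the distance to the original never increases in this sketch. The random-walk component along the sphere is what lets the attack slide along the decision boundary rather than stall at the first point where a straight step towards the original would be rejected.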
