On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models on ShortScience.org

arxiv.org
arxiv-vanity.com
scholar.google.com

On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models
Sven Gowal and Krishnamurthy Dvijotham and Robert Stanforth and Rudy Bunel and Chongli Qin and Jonathan Uesato and Relja Arandjelovic and Timothy Mann and Pushmeet Kohli
arXiv e-Print archive - 2018 via Local arXiv
Keywords: cs.LG, cs.CR, stat.ML
more

Summaries/Notes 1

[link] Summary by David Stutz 4 years ago

Gowal et al. propose interval bound propagation to obtain certified robustness against adversarial examples. In particular, given a neural network consisting of linear layers and monotonic increasing activation functions, a set of allowed perturbations is propagated to obtain upper and lower bounds at each layer. These lead to bounds on the logits of the network; these are used to verify whether the network changes its prediction on the allowed perturbations. Specifically, Gowal et al. consider an $L_\infty$ ball around input examples; the initial bounds are, thus, $\underline{z}_0 = x - \epsilon$ and $\overline{z}_0 = x + \epsilon$. For each layer, the bounds are defined as

$\underline{z}_{k,i} = \min_{\underline{z}_{k – 1} \leq z_{k – 1} \leq \overline{z}_{k-1}} e_i^T h_k(z_{k – 1})$

and the analogous maximization problem for the upper bound; here, $h$ denotes the applied layer. For Linear layers and monotonic activation functions, this is easy to solve, as shown in the paper. Moreover, computing these bounds is very efficient, only needing roughly two times the computation of one forward pass. During training, a combination of a clean loss and adversarial loss is used:

$\kappa l(z_K, y) + (1 - \kappa) l(\hat{z}_K, y)$

where $z_K$ are the logits of the input $x$, and $\hat{z}_K$ are the adversarial logits computed as

$\hat{Z}_{K,y’} = \begin{cases} \overline{z}_{K,y’} & \text{if } y’ \neq y\\\underline{z}_{K,y} & \text{otherwise}\end{cases}$

Both $\epsilon$ and $\kappa$ are annealed during training. In experiments, it is shown that this method results in quite tight bounds on robustness.

Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).

Your comment:

Write your summary here (You can use $\LaTeX$ and markdown syntax):

Anon Private