Certified Adversarial Robustness via Randomized Smoothing
Jeremy M Cohen and Elan Rosenfeld and J. Zico Kolter, 2019
Paper summary by davidstutz
Cohen et al. study robustness bounds for randomized smoothing, a region-based classification scheme in which the prediction is averaged over Gaussian samples around the test input. Specifically, given a test input, the predicted class is the class whose decision region has the largest overlap with a normal distribution of pre-defined variance centered at that input. The intuition is that, under small perturbations of the input, this overlap with each class's decision region cannot change much. In the paper, Cohen et al. show that this approach certifies robustness within an $\ell_2$ radius $R$ that depends on the gap between the probability of the predicted class and that of the “runner-up” class under the noise. In practice, randomized smoothing is applied using samples, so these probabilities are only estimated and the certified radii also depend on the number of samples used.
Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).
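To make the summary concrete, here is a minimal sketch of prediction by majority vote over Gaussian samples, together with the radius $R = \frac{\sigma}{2}\left(\Phi^{-1}(p_A) - \Phi^{-1}(p_B)\right)$ from the paper, where $p_A$ and $p_B$ are the probabilities of the top and runner-up class under the noise. The toy `base_classifier` and the helper names are hypothetical and only serve as illustration. The sketch also plugs in probabilities directly, whereas a real certificate would use a high-confidence lower bound on $p_A$ and an upper bound on $p_B$ estimated from samples.

```python
import random
from collections import Counter
from statistics import NormalDist


def base_classifier(x):
    """Hypothetical toy base classifier: class 1 if the first coordinate is positive."""
    return 1 if x[0] > 0.0 else 0


def sample_votes(f, x, sigma, n, rng):
    """Count the classes that f predicts on n Gaussian-perturbed copies of x."""
    votes = Counter()
    for _ in range(n):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[f(noisy)] += 1
    return votes


def smoothed_predict(f, x, sigma, n, rng):
    """Return the class that wins the majority vote under Gaussian noise."""
    return sample_votes(f, x, sigma, n, rng).most_common(1)[0][0]


def certified_radius(p_a, p_b, sigma):
    """l2 radius sigma/2 * (Phi^{-1}(p_a) - Phi^{-1}(p_b)).

    Assumes p_a and p_b are valid bounds strictly between 0 and 1;
    the confidence-bound estimation step is deliberately left out here.
    """
    inv = NormalDist().inv_cdf
    return 0.5 * sigma * (inv(p_a) - inv(p_b))


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1.0, 0.0]  # lies at distance 1.0 from the toy decision boundary
    print(smoothed_predict(base_classifier, x, sigma=0.5, n=1000, rng=rng))
    print(certified_radius(0.9, 0.1, sigma=0.5))
```

For this linear toy classifier the exact class probabilities at `x` are $\Phi(2)$ and $\Phi(-2)$, and the formula then returns a radius of 1.0, which is exactly the distance from `x` to the decision boundary; this matches the tightness of the guarantee stated in the abstract.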
First published: 2019/02/08
Abstract: We show how to turn any classifier that classifies well under Gaussian noise
into a new classifier that is certifiably robust to adversarial perturbations
under the $\ell_2$ norm. This "randomized smoothing" technique has been
proposed recently in the literature, but existing guarantees are loose. We
prove a tight robustness guarantee in $\ell_2$ norm for smoothing with Gaussian
noise. We use randomized smoothing to obtain an ImageNet classifier with e.g. a
certified top-1 accuracy of 49% under adversarial perturbations with $\ell_2$
norm less than 0.5 (=127/255). No certified defense has been shown feasible on
ImageNet except for smoothing. On smaller-scale datasets where competing
approaches to certified $\ell_2$ robustness are viable, smoothing delivers
higher certified accuracies. Our strong empirical results suggest that
randomized smoothing is a promising direction for future research into
adversarially robust classification. Code and models are available at