[link]
Kanbak et al. propose ManiFool, a method to determine a network’s invariance to transformations by iteratively finding adversarial transformations. In particular, given a class of transformations to consider, ManiFool iteratively alternates two steps. First, a gradient step is taken in order to move into an adversarial direction; then, the obtained perturbation/direction is projected back to the space of allowed transformations. While the details are slightly more involved, I found that this approach is similar to the general projected gradient ascent approach to finding adversarial examples. By finding worstcase transformations for a set of test samples, Kanbak et al. Are able to quantify the invariance of a network against specific transformations. Furthermore, they show that adversarial finetuning using the found adversarial transformations allows to boost invariance, while only incurring a small loss in general accuracy. Examples of the found adversarial transformations are shown in Figure 1. https://i.imgur.com/h83RdE8.png Figure 1: The proposed attack method allows to consider different classes of transformations as shown in these examples. Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).
Your comment:
