Madry et al. interpret training on adversarial examples as a saddle-point (i.e. min-max) problem. Based on this formulation, they conduct several experiments on MNIST and CIFAR10 supporting the following conclusions:

- Projected gradient descent might be the “strongest” adversary using first-order information. Here, gradient descent is used to maximize the loss of the classifier directly, while always projecting back onto the set of “allowed” perturbations (e.g. an $\epsilon$-ball around each sample). This observation is based on a large number of random restarts of projected gradient descent. Regarding the number of restarts, the authors also note that an adversary should be bounded in its computational resources, similar to polynomially bounded adversaries in cryptography.
- Network capacity plays an important role in training robust neural networks using the min-max formulation (i.e. using adversarial training). In particular, the authors suggest that increased capacity is needed to fit/learn adversarial examples without overfitting. Additionally, increased capacity (in combination with a strong adversary) decreases the transferability of adversarial examples.

Also view this summary at [davidstutz.de](https://davidstutz.de/category/reading/).
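To make the projected gradient descent adversary more concrete, here is a minimal NumPy sketch of an $L_\infty$ attack with random restarts. It is not the authors' implementation: the logistic-regression "classifier", the function names `loss_and_grad` and `pgd_linf`, the default step size, iteration count and number of restarts, and the assumption that inputs lie in $[0, 1]$ are all illustrative choices made so that the example is self-contained.

```python
import numpy as np


def loss_and_grad(w, b, x, y):
    """Logistic loss of a linear classifier and its gradient w.r.t. the input x.

    Hypothetical stand-in for a neural network; y is a label in {0, 1}.
    """
    z = x @ w + b
    p = 1.0 / (1.0 + np.exp(-z))
    tiny = 1e-12  # avoids log(0)
    loss = -(y * np.log(p + tiny) + (1 - y) * np.log(1 - p + tiny))
    grad_x = (p - y) * w  # d(loss)/dx for the logistic loss
    return loss, grad_x


def pgd_linf(w, b, x, y, epsilon=0.3, step=0.05, iters=40, restarts=5, seed=0):
    """Maximize the loss within an L-infinity epsilon-ball around x.

    Each restart begins at a random point in the ball; the perturbed
    input with the highest loss over all restarts is returned.
    """
    rng = np.random.default_rng(seed)
    best_x, best_loss = x.copy(), loss_and_grad(w, b, x, y)[0]
    for _ in range(restarts):
        # Random start inside the epsilon-ball (assumed valid range [0, 1]).
        x_adv = np.clip(x + rng.uniform(-epsilon, epsilon, size=x.shape), 0.0, 1.0)
        for _ in range(iters):
            _, grad = loss_and_grad(w, b, x_adv, y)
            # Ascent step on the loss using the sign of the gradient.
            x_adv = x_adv + step * np.sign(grad)
            # Projection onto the epsilon-ball, then onto the valid input range.
            x_adv = np.clip(x_adv, x - epsilon, x + epsilon)
            x_adv = np.clip(x_adv, 0.0, 1.0)
        loss = loss_and_grad(w, b, x_adv, y)[0]
        if loss > best_loss:
            best_x, best_loss = x_adv, loss
    return best_x, best_loss
```

In adversarial training, a routine of this kind would play the role of the inner maximization: each training input is replaced by its perturbed version before the usual gradient step on the model parameters. The `restarts` and `iters` arguments are one way to express the bounded computational budget of the adversary mentioned above.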
