UPSET and ANGRI : Breaking High Performance Image Classifiers
Sarkar, Sayantan and Bansal, Ankan and Mahbub, Upal and Chellappa, Rama, 2017
Paper summary by davidstutz
Sarkar et al. propose two “learned” adversarial example attacks, UPSET and ANGRI. UPSET learns to predict universal, targeted adversarial perturbations: a network takes only the target label as input and outputs a perturbation that, when added to an original image, causes it to be misclassified as the target class. ANGRI learns to predict image-specific (non-universal) targeted perturbations: a network takes both the target label and the original image as input. Both networks are trained with a misclassification loss while also minimizing the norm of the perturbation. Because this training backpropagates through the target classifier, the classifier needs to be differentiable, i.e., UPSET and ANGRI require white-box access.
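To make the training objective concrete, here is a minimal NumPy sketch of the UPSET idea under strong simplifying assumptions that are not taken from the paper: the target classifier is a toy linear softmax model (`W`, `b`), the "network" that maps a target label to a perturbation is just a lookup table `R` with one row per class, the misclassification loss is cross-entropy towards the target, the norm penalty is a squared L2 norm with a hypothetical weight `weight`, and clipping to the valid image range is omitted. All function and parameter names are illustrative. The gradient is written out by hand, which is only possible because the classifier is differentiable (the white-box requirement).

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def upset_loss_and_grad(r, x, target, W, b, weight):
    """Loss and gradient for one universal perturbation r of shape (d,).

    loss = mean cross-entropy of classifier(x + r) towards `target`
           + weight * ||r||_2^2   (assumed form of the norm penalty)
    """
    p = softmax((x + r) @ W + b)            # (n, k) class probabilities
    ce = -np.log(p[:, target] + 1e-12).mean()
    loss = ce + weight * np.sum(r ** 2)
    dlogits = p.copy()
    dlogits[:, target] -= 1.0               # d(ce_i)/d(logits_i) = p_i - onehot(target)
    grad = (dlogits @ W.T).mean(axis=0) + 2.0 * weight * r
    return loss, grad


def train_upset(x, W, b, num_classes, weight=0.01, lr=0.1, steps=500):
    """Learn one perturbation per target class with plain gradient descent."""
    R = np.zeros((num_classes, x.shape[1]))
    for t in range(num_classes):
        for _ in range(steps):
            _, g = upset_loss_and_grad(R[t], x, t, W, b, weight)
            R[t] -= lr * g
    return R


# Toy usage: random "images" in [-1, 1] and a random linear classifier.
rng = np.random.default_rng(0)
x_demo = rng.uniform(-1.0, 1.0, size=(64, 20))
W_demo = rng.normal(size=(20, 5)) / np.sqrt(20)
b_demo = np.zeros(5)
R_demo = train_upset(x_demo, W_demo, b_demo, num_classes=5)

target = 3
pred = softmax((x_demo + R_demo[target]) @ W_demo + b_demo).argmax(axis=1)
print("fraction classified as target:", (pred == target).mean())
```

In this sketch the same row `R[t]` is added to every image, which is what makes the perturbation universal. An ANGRI-style variant would instead compute the perturbation from both the image and the target label, so it could differ per image; that would need an actual generator network and automatic differentiation rather than the hand-derived gradient used here.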
Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).