If you’ve ever read a paper on Generative Adversarial Networks (from now on: GANs), you’ve almost certainly heard the author refer to the scourge upon the land of GANs that is mode collapse. When a generator succumbs to mode collapse, that means that, instead of modeling the full distribution, of input data, it will choose one region where there is a high density of data, and put all of its generated probability weight there. Then, on the next round, the discriminator pushes strongly away from that region (since it now is majority-occupied by fake data), and the generator finds a new mode. In the view of the authors of the Unrolled GANs paper, one reason why this happens is that, in the typical GAN, at each round the generator implicitly assumes that it’s optimizing itself against the final and optimal discriminator. And, so, it makes its best move given that assumption, which is to put all its mass on a region the discriminator assigns high probability. Unfortunately for our short-sighted robot friend, this isn’t a one-round game, and this mass-concentrating strategy gives the discriminator a really good way to find fake data during the next round: just dramatically downweight how likely you think data is in the generator’s prior-round sweet spot, which it’s heavy concentration allows you to do without impacting your assessment of other data. Unrolled GANs operate on this key question: what if we could give the generator an ability to be less short-sighted, and make moves that aren’t just optimizing for the present, but are also defensive against the future, in ways that will hopefully tamp down on this running-around-in-circles dynamic illustrated above. If the generator was incentivized not only to make moves that fool the current discriminator, but also make moves that make the next-step discriminator less likely to tell it apart, the hope is that it will spread out its mass more, and be less likely to fall into the hole of a mode collapse. This intuition was realized in UnrolledGANs, through a mathematical approach that is admittedly a little complex for this discussion format. Essentially, in addition to the typical GAN loss (which is based on the current values of the generator and discriminator), this model also takes one “step forward” of the discriminator (calculates what the new parameters of the discriminator would be, if it took one update step), and backpropogates backward through that step. The loss under the next-step discriminator parameters is a function of both the current generator, and the next-step parameters, which come from the way the discriminator reacts to the current generator. When you take the gradient with respect to the generator of both of these things, you get something very like the ideal we described earlier: a generator that is trying to put its mass into areas the current discriminator sees as high-probability, but also change its parameters such that it gives the discriminator a less effective response strategy. https://i.imgur.com/0eEjm0g.png Empirically: UnrolledGANs do a quite good job at their stated aim of reducing mode collapse, and the unrolled training procedure is now a common building-block technique used in other papers.