DRAW: A Recurrent Neural Network For Image Generation
Gregor, Karol; Danihelka, Ivo; Graves, Alex; Rezende, Danilo Jimenez; Wierstra, Daan (2015)
The paper introduces a sequential variational auto-encoder that generates complex images iteratively. The authors also introduce a spatial attention mechanism that allows the model to focus on small regions of the image at a time. On MNIST, this approach produces images that cannot be distinguished from real data with the naked eye.
#### What is DRAW:
The Deep Recurrent Attentive Writer (DRAW) model differs from other variational auto-encoders in two ways. First, the encoder and the decoder are recurrent networks, so the image is built up over a sequence of steps rather than in a single pass. Second, it includes an attention mechanism that restricts the input region observed by the encoder and the output region modified by the decoder.
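One timestep of this recurrence can be sketched as follows. This is an illustrative NumPy sketch, not the paper's implementation: single tanh layers stand in for the LSTM cells, the dimensions and random weights are made up, and attention is omitted (the "read" simply concatenates the image with the error image):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; the paper uses much larger LSTMs)
img_dim, enc_dim, dec_dim, z_dim = 28 * 28, 64, 64, 10

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Stand-in parameters: single tanh layers instead of the paper's LSTM cells
W_enc = rng.normal(0, 0.01, (enc_dim, 2 * img_dim + dec_dim + enc_dim))
W_dec = rng.normal(0, 0.01, (dec_dim, z_dim + dec_dim))
W_mu = rng.normal(0, 0.01, (z_dim, enc_dim))
W_sigma = rng.normal(0, 0.01, (z_dim, enc_dim))
W_write = rng.normal(0, 0.01, (img_dim, dec_dim))

def draw_step(x, c_prev, h_enc_prev, h_dec_prev):
    x_hat = x - sigmoid(c_prev)                       # error image
    r = np.concatenate([x, x_hat])                    # "read" without attention
    h_enc = np.tanh(W_enc @ np.concatenate([r, h_dec_prev, h_enc_prev]))
    mu, sigma = W_mu @ h_enc, np.exp(W_sigma @ h_enc)
    z = mu + sigma * rng.normal(size=z_dim)           # reparameterised sample
    h_dec = np.tanh(W_dec @ np.concatenate([z, h_dec_prev]))
    c = c_prev + W_write @ h_dec                      # "write" onto the canvas
    return c, h_enc, h_dec

x = rng.random(img_dim)
c, h_enc, h_dec = np.zeros(img_dim), np.zeros(enc_dim), np.zeros(dec_dim)
for _ in range(10):                                   # T sequential glimpses
    c, h_enc, h_dec = draw_step(x, c, h_enc, h_dec)
reconstruction = sigmoid(c)                           # mean of Bernoulli D(X | c_T)
```

Note that the canvas `c` accumulates additive writes across timesteps; only the final canvas parameterises the output distribution.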
#### What do we gain?
The resulting images are greatly improved by allowing a conditional and sequential generation. In addition, the spatial attention mechanism can be used in other contexts to solve the “Where to look?” problem.
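The attention mechanism answering "Where to look?" is a grid of 1-D Gaussian filters applied along each image axis, whose centre, stride, and variance are emitted by the network. A minimal NumPy sketch (the function names are my own, and I use 0-based filter indices where the paper uses 1-based):

```python
import numpy as np

def filterbank(g, delta, sigma2, N, A):
    """N x A bank of 1-D Gaussian filters over an axis of length A.

    g: grid centre, delta: stride between filter centres, sigma2: variance.
    """
    i = np.arange(N)
    mu = g + (i - N / 2 + 0.5) * delta                # filter centres
    a = np.arange(A)
    F = np.exp(-((a[None, :] - mu[:, None]) ** 2) / (2 * sigma2))
    return F / np.maximum(F.sum(axis=1, keepdims=True), 1e-8)  # normalise rows

def read_attention(x, gx, gy, delta, sigma2, gamma, N):
    """Extract an N x N glimpse from image x (H x W) via two filterbanks."""
    H, W = x.shape
    Fy = filterbank(gy, delta, sigma2, N, H)
    Fx = filterbank(gx, delta, sigma2, N, W)
    return gamma * (Fy @ x @ Fx.T)
```

Because the glimpse is a smooth function of the centre, stride, and variance, gradients flow through the attention parameters, which is what makes the mechanism trainable end-to-end.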
#### What follows?
A possible extension to this model would be to use a convolutional architecture in the encoder or the decoder, although this might be less useful since the attention mechanism already restricts the input of the network to a local region.
* As observed in the samples generated by the model, the attention mechanism works effectively by reconstructing images in a local way.
* The attention model is fully differentiable.
* I think a better exposition of the attention mechanism would improve this paper.
This paper introduces a neural network architecture that generates realistic images sequentially. The authors also introduce a differentiable attention mechanism that allows the network to focus on local regions of the image during reconstruction. Main contributions:
- The network architecture is similar to other variational auto-encoders, except that:
    - The encoder and decoder are recurrent networks (LSTMs). The encoder's output is conditioned on the decoder's previous outputs, and the decoder's outputs are iteratively added to a canvas that defines the distribution from which images are generated.
    - The spatial attention mechanism restricts the input region observed by the encoder and the output region the decoder can write to.
- The spatial soft attention mechanism is effective and fully differentiable,
and can be used for other tasks.
- Images generated by DRAW look very realistic.
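At generation time the encoder is discarded entirely: latents are sampled from the prior and the decoder accumulates writes on the canvas. A minimal sketch, with untrained stand-in weights and a tanh layer in place of the decoder LSTM:

```python
import numpy as np

rng = np.random.default_rng(1)
img_dim, dec_dim, z_dim, T = 28 * 28, 64, 10, 10

# Hypothetical decoder parameters; in practice these come from training
W_dec = rng.normal(0, 0.01, (dec_dim, z_dim + dec_dim))
W_write = rng.normal(0, 0.01, (img_dim, dec_dim))

c = np.zeros(img_dim)
h_dec = np.zeros(dec_dim)
for _ in range(T):
    z = rng.normal(size=z_dim)                 # sample from the prior, no encoder
    h_dec = np.tanh(W_dec @ np.concatenate([z, h_dec]))
    c = c + W_write @ h_dec                    # accumulate writes on the canvas
sample = 1.0 / (1.0 + np.exp(-c))              # Bernoulli means of D(X | c_T)
```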
## Weaknesses / Notes