Categorical Reparameterization with Gumbel-Softmax
==================================================

Paper summary
-------------

In [stochastic computation graphs][scg], like [variational autoencoders][vae], using discrete variables is hard because we can't just differentiate through Monte Carlo estimates. This paper introduces a distribution that is a smoothed version of the [categorical distribution][cat], with a temperature parameter that recovers the categorical distribution as it goes to zero. This distribution is continuous and can be reparameterised.

The Gumbel-max trick samples a categorical $z$ like this ($g_i$ is standard Gumbel noise and $\pi_i / \sum_j \pi_j$ are the categorical probabilities):

$$ z = \text{one\_hot} \left( \underset{i}{\text{arg max}} [ g_i + \log \pi_i ] \right) $$

This paper replaces the one-hot arg max with a [softmax][], introducing a temperature $\tau$ to control the "discreteness":

$$ z_i = \frac{\exp \left( (\log \pi_i + g_i) / \tau \right)}{\sum_j \exp \left( (\log \pi_j + g_j) / \tau \right)} $$

I made a [notebook that illustrates this][nb] while looking at another paper that came out at the same time, which I should probably compare against here.

Comparison with the [Concrete distribution][concrete]
-----------------------------------------------------

The Concrete and Gumbel-softmax distributions are exactly the same, up to a change of notation ($\tau \to \lambda$, $\pi_i \to \alpha_k$, $g_i \to G_k$). Both papers have structured output prediction experiments (predicting one half of an MNIST digit from the other half). This paper shows Gumbel-softmax always being better, but doesn't compare against VIMCO, which is sometimes better at test time in the Concrete distribution paper.

Sidenote: blog post
-------------------

The authors posted a [nice blog post][blog] that is also a good short summary and explanation.

[blog]:
[scg]:
[vae]:
[cat]:
[softmax]:
[concrete]:
[nb]:
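The two equations above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function names are my own, and a real training setup would compute this inside an autodiff framework so gradients flow through the relaxed sample.

```python
import numpy as np

def sample_gumbel(shape, rng):
    """Standard Gumbel noise via -log(-log(U)), U ~ Uniform(0, 1)."""
    u = rng.uniform(low=1e-10, high=1.0, size=shape)  # avoid log(0)
    return -np.log(-np.log(u))

def gumbel_max_sample(log_pi, rng):
    """Exact one-hot categorical sample via the Gumbel-max trick."""
    g = sample_gumbel(log_pi.shape, rng)
    z = np.zeros_like(log_pi)
    z[np.argmax(log_pi + g)] = 1.0
    return z

def gumbel_softmax_sample(log_pi, tau, rng):
    """Relaxed, differentiable sample; approaches one-hot as tau -> 0."""
    g = sample_gumbel(log_pi.shape, rng)
    y = (log_pi + g) / tau
    y = y - y.max()  # subtract max for numerical stability
    e = np.exp(y)
    return e / e.sum()

rng = np.random.default_rng(0)
pi = np.array([0.1, 0.3, 0.6])
hard = gumbel_max_sample(np.log(pi), rng)   # one-hot vector
soft = gumbel_softmax_sample(np.log(pi), tau=0.5, rng=rng)  # point on the simplex
```

With a large `tau` the soft sample looks nearly uniform; as `tau` shrinks it concentrates on a single vertex of the simplex, which is the annealing knob the paper tunes during training.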
Paper summary In [stochastic computation graphs][scg], like [variational autoencoders][vae], using discrete variables is hard because we can't just differentiate through Monte Carlo estimates. This paper introduces a distribution that is a smoothed version of the [categorical distribution][cat] and has a parameter that, as it goes to zero, will make it equal the categorical distribution. This distribution is continuous and can be reparameterised. In other words, the Gumbel trick way to sample a categorical $z$ looks like this ($g_i$ is gumbel distributed and $\boldsymbol{\pi}/\sum_j \pi_j$ are the categorical probabilties): $$ z = \text{one_hot} \left( \underset{i}{\text{arg max}} [ g_i + \log \pi_i ] \right) $$ This paper replaces the one hot and argmax with a [softmax][], and they introduce $\tau$ to control the "discreteness": $$ z = \text{softmax} \left( \frac{ g_i + \log \pi_i}{\tau} \right) $$ I made a [notebook that illustrates this][nb] while looking at another paper that came out at the same time, which I should probably compare against here. Comparison with [Concrete Distribution][concrete] --------------------------------------------------------------- The concrete and Gumbel-softmax distributions are exactly the same (notation switch: $\tau \to \lambda$, $\pi_i \to \alpha_k$, $G_k \to g_i$). Both papers have structured output prediction experiments (predict one half of MNIST digits from the other half). This paper shows Gumbel-softmax always being better, but doesn't compare to VIMCO, which is sometimes better at test time in the concrete distribution paper. Sidenote - blog post ---------------------------- The authors posted a [nice blog post][blog] that is also a good short summary and explanation. [blog]: [scg]: [vae]: [cat]: [softmax]: [concrete]: [nb]:
Categorical Reparameterization with Gumbel-Softmax
Jang, Eric and Gu, Shixiang and Poole, Ben
arXiv e-Print archive - 2016 via Bibsonomy
Keywords: dblp
