Learning Factored Representations in a Deep Mixture of ExpertsLearning Factored Representations in a Deep Mixture of ExpertsEigen, David and Ranzato, Marc'Aurelio and Sutskever, Ilya2013

Paper summaryopenreviewThis paper extends the mixture-of-experts (MoE) model by stacking several blocks of the MoEs to form a deep MoE. In this model, each mixture weight is implemented with a gating network. The mixtures at each block is different. The whole deep MoE is trained jointly using the stochastic gradient descent algorithm. The motivation of the work is to reduce the decoding time by exploiting the structure imposed in the MoE model. The model was evaluated on the MNIST and speech monophone classification tasks.

This paper extends the mixture-of-experts (MoE) model by stacking several blocks of the MoEs to form a deep MoE. In this model, each mixture weight is implemented with a gating network. The mixtures at each block is different. The whole deep MoE is trained jointly using the stochastic gradient descent algorithm. The motivation of the work is to reduce the decoding time by exploiting the structure imposed in the MoE model. The model was evaluated on the MNIST and speech monophone classification tasks.

Your comment:

You must log in before you can post this comment!

You must log in before you can submit this summary! Your draft will not be saved!

Preview:

0

Short Science allows researchers to publish paper summaries that are voted on and ranked! About