Learning Factored Representations in a Deep Mixture of Experts
Paper summary This paper extends the mixture-of-experts (MoE) model by stacking several MoE blocks to form a deep MoE. In this model, each block's mixture weights are produced by its own gating network, and the experts and gating networks differ from block to block. The whole deep MoE is trained jointly with stochastic gradient descent. The motivation is to reduce decoding time by exploiting the structure imposed by the MoE model: at test time only a few experts per block need to be evaluated. The model was evaluated on MNIST and a speech monophone classification task.
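As a rough illustration of the architecture described above, here is a minimal sketch of a two-block deep MoE in PyTorch, with linear experts, a per-block gating network, and joint SGD training. This is not the authors' code; the layer sizes, number of experts, ReLU nonlinearities, and the final linear classifier are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """One MoE block: a gating network produces per-example mixture weights
    over a set of experts, and the block output is the weighted sum of the
    expert outputs (linear experts are an assumption for brevity)."""
    def __init__(self, in_dim, out_dim, num_experts):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Linear(in_dim, out_dim) for _ in range(num_experts)]
        )
        self.gate = nn.Linear(in_dim, num_experts)

    def forward(self, x):
        # gates: (batch, num_experts), one mixture weight per expert
        gates = F.softmax(self.gate(x), dim=-1)
        # expert_outs: (batch, num_experts, out_dim)
        expert_outs = torch.stack([e(x) for e in self.experts], dim=1)
        # mixture: weighted sum over the expert dimension
        return (gates.unsqueeze(-1) * expert_outs).sum(dim=1)

class DeepMoE(nn.Module):
    """Two stacked MoE blocks with different experts and gating networks,
    followed by a classifier; the whole stack is trained jointly."""
    def __init__(self, in_dim=784, hidden_dim=128, num_classes=10,
                 experts_per_block=4):
        super().__init__()
        self.block1 = MoELayer(in_dim, hidden_dim, experts_per_block)
        self.block2 = MoELayer(hidden_dim, hidden_dim, experts_per_block)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):
        h1 = torch.relu(self.block1(x))
        h2 = torch.relu(self.block2(h1))
        return self.classifier(h2)

# Joint training with plain SGD on a dummy MNIST-shaped batch.
model = DeepMoE()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(32, 784)          # dummy batch of flattened 28x28 images
y = torch.randint(0, 10, (32,))   # dummy class labels
loss = F.cross_entropy(model(x), y)
opt.zero_grad()
loss.backward()
opt.step()
```

During training the mixture over experts is dense, as above; the decoding-time saving the summary mentions would come from evaluating only the experts with the largest gating weights in each block at test time.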
Learning Factored Representations in a Deep Mixture of Experts
Eigen, David and Ranzato, Marc'Aurelio and Sutskever, Ilya
arXiv e-Print archive - 2013 via Bibsonomy
Keywords: dblp