Understanding the difficulty of training deep feedforward neural networks
Paper summary

The weights of each layer $W$ are initialized based on the number of connections the layer has. Each $w \in W$ is drawn from a Gaussian distribution with mean $\mu = 0$ and variance $$\text{Var}(W) = \frac{2}{n_\text{in} + n_\text{out}}$$ where $n_\text{in}$ is the number of neurons feeding into the layer (the previous layer in the feedforward direction) and $n_\text{out}$ is the number of neurons the layer feeds into (the previous layer in the backpropagation direction). Reference: [Andy Jones's Blog](http://andyljones.tumblr.com/post/110998971763/an-explanation-of-xavier-initialization)
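A minimal NumPy sketch of this Gaussian (Xavier/Glorot) initialization; the function name and layer sizes below are illustrative, not from the paper:

```python
import numpy as np

def xavier_normal(n_in, n_out, rng=None):
    """Draw a weight matrix of shape (n_in, n_out) from a zero-mean Gaussian
    with variance 2 / (n_in + n_out), as described in the summary above."""
    rng = np.random.default_rng() if rng is None else rng
    std = np.sqrt(2.0 / (n_in + n_out))
    return rng.normal(loc=0.0, scale=std, size=(n_in, n_out))

# Example: weights for a hypothetical layer with 784 inputs and 256 outputs.
W = xavier_normal(784, 256)
print(W.mean(), W.var())  # roughly 0 and 2 / (784 + 256)
```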
Understanding the difficulty of training deep feedforward neural networks
Glorot, Xavier and Bengio, Yoshua
Journal of Machine Learning Research - 2010 via Bibsonomy
Keywords: dblp