All you need is a good init on ShortScience.org

All you need is a good init
Mishkin, Dmytro and Matas, Jiri
arXiv e-Print archive - 2015 via Local Bibsonomy
Keywords: dblp

Summary by Dmytro Mishkin 7 years ago

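Inputs with $mean(input) = 0$ and $var(input) = 1$ are good for learning. Independent input features are good for learning too. Each layer's input is the previous layer's output, so we want the same properties at every layer.
So: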

1) Pre-initialize the network weights with (approximately) orthonormal matrices

2) Do a forward pass with a mini-batch

3) Divide each layer's weights by $\sqrt{var(output)}$, where $output$ is that layer's output on the mini-batch

4) PROFIT!
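
The steps above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' reference code: the helper names `orthonormal` and `unit_variance_init`, the bias-free fully connected layers, the ReLU nonlinearity, and the random mini-batch are all assumptions made for the example. It does a single rescale per layer, as in the summary, going layer by layer so that each layer sees the already-rescaled output of the previous one.

```python
import numpy as np


def orthonormal(rng, fan_in, fan_out):
    """Return a (fan_in, fan_out) matrix with orthonormal columns (or rows)."""
    a = rng.standard_normal((max(fan_in, fan_out), min(fan_in, fan_out)))
    q, _ = np.linalg.qr(a)  # reduced QR: q has orthonormal columns
    return q if fan_in >= fan_out else q.T


def unit_variance_init(layer_sizes, batch, seed=0):
    """Hypothetical helper: orthonormal init, then rescale to unit output variance."""
    rng = np.random.default_rng(seed)
    weights = []
    h = batch
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        w = orthonormal(rng, fan_in, fan_out)  # step 1: orthonormal weights
        out = h @ w                            # step 2: forward pass on the mini-batch
        w = w / np.sqrt(out.var())             # step 3: divide by sqrt(var(output))
        weights.append(w)
        h = np.maximum(h @ w, 0.0)             # ReLU (assumed), input to the next layer
    return weights


# Example usage on a made-up mini-batch of 256 samples with 32 features.
rng = np.random.default_rng(1)
batch = rng.standard_normal((256, 32))
weights = unit_variance_init([32, 64, 64, 10], batch)
```

Because a bias-free linear layer's output scales linearly with its weights, one division is enough here to bring the output variance on this mini-batch to 1. With other layer types the variance may need to be re-measured after rescaling.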
