All you need is a good init — Mishkin, Dmytro and Matas, Jiri — 2015

Paper summary by duchaaiki

Mean(input) = 0 and Var(input) = 1 are good for learning, and so are independent input features.
So:
1) Pre-initialize the network weights with (approximately) orthonormal matrices.
2) Do a forward pass with a mini-batch.
3) Divide each layer's weights by $\sqrt{\mathrm{Var}(\text{output})}$.
4) PROFIT!
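The steps above can be sketched in a few lines of numpy. This is a minimal illustration for a single linear layer, not the paper's full layer-by-layer procedure; the function names (`orthonormal`, `lsuv_layer`) and the iterate-until-tolerance loop are my own framing of the idea.

```python
import numpy as np

def orthonormal(rows, cols, rng):
    # Step 1: approximately orthonormal init via QR decomposition
    # of a random Gaussian matrix (orthonormal rows here).
    a = rng.standard_normal((max(rows, cols), min(rows, cols)))
    q, _ = np.linalg.qr(a)          # q has orthonormal columns
    q = q.T if rows < cols else q
    return q[:rows, :cols]

def lsuv_layer(W, X, tol=0.01, max_iter=10):
    # Steps 2-3: forward a mini-batch X, then rescale W by
    # 1/sqrt(var(output)) until the output variance is ~1.
    for _ in range(max_iter):
        out = X @ W.T
        v = out.var()
        if abs(v - 1.0) < tol:
            break
        W = W / np.sqrt(v)
    return W

rng = np.random.default_rng(0)
X = rng.standard_normal((128, 64))       # mini-batch of unit-variance inputs
W = 3.0 * orthonormal(32, 64, rng)       # deliberately mis-scaled weights
W = lsuv_layer(W, X)
print(round((X @ W.T).var(), 3))         # output variance is driven to ~1
```

In a real network with nonlinearities, the output variance after each layer is not predictable in closed form, which is why the rescaling is done empirically on a mini-batch rather than analytically.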
