Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Paper summary: *Batch Normalization* is applied immediately after fully connected (or convolutional) layers, before the nonlinearity, and adjusts the values of the feedforward output so that, over each mini-batch, they are centered at zero mean and have unit variance. It has been used by well-known convolutional neural networks such as GoogLeNet \cite{journals/corr/SzegedyLJSRAEVR14} and ResNet \cite{journals/corr/HeZRS15}.
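The normalization described above can be sketched in a few lines of plain Python. This is only a minimal illustration of the training-time transform, not the authors' implementation: the function name `batch_norm`, the list-of-lists input layout, and the value `eps=1e-5` are assumptions made for the example, and the running averages used at inference time are left out. The learned scale $\gamma$ and shift $\beta$ default to 1 and 0.

```python
import math

def batch_norm(batch, gamma=None, beta=None, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift.

    batch: list of examples, each a list of feature values (assumed layout).
    gamma, beta: per-feature scale and shift; default to 1 and 0.
    eps: small constant for numerical stability (value is an assumption).
    """
    n = len(batch)
    d = len(batch[0])
    gamma = gamma if gamma is not None else [1.0] * d
    beta = beta if beta is not None else [0.0] * d

    # Per-feature mini-batch mean and (biased) variance.
    mean = [sum(x[j] for x in batch) / n for j in range(d)]
    var = [sum((x[j] - mean[j]) ** 2 for x in batch) / n for j in range(d)]

    # x_hat = (x - mean) / sqrt(var + eps);  y = gamma * x_hat + beta
    return [
        [gamma[j] * (x[j] - mean[j]) / math.sqrt(var[j] + eps) + beta[j]
         for j in range(d)]
        for x in batch
    ]

# Two features on very different scales, three examples in the mini-batch.
batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
out = batch_norm(batch)
```

With the default $\gamma = 1$ and $\beta = 0$, each column of `out` has mean zero and variance very close to one (slightly below one because of `eps`). Passing, for example, `gamma=[2.0, 2.0]` and `beta=[5.0, 5.0]` moves each column to mean 5 and variance close to 4, which is how the learned parameters let the network recover a different scale and offset when that helps.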
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Ioffe, Sergey and Szegedy, Christian
International Conference on Machine Learning - 2015 via Bibsonomy


Do you have a source for how the normalization works for CNNs? Do you know of any follow-up work which did what you mentioned in "Future work"? (And there is a typo: "archwitecture")

To see the effect of batch normalization on CNNs, you may refer to this benchmark: [https://github.com/ducha-aiki/caffenet-benchmark/blob/master/batchnorm.md]. Thanks for pointing out the typo :)

Could you please explain why adding the parameters $\beta$ and $\gamma$ does not change the variance?

What do you mean by "shuffle training examples more thoroughly"?
