GradNets: Dynamic Interpolation Between Neural Architectures
Paper summary
A common practice in deep learning is to design a network, "freeze" its architecture, and then train the parameters. The paper points out a dilemma with this approach: more complex networks may have greater representational power, but they can be harder to train. To address this, the authors propose training in a hybrid fashion, where a simpler component and a more complex component are combined via a weighted average, and the weight is annealed over the course of training so that the more complex component is gradually introduced while exploiting the fast early training of the simpler one. In this scheme, any two architectural components can be blended as optimization progresses; for example, an initially employed rectifier is gradually switched off in favor of another rectifier. The authors claim this strategy leads to fast convergence and present experimental results in support.
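To make the blending idea concrete, here is a minimal PyTorch sketch of a component that interpolates between a simple and a complex function with a weight that is updated over training. The module name `GradBlend`, the `update_g` helper, the `ramp_epochs` parameter, and the linear schedule are illustrative assumptions based on the summary, not details taken from the paper.

```python
import torch.nn as nn
import torch.nn.functional as F

class GradBlend(nn.Module):
    """Weighted average of a simple and a complex component.

    output = g * complex_fn(x) + (1 - g) * simple_fn(x),
    where g is ramped from 0 to 1 over the first `ramp_epochs` epochs
    (an assumed linear schedule for illustration).
    """
    def __init__(self, simple_fn, complex_fn, ramp_epochs=10):
        super().__init__()
        self.simple_fn = simple_fn
        self.complex_fn = complex_fn
        self.ramp_epochs = ramp_epochs
        self.g = 0.0  # interpolation weight, updated by the training loop

    def update_g(self, epoch):
        # Fully "simple" at epoch 0, fully "complex" after ramp_epochs.
        self.g = min(1.0, epoch / self.ramp_epochs)

    def forward(self, x):
        return self.g * self.complex_fn(x) + (1.0 - self.g) * self.simple_fn(x)

# Example: gradually morph an identity activation into a ReLU.
grelu = GradBlend(simple_fn=lambda x: x, complex_fn=F.relu, ramp_epochs=10)
for epoch in range(20):
    grelu.update_g(epoch)
    # ... run the usual training steps, using `grelu` as the activation ...
```

In this sketch the training loop drives the schedule by calling `update_g` once per epoch; early on the layer behaves like the simple component, and by the end it behaves like the complex one.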
Almeida, Diogo and Sauder, Nate
arXiv e-Print archive, 2015

