What's Hidden in a Randomly Weighted Neural Network? on ShortScience.org

arxiv.org

What's Hidden in a Randomly Weighted Neural Network?
Ramanujan, Vivek and Wortsman, Mitchell and Kembhavi, Aniruddha and Farhadi, Ali and Rastegari, Mohammad
- 2019 via Local Bibsonomy
Keywords: deep-learning, readings, generalization, theory

Summary by devin132 4 years ago

The paper "Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask" by Zhou et al. (2019) found that, by learning only binary masks over randomly initialized weights, one can find random subnetworks that do much better than chance on a task. This new paper builds on that method by proposing a stronger algorithm than Zhou et al.'s for finding these high-performing subnetworks.

https://i.imgur.com/vxDqCKP.png

The intuition follows: "If a neural network with random weights (center) is sufficiently overparameterized, it will contain a subnetwork (right) that performs as well as a trained neural network (left) with the same number of parameters."

While Zhou et al. learned a probability for each weight, this paper learns a score for each weight and keeps only the top k percent of weights by score on the forward pass. The scores are learned with the paper's primary contribution, the edge-popup algorithm:

https://i.imgur.com/9KcIbxd.png

"In the edge-popup Algorithm, we associate a score with each edge. On the forward pass we choose the top edges by score. On the backward pass we update the scores of all the edges with the straight-through estimator, allowing helpful edges that are “dead” to re-enter the subnetwork. *We never update the value of any weight in the network, only the score associated with each weight.*"

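The quoted description can be sketched for a single linear layer in plain Python. This is a minimal illustration, not the authors' implementation: the function names (`top_k_mask`, `forward`, `edge_popup_step`), the squared-error loss, plain SGD without momentum, the uniform score initialization, and the toy random data are all assumptions made here. The score update uses the straight-through idea from the quote: the top-k selection is ignored on the backward pass, so every score receives a gradient, including scores of edges currently outside the subnetwork.

```python
import random

def top_k_mask(scores, k):
    """Return a 0/1 mask that keeps the top fraction k of edges by score."""
    n_out, n_in = len(scores), len(scores[0])
    ranked = sorted(((i, j) for i in range(n_out) for j in range(n_in)),
                    key=lambda ij: scores[ij[0]][ij[1]], reverse=True)
    n_keep = max(1, int(k * n_out * n_in))
    mask = [[0.0] * n_in for _ in range(n_out)]
    for i, j in ranked[:n_keep]:
        mask[i][j] = 1.0
    return mask

def forward(weights, mask, x):
    """Linear layer that uses only the edges selected by the mask."""
    return [sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
            for w_row, m_row in zip(weights, mask)]

def edge_popup_step(weights, scores, x, target, k, lr):
    """One update of the scores; the weights are read but never modified."""
    mask = top_k_mask(scores, k)
    y = forward(weights, mask, x)
    # Gradient of the (assumed) squared-error loss w.r.t. each output.
    grad_y = [2.0 * (yi - ti) for yi, ti in zip(y, target)]
    # Straight-through estimator: treat the mask as the identity in the
    # backward pass, so dL/ds_ij = dL/dy_i * w_ij * x_j for ALL edges.
    for i, row in enumerate(scores):
        for j in range(len(row)):
            row[j] -= lr * grad_y[i] * weights[i][j] * x[j]
    return sum((yi - ti) ** 2 for yi, ti in zip(y, target))

# Toy setup (hypothetical sizes and data, chosen only for illustration).
random.seed(0)
n_in, n_out, k = 8, 2, 0.5
weights = [[random.gauss(0, 1) for _ in range(n_in)] for _ in range(n_out)]
frozen = [row[:] for row in weights]  # copy, to check weights stay fixed
scores = [[random.random() for _ in range(n_in)] for _ in range(n_out)]
data = [([random.gauss(0, 1) for _ in range(n_in)],
         [random.gauss(0, 1) for _ in range(n_out)]) for _ in range(16)]

for epoch in range(20):
    for x, t in data:
        edge_popup_step(weights, scores, x, t, k, lr=0.01)

assert weights == frozen  # only the scores were trained
```

Because only the ordering of the scores affects the forward pass, an edge enters or leaves the subnetwork when its score crosses that of another edge, which is how a "dead" edge can pop back in.
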
With edge-popup, the authors find random subnetworks that perform better than those found by Zhou et al.:

https://i.imgur.com/T3D7OsZ.png
