The paper "Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask" by Zhou et al. (2019) found that, just by learning binary masks, one can find random subnetworks that do much better than chance on a task. This new paper builds on that result by proposing a stronger algorithm than Zhou et al.'s for finding these high-performing subnetworks.

https://i.imgur.com/vxDqCKP.png

The intuition is as follows: "If a neural network with random weights (center) is sufficiently overparameterized, it will contain a subnetwork (right) that performs as well as a trained neural network (left) with the same number of parameters." While Zhou et al. learned a probability for each weight, this paper learns a score for each weight and keeps the top k% of weights by score at evaluation. The scores are learned through their primary contribution, which they call the edge-popup algorithm:

https://i.imgur.com/9KcIbxd.png

"In the edge-popup Algorithm, we associate a score with each edge. On the forward pass we choose the top edges by score. On the backward pass we update the scores of all the edges with the straight-through estimator, allowing helpful edges that are “dead” to re-enter the subnetwork. *We never update the value of any weight in the network, only the score associated with each weight.*"

They're able to find higher-performing random subnetworks than Zhou et al.:

https://i.imgur.com/T3D7OsZ.png
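To make the edge-popup idea concrete, here is a minimal numpy sketch of one score-update step for a single linear layer. This is my own illustration, not the authors' code: the layer shapes, learning rate, loss, and the `topk_mask` helper are all assumptions, and the paper additionally uses signed-constant weight initializations and applies this per-layer across a deep network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single linear layer: frozen random weights W, learnable scores S.
n_in, n_out, k_frac = 8, 4, 0.5
W = rng.standard_normal((n_out, n_in))         # never updated
S = rng.standard_normal((n_out, n_in)) * 0.01  # popup scores, learned

def topk_mask(scores, k_frac):
    """Binary mask keeping the top k% of entries by score."""
    k = int(round(k_frac * scores.size))
    thresh = np.sort(scores, axis=None)[-k]
    return (scores >= thresh).astype(scores.dtype)

# Forward pass: only the top-scoring edges participate.
x = rng.standard_normal(n_in)
target = rng.standard_normal(n_out)
M = topk_mask(S, k_frac)
y = (W * M) @ x

# Backward pass with the straight-through estimator: gradients pass through
# the discrete top-k selection as if it were the identity, so
# dL/dS = dL/d(W*M) * W -- every edge gets a score update, even currently
# masked ("dead") ones, letting them pop back into the subnetwork.
grad_y = 2 * (y - target)          # dL/dy for a squared-error loss
grad_eff = np.outer(grad_y, x)     # dL/d(W*M)
S -= 0.1 * grad_eff * W            # update scores only; W stays frozen
```

The key design point mirrored here is that the discrete top-k selection has zero gradient almost everywhere, so the straight-through estimator is what makes the scores trainable at all.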