Efficient softmax approximation for GPUs
Paper summary: The paper modifies the two-level hierarchical softmax to make it more efficient. A model of computational cost is used to find the optimal number of words in each class. In addition, the most common words are placed on the top level, alongside the nodes that represent the other classes. Figure: https://i.imgur.com/dbKS3gh.png
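To make the cost argument concrete, here is a minimal sketch of how a cost estimate could be used to pick the size of the top level. It is not the paper's exact formulation: it assumes a simplified cost that is linear in the number of outputs scored, a single tail class, and a Zipf-like word distribution. The function names (`zipf_probs`, `two_level_cost`, `best_head_size`) are hypothetical and chosen only for illustration.

```python
def zipf_probs(vocab_size):
    """Word probabilities under an assumed Zipf law (rank r has weight 1/r)."""
    weights = [1.0 / rank for rank in range(1, vocab_size + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def two_level_cost(probs, head_size):
    """Expected number of outputs scored per token (assumed linear cost).

    The top level scores `head_size` frequent words plus one node for the
    tail class. The tail class is only evaluated when the target word falls
    in it, which happens with probability `tail_mass`.
    """
    tail_size = len(probs) - head_size
    tail_mass = sum(probs[head_size:])
    return (head_size + 1) + tail_mass * tail_size


def best_head_size(probs):
    """Scan every split point and return (head_size, cost) with lowest cost.

    `probs` must be sorted from most to least frequent word.
    """
    vocab_size = len(probs)
    # Baseline: a plain softmax scores every word for every token.
    best_k, best_cost = vocab_size, float(vocab_size)
    tail_mass = 1.0  # with an empty head, all probability mass is in the tail
    for head_size in range(vocab_size):
        cost = (head_size + 1) + tail_mass * (vocab_size - head_size)
        if cost < best_cost:
            best_k, best_cost = head_size, cost
        tail_mass -= probs[head_size]  # move the next word into the head
    return best_k, best_cost


if __name__ == "__main__":
    probs = zipf_probs(10000)
    k, cost = best_head_size(probs)
    print(f"head size: {k}, estimated cost: {cost:.1f} vs. full softmax: {len(probs)}")
```

Under these assumptions the frequent words are handled in a single step at the top level, while the cost of the large tail class is weighted by how rarely it is needed. A real implementation would measure the cost of matrix products on the target GPU rather than assume it is linear.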

Summary by Marek Rei 2 months ago