Enriching Word Vectors with Subword Information on ShortScience.org

Enriching Word Vectors with Subword Information
Bojanowski, Piotr and Grave, Edouard and Joulin, Armand and Mikolov, Tomas
Transactions of the Association for Computational Linguistics - 2017 via Local Bibsonomy
Keywords: fasttext, pretrained, wikipedia, word2vec, vectors

Summaries/Notes 2

Summary by Marek Rei 6 years ago

They extend the skip-gram model for word embeddings to use character n-grams. Each word is represented as a bag of character n-grams, 3 to 6 characters long, plus the word itself. Each of these units has its own embedding, which is optimised with the skip-gram objective to predict the surrounding context words. They evaluate on word similarity and analogy tasks in several languages and show improvements on most benchmarks.
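
To make the "bag of character n-grams" concrete, here is a minimal sketch of how such a bag could be extracted for a single word. The function name `char_ngrams` and its parameters are hypothetical, chosen for illustration only. The `<` and `>` boundary markers are an assumption that the summary does not mention; they follow the convention described in the paper. This is not the fastText implementation.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the set of character n-grams for `word`, plus the word itself.

    Illustrative sketch only: `char_ngrams`, `min_n` and `max_n` are
    hypothetical names, not part of any library API.
    """
    # Assumption: wrap the word in boundary markers so that prefixes and
    # suffixes are distinguishable from n-grams in the middle of a word.
    marked = f"<{word}>"
    grams = set()
    for n in range(min_n, max_n + 1):
        # Slide a window of width n over the marked word.
        for i in range(len(marked) - n + 1):
            grams.add(marked[i:i + n])
    # The whole word is kept as its own unit alongside the n-grams.
    grams.add(marked)
    return grams


if __name__ == "__main__":
    # Trigrams only, to keep the printed bag small.
    print(sorted(char_ngrams("where", min_n=3, max_n=3)))
```

Each element of the returned set would be given its own embedding. Under this scheme a rare or unseen word still shares most of its n-grams with known words, which is what lets subword information help in morphologically rich languages.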
