This paper demonstrates that Word2Vec \cite{1301.3781} can extract relationships between words and produce latent representations that are useful for medical data. The authors explore the model on different datasets, each of which yields different relationships between words.

https://i.imgur.com/hSA61Zw.png

The Word2Vec model works like an autoencoder that predicts the context of a word, where the context is the set of surrounding words as shown below. Given the word in the center, the neighboring words are predicted through a bottleneck layer. Because a word appears in many different contexts across a corpus, the model can never reach zero error; instead it minimizes the reconstruction error, and this is how it learns the latent representation.

https://i.imgur.com/EMtjTHn.png

Subjectively, we can observe relationships between the learned word vectors:

https://i.imgur.com/8C9EVq1.png
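The bottleneck prediction described above can be sketched as a minimal skip-gram network in NumPy. This is a toy illustration, not the paper's implementation: the corpus, window size, and embedding dimension are invented for the example, and real use would rely on a library such as gensim or the original C tool.

```python
import numpy as np

# Toy corpus; a +/-1 word window defines each word's context.
corpus = "the patient took the medicine and the patient recovered".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 2  # vocabulary size, bottleneck (embedding) dimension

# (center, context) training pairs from the window.
pairs = []
for i, w in enumerate(corpus):
    for j in (i - 1, i + 1):
        if 0 <= j < len(corpus):
            pairs.append((idx[w], idx[corpus[j]]))

rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, D))   # input -> bottleneck (the embeddings)
W_out = rng.normal(scale=0.1, size=(D, V))  # bottleneck -> output

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

lr = 0.1
losses = []
for epoch in range(200):
    loss = 0.0
    for c, t in pairs:
        h = W_in[c]                  # bottleneck activation for the center word
        p = softmax(h @ W_out)       # predicted distribution over context words
        loss -= np.log(p[t])         # cross-entropy against the observed context word
        dz = p.copy()
        dz[t] -= 1.0                 # gradient of the loss w.r.t. the logits
        grad_in = W_out @ dz         # compute before W_out is updated
        W_out -= lr * np.outer(h, dz)
        W_in[c] -= lr * grad_in
    losses.append(loss)

# Each row of W_in is now the learned latent vector for one vocabulary word.
# Note the loss never reaches 0: words like "the" have several distinct contexts.
vectors = {w: W_in[idx[w]] for w in vocab}
```

Training reduces but cannot eliminate the loss, which matches the observation that a word with multiple contexts gives the model an irreducible reconstruction error.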