Comparing Data Sources and Architectures for Deep Visual Representation Learning in Semantics
Paper summary

The authors compare different image recognition models and image data sources for multimodal word representation learning.

https://i.imgur.com/iHwCSks.png
Figure: the image recognition models used for vector generation.

Experiments are performed on SimLex-999 (similarity) and MEN (relatedness). The performance of the different models (AlexNet, GoogLeNet, VGGNet) is quite similar, with VGGNet performing slightly better at the cost of more computation. Using search engines as an image source gives good coverage; ImageNet performs quite well with VGGNet; the ESP Game dataset gives the lowest performance. Combining the visual and linguistic vectors is found to be beneficial for both English and Italian.
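A common way to combine the two modalities, and roughly the kind of fusion evaluated in work like this, is to L2-normalize the linguistic and visual vectors separately and concatenate them, then score word pairs by cosine similarity (benchmark scores on SimLex-999/MEN would then be Spearman correlations against human ratings). The sketch below is illustrative; the toy vectors and equal modality weighting are assumptions, not the paper's exact setup.

```python
import numpy as np

def l2_normalize(v):
    # Scale a vector to unit length so each modality contributes equally.
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def fuse(linguistic_vec, visual_vec):
    # Concatenate the separately normalized modality vectors.
    # Equal weighting is an assumption for illustration.
    return np.concatenate([l2_normalize(linguistic_vec),
                           l2_normalize(visual_vec)])

def cosine(a, b):
    # Cosine similarity, the standard scorer on SimLex-999 / MEN.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-d linguistic and visual vectors for two words (made up).
cat = fuse(np.array([0.2, 0.9, 0.1]), np.array([0.5, 0.5, 0.7]))
dog = fuse(np.array([0.3, 0.8, 0.2]), np.array([0.6, 0.4, 0.6]))
print(cosine(cat, dog))  # high similarity for these nearby toy vectors
```

In practice the visual vector for a word is usually the mean of CNN features over that word's images, and the per-pair cosines are correlated (Spearman) with the human similarity ratings.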
Kiela, Douwe and Vero, Anita Lilla and Clark, Stephen
Empirical Methods in Natural Language Processing (EMNLP) - 2016


Summary by Marek Rei 8 months ago