Numerically Grounded Language Models for Semantic Error Correction
Spithourakis, Georgios P. and Augenstein, Isabelle and Riedel, Sebastian, 2016
Paper summary by marek
The authors build an LSTM neural language model that (1) handles numerical values better and (2) is conditioned on a knowledge base.
First, the numerical value of each token is given to the network as an additional signal at each time step. The token “25” would normally be represented only by a word embedding; it now also gets an extra feature holding the numerical value float(25). Second, they condition the language model on text from a knowledge base (KB). All the information in the KB is converted to a string, passed through an LSTM, and then used to condition the main LM.
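The two ideas above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the helper names (`numeric_value`, `ground_inputs`, `linearize_kb`), the 0.0 default for non-numeric tokens, the zero vector for unknown words, the `attribute : value ;` string format, and the toy attribute names are all assumptions made for illustration. The LSTMs themselves are left out; the sketch only shows how the inputs might be prepared.

```python
import math


def numeric_value(token: str) -> float:
    """Return float(token) for numeric tokens; 0.0 otherwise (assumed default)."""
    try:
        value = float(token)
    except ValueError:
        return 0.0
    # float() also accepts "nan" and "inf"; treat those as ordinary words.
    return value if math.isfinite(value) else 0.0


def ground_inputs(tokens, embeddings, dim):
    """Concatenate each token's word embedding with its numeric feature."""
    unknown = [0.0] * dim  # assumed placeholder for out-of-vocabulary tokens
    return [list(embeddings.get(tok, unknown)) + [numeric_value(tok)]
            for tok in tokens]


def linearize_kb(kb):
    """Flatten attribute/value pairs into one token sequence (format is assumed)."""
    tokens = []
    for attribute, value in kb.items():
        tokens.extend([attribute, ":", str(value), ";"])
    return tokens


# Toy data; the attribute names and vectors are made up for illustration.
toy_embeddings = {"ef": [0.1, 0.2], "is": [0.0, 0.3], "25": [0.5, 0.1]}
lm_inputs = ground_inputs(["ef", "is", "25", "%"], toy_embeddings, dim=2)
kb_tokens = linearize_kb({"ef": 25, "age": 63})
```

In this sketch, each row of `lm_inputs` stands for what the main LM would receive at one time step, and `kb_tokens` is the string form of the KB that a separate LSTM would encode before conditioning the main LM.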
They evaluate on a dataset of 16,003 clinical records, each paired with small KB tuples drawn from 20 possible attributes. Numerical grounding alone helps noticeably, and the best results are obtained when KB conditioning is added as well.