Neural Responding Machine for Short-Text Conversation
Paper summary

TLDR; The authors train three variants of a seq2seq model to generate responses to social media posts taken from Weibo. The first variant, NRM-glo, is the standard model without an attention mechanism; it uses the last encoder state as the decoder input. The second variant, NRM-loc, uses an attention mechanism. The third variant, NRM-hyb, combines both by concatenating the local and global state vectors. The authors have human judges evaluate the generated responses and compare them to retrieval-based and SMT-based systems. They find that the NRM models generate reasonable responses ~75% of the time.

#### Key Points

- STC: Short-text conversation. Generate only a single response to a post; there is no need to keep track of a whole conversation.
- Training data: 200k posts, 4M responses.
- The authors use GRUs with 1000 hidden units.
- Vocabulary: the 40k most frequent words, for both input and response.
- Responses are generated using beam search with beam size 10.
- The hybrid model is difficult to train jointly. The authors train the global and local models individually and then fine-tune the hybrid model.
- Tradeoff with retrieval-based methods: retrieved responses are written by humans and don't have grammatical errors, but retrieval cannot easily generalize to unseen inputs.
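The difference between the three variants comes down to how the context vector handed to the decoder is built. The following is a minimal sketch of that idea in plain Python, not the authors' implementation: the function names, the toy vectors, the dot-product attention score, and the use of a single shared set of encoder states for the hybrid are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def global_context(encoder_states):
    """NRM-glo style: the last encoder state summarizes the whole post."""
    return list(encoder_states[-1])


def attention_weights(encoder_states, decoder_state):
    """One weight per input position.

    Assumption: dot-product scoring, chosen only to keep the sketch short.
    """
    return softmax([dot(h, decoder_state) for h in encoder_states])


def local_context(encoder_states, decoder_state):
    """NRM-loc style: attention-weighted sum of all encoder states."""
    weights = attention_weights(encoder_states, decoder_state)
    dim = len(encoder_states[0])
    return [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]


def hybrid_context(encoder_states, decoder_state):
    """NRM-hyb style: concatenate the global and local vectors.

    Assumption: both parts read the same encoder states here.
    """
    return global_context(encoder_states) + local_context(
        encoder_states, decoder_state
    )


# Toy data (hypothetical): three encoder states of size 2, one decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
decoder_state = [1.0, 0.0]

print("global:", global_context(encoder_states))
print("local: ", local_context(encoder_states, decoder_state))
print("hybrid:", hybrid_context(encoder_states, decoder_state))
```

Note that the global vector is fixed for a given post, while the local vector changes with the decoder state at every output step, so the hybrid vector is twice the hidden size and partly fixed, partly step-dependent.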
Neural Responding Machine for Short-Text Conversation
Shang, Lifeng and Lu, Zhengdong and Li, Hang
Association for Computational Linguistics - 2015 via Local Bibsonomy
Keywords: dblp

Summary from Denny Britz