# Sequence to Sequence Learning with Neural Networks
## Paper summary

This paper presents a simple approach to mapping input sequences to output sequences. The authors use a multi-layer LSTM encoder-decoder architecture and show promising results on neural machine translation: their model beats a phrase-based statistical machine translation (SMT) baseline by more than 1.0 BLEU, and comes close to the state of the art when used to re-rank the SMT system's 1000-best predictions.

Main contributions:

- The first LSTM encodes the input sequence into a single fixed-size vector, which a second LSTM then decodes; a special end-of-sequence token marks where each sequence ends (see the encoder-decoder sketch below).
- 4-layer deep LSTMs.
- 160k source vocabulary and 80k target vocabulary, trained on 12M sentence pairs; each word of the output sequence is generated by a softmax over the fixed target vocabulary.
- Beam search is used at test time to predict translations; a beam size of 2 already provides most of the benefit (see the beam-search sketch below).

## Strengths

- Qualitative results (PCA projections of the sentence representations) show that the learned representations are fairly insensitive to active vs. passive voice: sentences that are similar in meaning cluster together.
- Reversing the source sequence gives a significant boost on long sentences and an overall performance gain, most likely because it introduces short-term dependencies between source and target words that are more easily captured by the gradients.

## Weaknesses / Notes

- The source-reversal idea needs better justification; otherwise it comes across as an "ugly hack".
- To re-score the baseline's n-best list, they average the confidences of the LSTM and the baseline model. They should also have reported re-ranking accuracies using the LSTM confidences alone (see the re-ranking sketch below).
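To make the encoder-decoder recipe concrete, here is a minimal PyTorch sketch of the architecture summarized above. PyTorch and every name in it (`Seq2Seq`, the toy dimensions) are illustrative choices, not the paper's implementation, which used 4 layers of 1000 LSTM cells with 1000-dimensional embeddings.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Toy encoder-decoder in the spirit of the paper; sizes are illustrative."""
    def __init__(self, src_vocab, tgt_vocab, emb=32, hidden=64, layers=4):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        # Deep (multi-layer) LSTMs, mirroring the paper's 4-layer setup.
        self.encoder = nn.LSTM(emb, hidden, num_layers=layers, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, num_layers=layers, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)  # logits over the fixed target vocab

    def forward(self, src, tgt):
        # Reverse the source: the trick the paper credits with introducing
        # short-term dependencies that make optimization easier.
        src = torch.flip(src, dims=[1])
        # The encoder's final state (h, c) plays the role of the fixed-size
        # representation from which the decoder generates the output.
        _, state = self.encoder(self.src_emb(src))
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
        return self.out(dec_out)  # softmax is applied inside the training loss

model = Seq2Seq(src_vocab=100, tgt_vocab=80)
src = torch.randint(0, 100, (2, 7))  # batch of 2 source sentences, length 7
tgt = torch.randint(0, 80, (2, 5))   # shifted target inputs, length 5
logits = model(src, tgt)             # shape (2, 5, 80)
```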
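At test time the paper searches for the most likely translation with a small left-to-right beam. The sketch below is a generic beam search over any next-token scorer; `step_logprobs` and its dict-returning interface are hypothetical stand-ins, not the paper's API.

```python
import math

def beam_search(step_logprobs, bos, eos, beam_size=2, max_len=20):
    """Keep the beam_size best partial hypotheses at each step.
    step_logprobs(prefix) -> {token: log_prob} is a hypothetical interface."""
    beams = [([bos], 0.0)]  # (token prefix, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for tok, lp in step_logprobs(prefix).items():
                candidates.append((prefix + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:beam_size]:
            # Hypotheses that emit EOS leave the beam.
            (finished if prefix[-1] == eos else beams).append((prefix, score))
        if not beams:
            break
    return max(finished + beams, key=lambda c: c[1])

# Tiny hypothetical scorer: prefers token 1, then forces EOS (token 0).
def toy_scorer(prefix):
    if len(prefix) >= 4:
        return {0: 0.0}  # log P(EOS) = 0, i.e. certain stop
    return {1: math.log(0.6), 2: math.log(0.4)}

tokens, logprob = beam_search(toy_scorer, bos=3, eos=0, beam_size=2)
print(tokens, logprob)  # [3, 1, 1, 1, 0], 3 * log(0.6)
```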
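Finally, the re-scoring step criticized in the notes is a one-liner once both models expose sentence-level scores. A minimal sketch, assuming hypothetical `smt_score` and `lstm_score` callables that return log-probabilities: `alpha=0.5` is the paper's plain average, and `alpha=1.0` would be the LSTM-only re-ranking the notes ask for.

```python
def rerank(nbest, smt_score, lstm_score, alpha=0.5):
    """Pick the hypothesis with the best interpolated log-probability.
    alpha=0.5 averages the two models (the paper's choice);
    alpha=1.0 re-ranks on the LSTM score alone."""
    return max(nbest, key=lambda hyp: alpha * lstm_score(hyp)
                                      + (1 - alpha) * smt_score(hyp))
```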
Links: papers.nips.cc | scholar.google.com
Authors: Sutskever, Ilya and Vinyals, Oriol and Le, Quoc V.
Venue: Neural Information Processing Systems Conference, 2014 (via Local Bibsonomy)
Keywords: dblp