Summarization based on text extraction is inherently limited, but generation-style abstractive methods have proven challenging to build.
Our ensemble model using different attention architectures has established a new state-of-the-art result on the WMT'15 English-to-German translation task with 25.9 BLEU points, an improvement of 1.0 BLEU points over the existing best system, which combines NMT with an n-gram reranker.
Moreover, we extend the n-gram convolution to non-consecutive words to recognize patterns with intervening words.
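The idea of convolving over non-consecutive words can be illustrated with a minimal sketch: sum a filter's response over all (possibly non-contiguous) word pairs, down-weighting each pair by how many words it skips. The function name, the `tanh` nonlinearity, and the geometric decay are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np
from itertools import combinations

def nonconsecutive_bigram_features(embeddings, W, decay=0.5):
    """Sum filter responses over all (possibly non-contiguous) bigrams,
    weighting each pair by decay**gap, where gap is the number of
    intervening words between the two positions (hypothetical sketch)."""
    n, d = embeddings.shape
    total = np.zeros(W.shape[1])
    for i, j in combinations(range(n), 2):
        pair = np.concatenate([embeddings[i], embeddings[j]])  # shape (2d,)
        total += (decay ** (j - i - 1)) * np.tanh(pair @ W)
    return total

# Toy example: a 4-word sentence with 3-dim embeddings and 5 filters.
rng = np.random.default_rng(0)
E = rng.normal(size=(4, 3))
W = rng.normal(size=(6, 5))
feats = nonconsecutive_bigram_features(E, W)
```

With `decay < 1`, adjacent bigrams dominate the feature sum while skip-bigrams still contribute, which is what lets the filter match patterns with intervening words.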
We present extensions to a continuous-state dependency parsing method that make it applicable to morphologically rich languages.
Most existing word embedding methods can be categorized into Neural Embedding Models and Matrix Factorization (MF)-based methods.
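A minimal sketch of the MF-based family: build a word-context co-occurrence matrix from a toy corpus, reweight it with positive PMI, and factorize it with truncated SVD to get low-dimensional word vectors. The toy corpus, window size, and scaling choices here are illustrative assumptions, not any specific method from the text.

```python
import numpy as np

# Toy corpus and window-1 co-occurrence counts (illustrative only).
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

C = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in (i - 1, i + 1):       # symmetric window of size 1
            if 0 <= j < len(sent):
                C[idx[w], idx[sent[j]]] += 1

# Positive PMI reweighting, then truncated SVD:
# rows of U[:, :k] * sqrt(S[:k]) serve as k-dim word embeddings.
total = C.sum()
pw = C.sum(axis=1, keepdims=True) / total
pc = C.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((C / total) / (pw * pc))
ppmi = np.where(np.isfinite(pmi), np.maximum(pmi, 0.0), 0.0)

U, S, Vt = np.linalg.svd(ppmi)
k = 2
vectors = U[:, :k] * np.sqrt(S[:k])
```

Neural embedding models instead learn the vectors directly by gradient descent on a prediction objective; the MF view makes the count statistics being compressed explicit.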
We introduce a model for constructing vector representations of words by composing characters using bidirectional LSTMs.
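The composition step can be sketched as follows: run one LSTM over a word's characters left-to-right, another right-to-left, and concatenate the two final hidden states into the word vector. This is a from-scratch NumPy sketch with randomly initialized toy parameters; the cell wiring and dimensions are assumptions for illustration, not the paper's trained model.

```python
import numpy as np

def lstm_step(x, h, c, W, b):
    """One LSTM step; W maps the concatenation [x; h] to the four
    stacked gate pre-activations (input, forget, output, candidate)."""
    z = W @ np.concatenate([x, h]) + b
    H = h.size
    i, f, o = (1.0 / (1.0 + np.exp(-z[k * H:(k + 1) * H])) for k in range(3))
    g = np.tanh(z[3 * H:])
    c = f * c + i * g
    return np.tanh(c) * o, c

def char_word_vector(word, char_emb, Wf, bf, Wb, bb, H):
    """Compose a word vector from its characters: a forward LSTM and a
    backward LSTM over the character embeddings, final states concatenated."""
    xs = [char_emb[ch] for ch in word]
    hf = cf = hb = cb = np.zeros(H)
    for x in xs:                 # left-to-right pass
        hf, cf = lstm_step(x, hf, cf, Wf, bf)
    for x in reversed(xs):       # right-to-left pass
        hb, cb = lstm_step(x, hb, cb, Wb, bb)
    return np.concatenate([hf, hb])

# Toy parameters: 4-dim character embeddings, hidden size 5.
rng = np.random.default_rng(1)
D, H = 4, 5
char_emb = {ch: rng.normal(size=D) for ch in "abcdefghijklmnopqrstuvwxyz"}
Wf = rng.normal(size=(4 * H, D + H)); bf = np.zeros(4 * H)
Wb = rng.normal(size=(4 * H, D + H)); bb = np.zeros(4 * H)
vec = char_word_vector("cats", char_emb, Wf, bf, Wb, bb, H)
```

Because the vector is built from characters rather than looked up, morphologically related words such as "cat" and "cats" share most of their input, and out-of-vocabulary words still receive representations.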