We therefore propose Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data.
SOTA for CCG Supertagging on CCGBank
Machine translation systems achieve near human-level performance for some language pairs, yet their effectiveness relies heavily on the availability of large amounts of parallel sentences, which limits their applicability to the majority of language pairs.
#2 best model for Machine Translation on WMT2016 English-Russian
An effective method to improve neural machine translation with monolingual data is to augment the parallel training corpus with back-translations of target language sentences.
SOTA for Machine Translation on WMT2014 English-French (using extra training data)
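The back-translation idea above can be sketched in a few lines: monolingual target-language sentences are translated back into the source language by a reverse model, and the resulting (synthetic source, real target) pairs are appended to the parallel corpus. This is a minimal toy sketch, not the paper's implementation; the word-level `REVERSE_MODEL` dictionary stands in for a real target-to-source translation system.

```python
# Toy stand-in for a trained target->source translation model (hypothetical).
REVERSE_MODEL = {"hallo": "hello", "welt": "world", "gut": "good"}

def back_translate(target_sentence):
    """Translate a target-language sentence back into the source language,
    word by word, using the toy reverse model (unknown words pass through)."""
    return " ".join(REVERSE_MODEL.get(w, w) for w in target_sentence.split())

def augment(parallel_corpus, monolingual_target):
    """Append (synthetic source, real target) pairs to the parallel corpus."""
    synthetic = [(back_translate(t), t) for t in monolingual_target]
    return parallel_corpus + synthetic

corpus = [("hello world", "hallo welt")]   # genuine parallel data
mono = ["gut welt"]                        # monolingual target-side data
augmented = augment(corpus, mono)
# augmented now holds the original pair plus ("good world", "gut welt")
```

In practice the reverse model is itself an NMT system trained on the same parallel data, and the synthetic pairs are mixed with the real ones when training the forward model.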
Common recurrent neural architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations.
#17 best model for Machine Translation on WMT2014 English-German
Continuous word representations learned separately on distinct languages can be aligned so that their words become comparable in a common space.
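Once two monolingual embedding spaces have been aligned, "words become comparable" concretely means that translation candidates can be retrieved by nearest-neighbour search in the shared space. A minimal sketch, assuming the alignment step has already been applied; the 2-D vectors and the `EN`/`DE` vocabularies are made-up illustrations, not real embeddings.

```python
import math

# Hypothetical embeddings already mapped into a common space.
EN = {"cat": (1.0, 0.1), "dog": (0.9, 0.3)}
DE = {"katze": (0.98, 0.12), "hund": (0.88, 0.31)}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def translate(word, src=EN, tgt=DE):
    """Return the target word whose aligned vector is closest in cosine similarity."""
    v = src[word]
    return max(tgt, key=lambda w: cosine(v, tgt[w]))

print(translate("cat"))  # -> "katze"
```

Real systems learn the alignment itself (e.g. by solving an orthogonal Procrustes problem over a seed dictionary, or adversarially without one) and retrieve neighbours over vocabularies of hundreds of thousands of words.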
State-of-the-art natural language processing systems rely on supervision in the form of annotated data to learn competent models.
Although the Transformer translation model (Vaswani et al., 2017) has achieved state-of-the-art performance on a variety of translation tasks, exploiting document-level context to handle discourse phenomena that remain problematic for the Transformer is still a challenge.
While modern machine translation has relied on large parallel corpora, a recent line of work has managed to train Neural Machine Translation (NMT) systems from monolingual corpora only (Artetxe et al., 2018c; Lample et al., 2018).
Emotion recognition in conversations is crucial for building empathetic machines.
#2 best model for Emotion Recognition in Conversation on IEMOCAP