We therefore propose Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data.
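A minimal sketch of what one such semi-supervised training step could look like, assuming a PyTorch setup with a shared encoder, a full-view primary prediction head, and auxiliary heads that internally restrict their view of the input (all of these names are illustrative placeholders, not the paper's actual API):

```python
import torch
import torch.nn.functional as F

def cvt_step(encoder, primary_head, aux_heads, labeled_batch, unlabeled_batch, optimizer):
    """One Cross-View-style step: supervised loss on labeled data, plus a
    consistency loss training restricted-view auxiliary heads to match the
    full-view primary predictions on unlabeled data."""
    tokens, labels = labeled_batch
    sup_logits = primary_head(encoder(tokens))            # full-view prediction
    loss = F.cross_entropy(sup_logits.transpose(1, 2), labels)

    u_tokens = unlabeled_batch
    with torch.no_grad():                                  # primary module acts as teacher
        target = F.softmax(primary_head(encoder(u_tokens)), dim=-1)
    hidden = encoder(u_tokens)
    for head in aux_heads:                                 # each head sees a restricted view
        aux_log_probs = F.log_softmax(head(hidden), dim=-1)
        loss = loss + F.kl_div(aux_log_probs, target, reduction="batchmean")

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```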
Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs.
An effective method to improve neural machine translation with monolingual data is to augment the parallel training corpus with back-translations of target language sentences.
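A hedged sketch of that augmentation step, assuming a `reverse_model` object that exposes a `translate` method for the target-to-source direction (both names are placeholders for whatever system is actually available):

```python
def augment_with_back_translations(parallel_pairs, target_monolingual, reverse_model):
    """Augment a source->target parallel corpus with synthetic pairs obtained by
    translating monolingual target-language sentences back into the source language."""
    synthetic_pairs = []
    for tgt_sentence in target_monolingual:
        synthetic_src = reverse_model.translate(tgt_sentence)   # target -> source
        synthetic_pairs.append((synthetic_src, tgt_sentence))   # keep the real target side
    return parallel_pairs + synthetic_pairs
```

The key design point is that only the source side is synthetic; the target side remains genuine monolingual text, so the decoder is still trained on fluent target-language sentences.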
Common recurrent neural architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations.
Continuous word representations learned separately on distinct languages can be aligned so that their words become comparable in a common space.
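One common way to perform such an alignment is orthogonal Procrustes over a seed bilingual dictionary. The sketch below assumes row-aligned NumPy matrices `src_vecs` and `tgt_vecs` holding the embeddings of the dictionary pairs (illustrative names, not a specific library's API):

```python
import numpy as np

def procrustes_align(src_vecs, tgt_vecs):
    """Learn an orthogonal map W aligning source-language embeddings to the
    target space, given row-aligned (n_pairs, dim) matrices for a seed
    bilingual dictionary. Closed form: W = U V^T from the SVD of Y^T X."""
    u, _, vt = np.linalg.svd(tgt_vecs.T @ src_vecs)
    w = u @ vt
    return w  # map a source word with: w @ src_vecs[i]
```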
Although the Transformer translation model (Vaswani et al., 2017) has achieved state-of-the-art performance in a variety of translation tasks, how to use document-level context to deal with discourse phenomena that are problematic for the Transformer remains a challenge.
While modern machine translation has relied on large parallel corpora, a recent line of work has managed to train Neural Machine Translation (NMT) systems from monolingual corpora only (Artetxe et al., 2018c; Lample et al., 2018).
In natural language processing, many tasks are successfully solved with recurrent neural networks, but such models have a very large number of parameters.