We therefore propose Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data.
SOTA for CCG Supertagging on CCGBank
An effective method to improve neural machine translation with monolingual data is to augment the parallel training corpus with back-translations of target language sentences.
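The augmentation step described above can be sketched roughly as follows. This is a minimal illustration, not the method of any particular paper: `augment_with_back_translations` and `toy_reverse_model` are hypothetical names, and the dictionary lookup is only a stand-in for a trained target-to-source translation model.

```python
from typing import Callable, List, Tuple

SentencePair = Tuple[str, str]  # (source sentence, target sentence)


def augment_with_back_translations(
    parallel: List[SentencePair],
    monolingual_target: List[str],
    translate_target_to_source: Callable[[str], str],
) -> List[SentencePair]:
    """Return the parallel corpus plus synthetic pairs built from target-side text."""
    # Each monolingual target sentence is translated back into the source
    # language; the synthetic source is paired with the real target sentence.
    synthetic = [(translate_target_to_source(t), t) for t in monolingual_target]
    return parallel + synthetic


def toy_reverse_model(sentence: str) -> str:
    # Assumption: placeholder for a real target-to-source NMT system.
    lexicon = {"Hallo": "hello", "Welt": "world"}
    return " ".join(lexicon.get(token, token) for token in sentence.split())


parallel = [("good morning", "guten Morgen")]
monolingual = ["Hallo Welt"]
augmented = augment_with_back_translations(parallel, monolingual, toy_reverse_model)
```

In this sketch the target side of every synthetic pair is genuine text, and only the source side is machine-generated.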
Continuous word representations learned separately on distinct languages can be aligned so that their words become comparable in a common space.
Common recurrent neural architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations.
#6 best model for Machine Translation on WMT2014 English-German
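The parallelization difficulty mentioned for recurrent architectures comes from each state depending on the previous one. A minimal sketch, assuming a scalar Elman-style recurrence chosen purely for illustration (`run_scalar_rnn`, `w_rec`, and `w_in` are hypothetical names, not taken from the paper):

```python
import math
from typing import List


def run_scalar_rnn(
    inputs: List[float], w_rec: float, w_in: float, h0: float = 0.0
) -> List[float]:
    """Compute h_t = tanh(w_rec * h_{t-1} + w_in * x_t) for every time step."""
    states = []
    h = h0
    for x in inputs:
        # h_t needs h_{t-1}, so the time steps must be computed in order;
        # this loop cannot simply be split across the sequence dimension.
        h = math.tanh(w_rec * h + w_in * x)
        states.append(h)
    return states


states = run_scalar_rnn([1.0, 0.5, -0.25], w_rec=0.5, w_in=1.0)
```

Only the input term `w_in * x` is independent of earlier steps and could be precomputed for all positions at once.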
Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs.
#2 best model for Machine Translation on WMT2016 English-Russian
Although the Transformer translation model (Vaswani et al., 2017) has achieved state-of-the-art performance in a variety of translation tasks, using document-level context to handle discourse phenomena that are problematic for the Transformer remains a challenge.
While modern machine translation has relied on large parallel corpora, a recent line of work has managed to train Neural Machine Translation (NMT) systems from monolingual corpora only (Artetxe et al., 2018c; Lample et al., 2018).
In natural language processing, many tasks are solved successfully with recurrent neural networks, but such models have a huge number of parameters.