de-en (German-English machine translation)
33 papers with code • 1 benchmark • 1 dataset
Most implemented papers
BatchEnsemble: An Alternative Approach to Efficient Ensemble and Lifelong Learning
We also apply BatchEnsemble to lifelong learning, where on Split-CIFAR-100, BatchEnsemble yields performance comparable to progressive neural networks while having much lower computational and memory costs.
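To make the idea concrete, here is a minimal sketch of a BatchEnsemble-style linear layer, assuming the paper's rank-1 factorisation of each member's weight as W_i = W ∘ (r_i s_iᵀ); layer sizes and the way members are routed to examples are illustrative, not the authors' implementation.

```python
# Minimal BatchEnsemble-style linear layer (sketch, not the reference code).
import torch
import torch.nn as nn

class BatchEnsembleLinear(nn.Module):
    def __init__(self, in_features, out_features, num_members):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(in_features, out_features) * 0.02)  # shared W
        self.r = nn.Parameter(torch.ones(num_members, in_features))   # per-member input factor
        self.s = nn.Parameter(torch.ones(num_members, out_features))  # per-member output factor
        self.bias = nn.Parameter(torch.zeros(num_members, out_features))

    def forward(self, x, member_idx):
        # x: (batch, in_features); member_idx: (batch,) selects the ensemble member per example
        r = self.r[member_idx]
        s = self.s[member_idx]
        b = self.bias[member_idx]
        # (x * r) @ W * s == x @ (W * r s^T), without materialising per-member weight matrices
        return (x * r) @ self.weight * s + b

# Usage: each example can be routed to a different member, so the whole
# ensemble runs in a single forward pass.
layer = BatchEnsembleLinear(16, 8, num_members=4)
x = torch.randn(32, 16)
members = torch.randint(0, 4, (32,))
y = layer(x, members)  # (32, 8)
```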
Wide-minima Density Hypothesis and the Explore-Exploit Learning Rate Schedule
Several papers argue that wide minima generalize better than narrow minima.
STACL: Simultaneous Translation with Implicit Anticipation and Controllable Latency using Prefix-to-Prefix Framework
Simultaneous translation, which translates sentences before they are finished, is useful in many scenarios but is notoriously difficult due to word-order differences.
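A minimal sketch of the wait-k prefix-to-prefix decoding policy the paper builds on: the decoder starts after seeing the first k source tokens and thereafter stays k tokens behind the reader. The `predict_next` callable stands in for any NMT model that can score the next target token given a source prefix; it and the value k=3 are illustrative assumptions, not the released code.

```python
# Wait-k prefix-to-prefix decoding (sketch).
def wait_k_decode(source_tokens, predict_next, k=3, max_len=100, eos="</s>"):
    target = []
    while len(target) < max_len:
        # The decoder may only see the first len(target) + k source tokens.
        visible = source_tokens[: len(target) + k]
        next_token = predict_next(visible, target)
        target.append(next_token)
        if next_token == eos:
            break
    return target
```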
A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation
The existing machine translation systems, whether phrase-based or neural, have relied almost exclusively on word-level modelling with explicit segmentation.
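A small illustration of the modelling difference this line of work targets: word-level systems need an explicit segmentation step and a closed vocabulary (with out-of-vocabulary words mapped to a special token), while a character-level model consumes raw characters and has no OOV problem. The toy vocabularies below are assumptions for illustration only.

```python
# Word-level vs. character-level views of the same German sentence (toy example).
sentence = "Die Katze schläft."

# Word-level: explicit segmentation, unknown words fall back to <unk>.
word_vocab = {"Die": 0, "Katze": 1, ".": 2, "<unk>": 3}
tokens = sentence.replace(".", " .").split()
word_ids = [word_vocab.get(tok, word_vocab["<unk>"]) for tok in tokens]
print(word_ids)        # [0, 1, 3, 2]  ("schläft" is out of vocabulary)

# Character-level: no segmentation, every symbol (including "ä") is covered.
char_vocab = {ch: i for i, ch in enumerate(sorted(set(sentence)))}
char_ids = [char_vocab[ch] for ch in sentence]
print(len(char_ids))   # one id per character, spaces included
```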
Fully Character-Level Neural Machine Translation without Explicit Segmentation
We observe that on CS-EN, FI-EN and RU-EN, the quality of the multilingual character-level translation even surpasses the models specifically trained on that language pair alone, both in terms of BLEU score and human judgment.
BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation
The success of bidirectional encoders using masked language models, such as BERT, on numerous natural language processing tasks has prompted researchers to attempt to incorporate these pre-trained models into neural machine translation (NMT) systems.
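A hedged sketch of one common way to incorporate such a pre-trained bidirectional encoder into NMT: use its last-layer hidden states as contextualized input embeddings for the translation encoder. The model name `bert-base-multilingual-cased` and the downstream `nmt_encoder` are placeholders, not the paper's exact setup.

```python
# Contextualized source embeddings from a pre-trained encoder (sketch).
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
bert = AutoModel.from_pretrained("bert-base-multilingual-cased")

def contextual_embeddings(sentences):
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():                      # keep the pre-trained encoder frozen here
        out = bert(**batch)
    return out.last_hidden_state, batch["attention_mask"]

embeds, mask = contextual_embeddings(["Das ist ein Test.", "Guten Morgen!"])
# `embeds` would then replace the learned source embeddings of an NMT encoder,
# e.g. nmt_encoder(inputs_embeds=embeds, attention_mask=mask)  # hypothetical
```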
Hint-Based Training for Non-Autoregressive Machine Translation
Due to the unparallelizable nature of the autoregressive factorization, AutoRegressive Translation (ART) models have to generate tokens sequentially during decoding and thus suffer from high inference latency.
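The latency argument can be seen in a minimal sketch: autoregressive translation needs one model call per output token, whereas a non-autoregressive model predicts all positions in a single parallel call after choosing a target length. Both model functions below are hypothetical placeholders.

```python
# Autoregressive vs. non-autoregressive decoding (sketch).
def autoregressive_decode(src, step_fn, max_len=50, eos="</s>"):
    tgt = []
    for _ in range(max_len):          # up to max_len sequential, unparallelizable calls
        token = step_fn(src, tgt)     # models p(y_t | y_<t, x)
        tgt.append(token)
        if token == eos:
            break
    return tgt

def non_autoregressive_decode(src, parallel_fn, length_fn):
    n = length_fn(src)                # predict the target length first
    return parallel_fn(src, n)        # models p(y_1..y_n | x) in one parallel call
```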
ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation
We propose to train a non-autoregressive machine translation model to minimize the energy defined by a pretrained autoregressive model.
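A hedged sketch of that training idea: the non-autoregressive "inference network" is optimized so that its relaxed output minimizes the energy -log p_AR(y | x) defined by a frozen pretrained autoregressive model. `inference_net` and `ar_energy` are assumed interfaces, not the authors' released implementation.

```python
# One ENGINE-style training step (sketch).
import torch

def engine_step(src, inference_net, ar_energy, optimizer):
    logits = inference_net(src)                 # (batch, tgt_len, vocab)
    soft_y = torch.softmax(logits, dim=-1)      # continuous relaxation of the discrete output
    energy = ar_energy(src, soft_y)             # -log p_AR(soft_y | src); AR model stays frozen
    optimizer.zero_grad()
    energy.mean().backward()                    # gradients flow only into inference_net
    optimizer.step()
    return energy.detach()
```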
Iterative Refinement in the Continuous Space for Non-Autoregressive Neural Machine Translation
Given a continuous latent variable model for machine translation (Shu et al., 2020), we train an inference network to approximate the gradient of the marginal log probability of the target sentence, using only the latent variable as input.
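A minimal sketch of the refinement loop this describes: a learned `delta_fn(z)` approximates the gradient of the marginal target log-probability with respect to the latent z, so decoding repeatedly nudges z in that direction before a final decoding step. `delta_fn`, `decoder`, the step size, and the iteration count are illustrative assumptions.

```python
# Iterative refinement in the continuous latent space (sketch).
def refine_latent(z0, delta_fn, decoder, steps=4, step_size=1.0):
    z = z0                                   # initial latent, e.g. from the prior or encoder
    for _ in range(steps):
        z = z + step_size * delta_fn(z)      # approximate gradient ascent on log p(y | x)
    return decoder(z)                        # map the refined latent to target tokens
```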
Pronoun-Targeted Fine-tuning for NMT with Hybrid Losses
Our sentence-level model shows a 0.5 BLEU improvement on both the WMT14 and the IWSLT13 De-En testsets, while our contextual model achieves the best results, improving from 31.81 to 32 BLEU on the WMT14 De-En testset, and from 32.10 to 33.13 on the IWSLT13 De-En testset, with corresponding improvements in pronoun translation.