Machine Translation

1959 papers with code • 77 benchmarks • 74 datasets

Machine translation is the task of automatically translating text from a source language into a target language.

Approaches to machine translation range from rule-based to statistical to neural. More recently, attention-based encoder-decoder architectures such as the Transformer have driven major improvements in machine translation.

One of the most popular benchmark sources for machine translation systems is the WMT family of datasets. Commonly used evaluation metrics include BLEU, METEOR, and NIST.
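As a rough illustration of how BLEU scores a candidate translation against a reference, here is a minimal single-reference sketch of modified n-gram precision with a brevity penalty (real evaluations should use established tooling such as sacreBLEU, which also handles tokenization and smoothing):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU sketch: geometric mean of modified
    n-gram precisions, scaled by a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        # Floor at a tiny value to avoid log(0); real BLEU uses smoothing.
        log_precisions.append(math.log(max(overlap, 1e-9) / total))
    # The brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)

print(round(bleu("the cat sat on the mat", "the cat sat on the mat"), 3))
```

An exact match scores 1.0; scores fall sharply as n-gram overlap drops or the candidate becomes much shorter than the reference.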

(Image credit: Google seq2seq)



Most implemented papers

Attention Is All You Need

tensorflow/tensor2tensor NeurIPS 2017

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration.
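The building block the Transformer replaces recurrence with is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V. A minimal pure-Python sketch on toy matrices (real implementations use a tensor library and batch over heads):

```python
import math

def softmax(row):
    # Subtract the max for numerical stability before exponentiating.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(k[0])
    scores = matmul(q, [list(col) for col in zip(*k)])  # Q K^T
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return weights, matmul(weights, v)

q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
weights, out = attention(q, k, v)
# Each row of the attention weights is a probability distribution over keys.
print([round(sum(row), 6) for row in weights])  # [1.0, 1.0]
```

Each output row is a convex combination of the value vectors, weighted by how strongly the corresponding query matches each key.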

Neural Machine Translation by Jointly Learning to Align and Translate

graykode/nlp-tutorial 1 Sep 2014

Neural machine translation is a recently proposed approach to machine translation.

Sequence to Sequence Learning with Neural Networks

farizrahman4u/seq2seq NeurIPS 2014

Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.
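At inference time, such encoder-decoder models typically generate the target sequence with a greedy (or beam-search) loop: each step conditions on the encoder's representation and the tokens emitted so far, stopping at an end-of-sequence token. A minimal sketch of the loop, where `toy_next_token` is a hypothetical stand-in for a trained decoder's argmax step:

```python
def greedy_decode(encoder_state, next_token_fn, eos="<eos>", max_len=10):
    """Greedy decoding loop: repeatedly pick the most likely next token,
    conditioned on the encoder state and the tokens emitted so far."""
    output = []
    while len(output) < max_len:
        token = next_token_fn(encoder_state, output)
        if token == eos:
            break
        output.append(token)
    return output

# Hypothetical stand-in for a trained decoder; a real system would run
# a deep LSTM (or Transformer) step and take the argmax over the vocabulary.
def toy_next_token(state, prefix):
    table = {0: "le", 1: "chat", 2: "<eos>"}
    return table[len(prefix)]

print(greedy_decode("encoded source sentence", toy_next_token))
```

Beam search generalizes this by keeping the k highest-scoring partial hypotheses at each step instead of a single one.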

Effective Approaches to Attention-based Neural Machine Translation

philipperemy/keras-attention-mechanism EMNLP 2015

Our ensemble model using different attention architectures has established a new state-of-the-art result in the WMT'15 English to German translation task with 25.9 BLEU points, an improvement of 1.0 BLEU points over the existing best system backed by NMT and an n-gram reranker.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

huggingface/transformers arXiv 2019

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

huggingface/transformers ACL 2020

We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation

graykode/nlp-tutorial 3 Jun 2014

In this paper, we propose a novel neural network model called RNN Encoder-Decoder that consists of two recurrent neural networks (RNNs).

Convolutional Sequence to Sequence Learning

facebookresearch/fairseq ICML 2017

The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks.

An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

locuslab/TCN 4 Mar 2018

Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory.

Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

NVIDIA/DeepLearningExamples 26 Sep 2016

To improve parallelism and therefore decrease training time, our attention mechanism connects the bottom layer of the decoder to the top layer of the encoder.