Low-Resource Neural Machine Translation
23 papers with code • 1 benchmark • 4 datasets
Low-resource machine translation is the task of translating to or from a low-resource language, for which large parallel corpora are typically unavailable.
Latest papers
Low-resource neural machine translation with morphological modeling
A generic attention-augmentation scheme for the transformer model is proposed, allowing the integration of pre-trained language models and facilitating the modeling of word-order relationships between the source and target languages.
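A minimal PyTorch sketch of what such an attention augmentation could look like: a decoder layer that attends over a pre-trained language model's hidden states in addition to the translation encoder's, mixed through a learned gate. All module and variable names here are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class LMAugmentedDecoderLayer(nn.Module):
    """Transformer decoder layer with an extra attention path over the
    hidden states of a pre-trained language model (illustrative sketch)."""
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.enc_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Extra attention over the pre-trained LM's hidden states.
        self.lm_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, 1)  # per-position mix of the two contexts
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                                 nn.Linear(4 * d_model, d_model))
        self.norms = nn.ModuleList(nn.LayerNorm(d_model) for _ in range(3))

    def forward(self, x, enc_out, lm_out, causal_mask=None):
        # Masked self-attention over the target prefix.
        x = self.norms[0](x + self.self_attn(x, x, x, attn_mask=causal_mask)[0])
        # Two cross-attention contexts: translation encoder and pre-trained LM.
        ctx_enc = self.enc_attn(x, enc_out, enc_out)[0]
        ctx_lm = self.lm_attn(x, lm_out, lm_out)[0]
        g = torch.sigmoid(self.gate(torch.cat([ctx_enc, ctx_lm], dim=-1)))
        x = self.norms[1](x + g * ctx_enc + (1 - g) * ctx_lm)
        return self.norms[2](x + self.ffn(x))
```

The gate lets the model decide, per target position, how much to rely on the pre-trained language model versus the translation encoder.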
On Bilingual Lexicon Induction with Large Language Models
Bilingual Lexicon Induction (BLI) is a core task in multilingual NLP that still, to a large extent, relies on calculating cross-lingual word representations.
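The standard BLI recipe the snippet alludes to can be sketched in a few lines: assuming both languages' word embeddings have already been mapped into a shared cross-lingual space, a lexicon is induced by retrieving each source word's nearest target neighbours by cosine similarity. The function names and the plain nearest-neighbour criterion (rather than, e.g., CSLS) are simplifying assumptions.

```python
import numpy as np

def induce_lexicon(src_vecs, tgt_vecs, src_words, tgt_words, k=1):
    """Induce a bilingual lexicon by cosine nearest-neighbour retrieval
    over embeddings already aligned in a shared space."""
    # L2-normalise so the dot product equals cosine similarity.
    s = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    t = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    sims = s @ t.T                           # (|src|, |tgt|) similarity matrix
    best = np.argsort(-sims, axis=1)[:, :k]  # top-k target indices per source word
    return {src_words[i]: [tgt_words[j] for j in best[i]]
            for i in range(len(src_words))}
```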
ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation
In this paper, we propose a novel transfer learning method for NMT, namely ConsistTL, which can continuously transfer knowledge from the parent model during the training of the child model.
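A simplified sketch of the consistency idea described here, assuming both models are callables mapping (source, shifted target) to per-token logits: the frozen parent predicts the same target sentence from a parent-language source, and a KL term pulls the child's output distribution towards the parent's. The function and argument names are hypothetical, not the paper's code.

```python
import torch
import torch.nn.functional as F

def consistency_step(child, parent, src, parent_src, tgt_in, tgt_out, alpha=0.5):
    """One training step: cross-entropy on the child's low-resource pair
    plus a KL consistency term against the frozen parent's prediction."""
    child_logits = child(src, tgt_in)                  # (B, T, V)
    ce = F.cross_entropy(child_logits.transpose(1, 2), tgt_out)
    with torch.no_grad():                              # parent stays frozen
        parent_logits = parent(parent_src, tgt_in)     # same target side
    kl = F.kl_div(F.log_softmax(child_logits, dim=-1),
                  F.softmax(parent_logits, dim=-1),
                  reduction="batchmean")
    return ce + alpha * kl                             # alpha weights the consistency term
```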
Low-resource Neural Machine Translation with Cross-modal Alignment
How can we achieve neural machine translation with limited parallel data?
On the Complementarity between Pre-Training and Random-Initialization for Resource-Rich Machine Translation
Pre-Training (PT) of text representations has been successfully applied to low-resource Neural Machine Translation (NMT).
Geographical Distance Is The New Hyperparameter: A Case Study Of Finding The Optimal Pre-trained Language For English-isiZulu Machine Translation
Given the limited availability of datasets and textual resources for low-resource languages such as isiZulu, there is a significant need to harness knowledge from pre-trained models to improve low-resource machine translation.
Linguistically-driven Multi-task Pre-training for Low-resource Neural Machine Translation
In the present study, we propose novel sequence-to-sequence pre-training objectives for low-resource neural machine translation (NMT): Japanese-specific sequence-to-sequence (JASS) for language pairs involving Japanese as the source or target language, and English-specific sequence-to-sequence (ENSS) for language pairs involving English.
Sicilian Translator: A Recipe for Low-Resource NMT
With 17,000 pairs of Sicilian-English translated sentences, Arba Sicula developed the first neural machine translator for the Sicilian language.
Rethinking Data Augmentation for Low-Resource Neural Machine Translation: A Multi-Task Learning Approach
Many data augmentation (DA) approaches aim to expand the support of the empirical data distribution by generating new sentence pairs that contain infrequent words, bringing it closer to the true data distribution of parallel sentences.
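A toy sketch of this style of DA, with a crude frequency heuristic standing in for the alignment- and language-model-based filtering real methods use: new target sides are produced by substituting a random token with one from the rare tail of the vocabulary. Everything here is a simplifying assumption made for illustration.

```python
import random
from collections import Counter

def augment(pairs, rare_fraction=0.1, seed=0):
    """Create new (src, tgt) pairs by injecting infrequent target words."""
    rng = random.Random(seed)
    counts = Counter(w for _, tgt in pairs for w in tgt.split())
    vocab = [w for w, _ in counts.most_common()]           # frequent -> rare
    rare = vocab[int(len(vocab) * (1 - rare_fraction)):]   # least-frequent tail
    new_pairs = []
    for src, tgt in pairs:
        toks = tgt.split()
        if toks and rare:
            # Replace one random token with a rare word (real methods would
            # check fluency and alignment before keeping the pair).
            toks[rng.randrange(len(toks))] = rng.choice(rare)
            new_pairs.append((src, " ".join(toks)))
    return new_pairs
```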
Meta-Curriculum Learning for Domain Adaptation in Neural Machine Translation
Meta-learning has been shown to be beneficial for low-resource neural machine translation (NMT).
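As a reference point, here is a first-order MAML-style meta-training step of the kind such work builds on, sketched in PyTorch: the model adapts on a sampled domain's support set, is evaluated on that domain's query set, and the resulting gradients update the shared initialisation. The loss_fn signature and the deep-copy of fast weights are assumptions for illustration, not the paper's implementation.

```python
import copy
import torch

def meta_step(model, loss_fn, support, query, meta_opt, inner_lr=1e-4):
    """One first-order MAML step: adapt a copy on `support`, score it on
    `query`, and apply the query gradients to the shared initialisation."""
    learner = copy.deepcopy(model)                 # task-specific fast weights
    inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
    inner_opt.zero_grad()
    loss_fn(learner, support).backward()           # adapt on the support set
    inner_opt.step()
    query_loss = loss_fn(learner, query)           # evaluate the adaptation
    grads = torch.autograd.grad(query_loss, learner.parameters())
    meta_opt.zero_grad()
    for p, g in zip(model.parameters(), grads):    # first-order: copy grads back
        p.grad = g.clone()
    meta_opt.step()
    return query_loss.item()
```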