Word Alignment

82 papers with code • 7 benchmarks • 4 datasets

Word Alignment is the task of finding the correspondence between source and target words in a pair of sentences that are translations of each other.

Source: Neural Network-based Word Alignment through Score Aggregation
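As a minimal illustration of the task definition above, an alignment can be stored as a set of (source index, target index) pairs and scored against a reference with F1. The sketch below assumes the `i-j` index-pair notation often called Pharaoh format. The function names `parse_pharaoh` and `alignment_f1` and the example sentence pair are hypothetical, chosen for illustration only, and are not taken from any paper listed here.

```python
def parse_pharaoh(line):
    """Parse 'i-j' pairs into a set of (source_index, target_index) tuples."""
    return {tuple(int(k) for k in pair.split("-")) for pair in line.split()}


def alignment_f1(predicted, gold):
    """F1 between a predicted and a gold alignment, both given as sets of index pairs."""
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical German-English sentence pair (0-based word indices).
source = ["das", "Haus", "ist", "klein"]
target = ["the", "house", "is", "small"]

gold = parse_pharaoh("0-0 1-1 2-2 3-3")
predicted = parse_pharaoh("0-0 1-1 2-3 3-3")  # one wrong link: "ist" -> "small"

score = alignment_f1(predicted, gold)
```

Here three of the four predicted links match the reference, so precision and recall are both 0.75. Published evaluations may additionally distinguish sure from possible links, which this sketch does not model.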

Most implemented papers

Word Translation Without Parallel Data

facebookresearch/MUSE ICLR 2018

We finally describe experiments on the English-Esperanto low-resource language pair, on which there only exists a limited amount of parallel data, to show the potential impact of our method in fully unsupervised machine translation.

Addressing the Rare Word Problem in Neural Machine Translation

atpaino/deep-text-corrector IJCNLP 2015

Our experiments on the WMT14 English to French translation task show that this method provides a substantial improvement of up to 2.8 BLEU points over an equivalent NMT system that does not use this technique.

SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings

masoudjs/simalign Findings of the Association for Computational Linguistics 2020

We find that alignments created from embeddings are superior for four and comparable for two language pairs compared to those produced by traditional statistical aligners, even with abundant parallel data; e.g., contextualized embeddings achieve a word alignment F1 for English-German that is 5 percentage points higher than eflomal, a high-quality statistical aligner, trained on 100k parallel sentences.

End-to-End Slot Alignment and Recognition for Cross-Lingual NLU

amazon-research/multiatis EMNLP 2020

We introduce MultiATIS++, a new multilingual NLU corpus that extends the Multilingual ATIS corpus to nine languages across four language families, and evaluate our method using the corpus.

Word Alignment by Fine-tuning Embeddings on Parallel Corpora

neulab/awesome-align EACL 2021

In addition, we demonstrate that we are able to train multilingual word aligners that can obtain robust performance on different language pairs.

Sparse Attention with Linear Units

bzhangGo/zero EMNLP 2021

Recently, it has been argued that encoder-decoder models can be made more interpretable by replacing the softmax function in the attention with its sparse variants.

PortaSpeech: Portable and High-Quality Generative Text-to-Speech

natspeech/natspeech NeurIPS 2021

Non-autoregressive text-to-speech (NAR-TTS) models such as FastSpeech 2 and Glow-TTS can synthesize high-quality speech from the given text in parallel.

ChrEnTranslate: Cherokee-English Machine Translation Demo with Quality Estimation and Corrective Feedback

ZhangShiyue/ChrEn ACL 2021

The quantitative evaluation demonstrates that our backbone translation models achieve state-of-the-art translation performance and our quality estimation well correlates with both BLEU and human judgment.

Revitalize Region Feature for Democratizing Video-Language Pre-training of Retrieval

showlab/demovlp 15 Mar 2022

Recent dominant methods for video-language pre-training (VLP) learn transferable representations from the raw pixels in an end-to-end manner to achieve advanced performance on downstream video-language retrieval.

WSPAlign: Word Alignment Pre-training via Large-Scale Weakly Supervised Span Prediction

qiyuw/wspalign 9 Jun 2023

Most existing word alignment methods rely on manual alignment datasets or parallel corpora, which limits their usefulness.