Word Alignment
84 papers with code • 7 benchmarks • 4 datasets
Word Alignment is the task of finding the correspondence between source and target words in a pair of sentences that are translations of each other.
Source: Neural Network-based Word Alignment through Score Aggregation
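As a minimal illustration of the task (not any specific paper's method), word alignment can be sketched as finding, for each source word, the most similar target word under some word-embedding space. The vectors below are hand-made stand-ins; in practice one would use contextual multilingual embeddings (e.g. from a multilingual encoder):

```python
import numpy as np

# Toy example: aligning "the house" (EN) with "das Haus" (DE).
# These 2-d vectors are illustrative stand-ins for real embeddings.
src_words = ["the", "house"]
tgt_words = ["das", "Haus"]
src_vecs = np.array([[1.0, 0.1], [0.1, 1.0]])
tgt_vecs = np.array([[0.9, 0.2], [0.2, 0.9]])

def cosine_sim_matrix(a, b):
    """Pairwise cosine similarity between row vectors of a and b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

sim = cosine_sim_matrix(src_vecs, tgt_vecs)
# Greedy alignment: link each source word to its most similar target word.
alignment = [(i, int(np.argmax(sim[i]))) for i in range(len(src_words))]
for i, j in alignment:
    print(src_words[i], "->", tgt_words[j])
# -> the -> das
# -> house -> Haus
```

Real aligners replace the greedy argmax with bidirectional agreement, thresholding, or optimal-transport matching over the similarity matrix.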
Latest papers
Unbalanced Optimal Transport for Unbalanced Word Alignment
Monolingual word alignment is crucial to model semantic interactions between sentences.
Do GPTs Produce Less Literal Translations?
For the task of Machine Translation (MT), multiple works have investigated few-shot prompting mechanisms to elicit better translations from LLMs.
Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents
Automatically highlighting words that cause semantic differences between two documents could be useful for a wide range of applications.
AlignAtt: Using Attention-based Audio-Translation Alignments as a Guide for Simultaneous Speech Translation
Attention is the core mechanism of today's most widely used architectures for natural language processing, and it has been analyzed from many perspectives, including its effectiveness for machine translation-related tasks.
Low-resource Bilingual Dialect Lexicon Induction with Large Language Models
Bilingual word lexicons are crucial tools for multilingual natural language understanding and machine translation tasks, as they facilitate the mapping of words in one language to their synonyms in another language.
Meeting the Needs of Low-Resource Languages: The Value of Automatic Alignments via Pretrained Models
However, the languages most in need of automatic alignment are low-resource and, thus, not typically included in the pretraining data.
Multilingual Sentence Transformer as A Multilingual Word Aligner
In this paper, we investigate whether the multilingual sentence Transformer LaBSE is a strong multilingual word aligner.
Noisy Parallel Data Alignment
An ongoing challenge in natural language processing is that its major advances disproportionately favor resource-rich languages, leaving many under-resourced languages behind.
Learning To Generate Language-Supervised and Open-Vocabulary Scene Graph Using Pre-Trained Visual-Semantic Space
Specifically, cheap scene graph supervision data can be obtained by parsing image language descriptions into semantic graphs.
Frustratingly Easy Label Projection for Cross-lingual Transfer
Translating training data into many languages has emerged as a practical solution for improving cross-lingual transfer.