Bilingual Lexicon Induction
34 papers with code • 0 benchmarks • 0 datasets
Translate words from one language to another.
Latest papers with no code
How Lexical is Bilingual Lexicon Induction?
In contemporary machine learning approaches to bilingual lexicon induction (BLI), a model learns a mapping between the embedding spaces of a language pair.
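For intuition, here is a minimal sketch of that mapping-based paradigm, assuming the common orthogonal-Procrustes setup with a seed dictionary (a generic recipe, not this paper's exact method): learn a rotation between the two embedding spaces from seed translation pairs, then translate by nearest neighbor.

```python
# Minimal mapping-based BLI sketch (generic recipe, not any single paper's):
# fit an orthogonal map W between two embedding spaces from seed pairs via
# the Procrustes solution, then translate by cosine nearest neighbor.
import numpy as np

def procrustes_map(X, Y):
    """X, Y: (n_pairs, dim) embeddings of seed translation pairs.
    Returns the orthogonal W minimizing ||X @ W - Y||_F."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def translate(src_vec, W, tgt_matrix, tgt_words):
    """Map a source vector into the target space and return the
    nearest target word by cosine similarity."""
    q = src_vec @ W
    q /= np.linalg.norm(q)
    tgt_norm = tgt_matrix / np.linalg.norm(tgt_matrix, axis=1, keepdims=True)
    return tgt_words[int(np.argmax(tgt_norm @ q))]

# Toy usage: random matrices stand in for real (e.g., fastText) embeddings.
rng = np.random.default_rng(0)
X, Y = rng.normal(size=(100, 50)), rng.normal(size=(100, 50))
W = procrustes_map(X, Y)
```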
Self-Augmented In-Context Learning for Unsupervised Word Translation
Recent work has shown that, while large language models (LLMs) demonstrate strong word translation or bilingual lexicon induction (BLI) capabilities in few-shot setups, they still cannot match the performance of 'traditional' mapping-based approaches in the unsupervised scenario where no seed translation pairs are available, especially for lower-resource languages.
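To make the setup concrete, a hypothetical few-shot BLI prompt might look like the sketch below; in a self-augmented setup, the in-context pairs would be harvested from the model's own high-confidence zero-shot predictions rather than hand-written as here.

```python
# Hypothetical few-shot word-translation prompt. The seed pairs here are
# hand-written placeholders; a self-augmented approach would bootstrap
# them from the model's own zero-shot outputs.
def build_prompt(seed_pairs, query, src="German", tgt="English"):
    lines = [f"Translate the following {src} words into {tgt}."]
    lines += [f"{src}: {s} => {tgt}: {t}" for s, t in seed_pairs]
    lines.append(f"{src}: {query} => {tgt}:")
    return "\n".join(lines)

print(build_prompt([("Hund", "dog"), ("Katze", "cat")], "Vogel"))
```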
Multilingual Word Embeddings for Low-Resource Languages using Anchors and a Chain of Related Languages
In this paper, we propose to build multilingual word embeddings (MWEs) via a novel language chain-based approach that incorporates intermediate related languages to bridge the gap between the distant source and target languages.
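One illustrative way to realize such a chain (a sketch under the assumption that each adjacent pair is aligned with Procrustes, not the paper's actual algorithm) is to fit one map per adjacent language pair and compose them, so source vectors are carried step by step into the target space:

```python
# Illustrative chain alignment sketch (assumes Procrustes per adjacent
# pair; not the paper's exact algorithm): compose per-link maps so that
# source embeddings are carried through related languages to the target.
import numpy as np

def procrustes(X, Y):
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def chain_map(seed_pairs_along_chain):
    """seed_pairs_along_chain: list of (X_i, Y_i) seed-pair matrices, one
    per adjacent language pair in the chain (src->rel1, rel1->rel2, ...).
    Returns the composed map from source space to target space."""
    W = None
    for X, Y in seed_pairs_along_chain:
        W_i = procrustes(X, Y)          # maps language i into language i+1
        W = W_i if W is None else W @ W_i
    return W
```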
Quantized Wasserstein Procrustes Alignment of Word Embedding Spaces
Optimal Transport (OT) provides a useful geometric framework to estimate the permutation matrix under unsupervised cross-lingual word embedding (CLWE) models that pose the alignment task as a Wasserstein-Procrustes problem.
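A minimal sketch of the underlying Wasserstein-Procrustes alternation (omitting the quantization step this paper adds): alternate between a Sinkhorn-based soft permutation of the vocabularies and the orthogonal Procrustes rotation.

```python
# Wasserstein-Procrustes sketch (quantization omitted): alternate between
# an entropy-regularized OT plan P (soft permutation) and the orthogonal
# Procrustes rotation W minimizing ||X @ W - P @ Y||_F.
import numpy as np

def sinkhorn(C, reg=0.1, n_iter=100):
    """Entropy-regularized OT plan for cost matrix C, uniform marginals."""
    a = np.full(C.shape[0], 1.0 / C.shape[0])
    b = np.full(C.shape[1], 1.0 / C.shape[1])
    K = np.exp(-C / reg)
    v = np.ones(C.shape[1])
    for _ in range(n_iter):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def wasserstein_procrustes(X, Y, n_outer=10, reg=0.1):
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    Y = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    W = np.eye(X.shape[1])
    for _ in range(n_outer):
        C = -(X @ W) @ Y.T                  # negative cosine similarity as cost
        P = sinkhorn(C - C.min(), reg=reg)  # shift keeps exp() well-behaved
        U, _, Vt = np.linalg.svd(X.T @ (P @ Y))
        W = U @ Vt
    return W, P

# Toy usage with random stand-in embeddings.
rng = np.random.default_rng(0)
X, Y = rng.normal(size=(200, 40)), rng.normal(size=(200, 40))
W, P = wasserstein_procrustes(X, Y)
```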
Domain Mismatch Doesn't Always Prevent Cross-Lingual Transfer Learning
Cross-lingual transfer learning without labeled target language data or parallel text has been surprisingly effective in zero-shot cross-lingual classification, question answering, unsupervised machine translation, etc.
Bilingual Lexicon Induction for Low-Resource Languages using Graph Matching via Optimal Transport
Bilingual lexicons form a critical component of various natural language processing applications, including unsupervised and semi-supervised machine translation and cross-lingual information retrieval.
Massively Multilingual Lexical Specialization of Multilingual Transformers
While pretrained language models (PLMs) primarily serve as general-purpose text encoders that can be fine-tuned for a wide variety of downstream tasks, recent work has shown that they can also be rewired to produce high-quality word representations (i.e., static word embeddings) and yield good performance in type-level lexical tasks.
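A sketch of the basic extraction step behind such rewiring, assuming the common recipe of encoding each word in isolation and mean-pooling its subword states (the paper's contrastive lexical specialization of the encoder is omitted here):

```python
# Sketch: derive type-level (static) word vectors from a PLM by encoding
# each word in isolation and mean-pooling its subword hidden states.
# The contrastive fine-tuning on lexical constraints is omitted.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased").eval()

@torch.no_grad()
def static_embedding(word):
    enc = tok(word, return_tensors="pt")
    hidden = model(**enc).last_hidden_state[0]  # (n_subwords + 2, dim)
    return hidden[1:-1].mean(dim=0)             # drop [CLS]/[SEP], mean-pool

vec_dog = static_embedding("dog")
vec_hund = static_embedding("Hund")
```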
Isomorphic Cross-lingual Embeddings for Low-Resource Languages
Following this, we use joint training methods to develop CLWEs for the related language and the target embedding space.
Unsupervised Alignment of Distributional Word Embeddings
Cross-domain alignment plays a key role in tasks ranging from machine translation to transfer learning.
Towards a Broad Coverage Named Entity Resource: A Data-Efficient Approach for Many Diverse Languages
Prior work on extracting multilingual named entity (MNE) datasets from parallel corpora required resources such as large monolingual corpora or word aligners that are unavailable or perform poorly for under-resourced languages.