Bilingual Lexicon Induction

34 papers with code • 0 benchmarks • 0 datasets

Translate words from one language to another.

Latest papers with no code

How Lexical is Bilingual Lexicon Induction?

no code yet • 5 Apr 2024

In contemporary machine learning approaches to bilingual lexicon induction (BLI), a model learns a mapping between the embedding spaces of a language pair.
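A minimal sketch of that mapping-based setup, assuming pre-trained monolingual embeddings and a small seed dictionary (all names here are illustrative): learn an orthogonal map with the closed-form Procrustes solution, then translate by nearest neighbour in the target space.

```python
import numpy as np

def procrustes_map(X_src, Y_tgt):
    """Learn an orthogonal W minimizing ||X_src @ W - Y_tgt||_F.

    X_src, Y_tgt: (n, d) arrays of embeddings for n seed translation
    pairs; the closed-form solution comes from an SVD.
    """
    U, _, Vt = np.linalg.svd(X_src.T @ Y_tgt)
    return U @ Vt  # (d, d) orthogonal matrix

def translate(src_vec, W, Y_all, tgt_words, k=5):
    """Return the k nearest target words to a mapped source vector."""
    mapped = src_vec @ W
    sims = (Y_all @ mapped) / (
        np.linalg.norm(Y_all, axis=1) * np.linalg.norm(mapped) + 1e-9)
    return [tgt_words[i] for i in np.argsort(-sims)[:k]]
```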

Self-Augmented In-Context Learning for Unsupervised Word Translation

no code yet • 15 Feb 2024

Recent work has shown that, while large language models (LLMs) demonstrate strong word translation or bilingual lexicon induction (BLI) capabilities in few-shot setups, they still cannot match the performance of 'traditional' mapping-based approaches in the unsupervised scenario where no seed translation pairs are available, especially for lower-resource languages.
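The few-shot setup being contrasted is easy to sketch. Below is a hypothetical prompt builder (not the paper's exact template); in the self-augmented variant, the in-context pairs would be high-confidence translations the LLM itself produced in an earlier zero-shot pass rather than gold seed pairs.

```python
def build_bli_prompt(src_word, examples, src_lang="German", tgt_lang="English"):
    """Assemble a few-shot word-translation prompt for an LLM.

    examples: (source_word, target_word) pairs used as in-context
    demonstrations; in a self-augmented setup these are the model's own
    high-confidence zero-shot translations.
    """
    lines = [f"Translate the following {src_lang} words into {tgt_lang}."]
    for s, t in examples:
        lines.append(f"{src_lang}: {s} -> {tgt_lang}: {t}")
    lines.append(f"{src_lang}: {src_word} -> {tgt_lang}:")
    return "\n".join(lines)
```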

Multilingual Word Embeddings for Low-Resource Languages using Anchors and a Chain of Related Languages

no code yet • 21 Nov 2023

In this paper, we propose to build multilingual word embeddings (MWEs) via a novel language chain-based approach that incorporates intermediate related languages to bridge the gap between the distant source and target.
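One way to read the chain idea, as a hedged sketch (illustrative, not the paper's anchor-based method): rather than mapping a distant source directly onto the target, learn a Procrustes map for each adjacent pair of related languages along the chain and compose the maps.

```python
import numpy as np

def chain_align(matrices, seed_dicts):
    """Compose per-hop orthogonal maps along a language chain.

    matrices: embedding matrices [X_0, ..., X_k] of equal dimensionality,
    ordered from the source through related intermediates to the target.
    seed_dicts: per adjacent pair, (rows_in_X_i, rows_in_X_{i+1}) index
    arrays of seed translations.
    Returns one matrix mapping the source space into the target space.
    """
    W_total = np.eye(matrices[0].shape[1])
    for X, Y, (i_src, i_tgt) in zip(matrices[:-1], matrices[1:], seed_dicts):
        U, _, Vt = np.linalg.svd(X[i_src].T @ Y[i_tgt])
        W_total = W_total @ (U @ Vt)  # hop-by-hop composition
    return W_total
```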

Quantized Wasserstein Procrustes Alignment of Word Embedding Spaces

no code yet • AMTA 2022

Optimal Transport (OT) provides a useful geometric framework to estimate the permutation matrix under unsupervised cross-lingual word embedding (CLWE) models that pose the alignment task as a Wasserstein-Procrustes problem.
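A compact sketch of the (unquantized) Wasserstein-Procrustes loop such models build on, assuming length-normalized embedding matrices; the quantization the paper adds is omitted. The loop alternates between estimating a soft word matching with entropic OT and re-solving the orthogonal Procrustes problem under that matching.

```python
import numpy as np

def sinkhorn(M, reg=0.1, n_iters=200):
    """Entropic OT plan for cost matrix M with uniform marginals."""
    n, m = M.shape
    a, b = np.full(n, 1 / n), np.full(m, 1 / m)
    K = np.exp(-M / reg)
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v + 1e-16)
        v = b / (K.T @ u + 1e-16)
    return u[:, None] * K * v[None, :]

def wasserstein_procrustes(X, Y, n_rounds=10):
    """Alternate OT matching and Procrustes alignment.

    X, Y: (n, d) length-normalized embedding matrices (e.g. the n most
    frequent words of each language). Returns the orthogonal map W and
    the soft matching P.
    """
    W, P = np.eye(X.shape[1]), None
    for _ in range(n_rounds):
        cost = -(X @ W) @ Y.T            # negative dot-product similarity
        P = sinkhorn(cost - cost.min())  # soft permutation estimate
        U, _, Vt = np.linalg.svd(X.T @ (P @ Y))
        W = U @ Vt                       # Procrustes step given matching
    return W, P
```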

Domain Mismatch Doesn't Always Prevent Cross-Lingual Transfer Learning

no code yet • 30 Nov 2022

Cross-lingual transfer learning without labeled target language data or parallel text has been surprisingly effective in zero-shot cross-lingual classification, question answering, unsupervised machine translation, etc.

Bilingual Lexicon Induction for Low-Resource Languages using Graph Matching via Optimal Transport

no code yet • 25 Oct 2022

Bilingual lexicons form a critical component of various natural language processing applications, including unsupervised and semi-supervised machine translation and cross-lingual information retrieval.
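In the same spirit (a hedged sketch, not the paper's exact pipeline), graph matching with optimal transport can be tried directly with the POT library's Gromov-Wasserstein solver, which aligns the two vocabularies by comparing intra-language distance structure and so never needs the two spaces to share dimensions.

```python
import numpy as np
import ot  # POT: Python Optimal Transport (pip install pot)

def match_graphs(X, Y):
    """Match two monolingual embedding graphs via Gromov-Wasserstein OT.

    X, Y: (n, d1) and (m, d2) embedding matrices. Returns a coupling
    matrix whose (i, j) entry is the match strength between source
    word i and target word j.
    """
    C1 = ot.dist(X, X)  # pairwise intra-language cost matrices
    C2 = ot.dist(Y, Y)
    C1 /= C1.max()
    C2 /= C2.max()
    p = np.full(len(X), 1 / len(X))  # uniform word distributions
    q = np.full(len(Y), 1 / len(Y))
    return ot.gromov.gromov_wasserstein(C1, C2, p, q, 'square_loss')
```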

Massively Multilingual Lexical Specialization of Multilingual Transformers

no code yet • 1 Aug 2022

While pretrained language models (PLMs) primarily serve as general-purpose text encoders that can be fine-tuned for a wide variety of downstream tasks, recent work has shown that they can also be rewired to produce high-quality word representations (i.e., static word embeddings) and yield good performance in type-level lexical tasks.
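The specialization itself is a fine-tuning procedure not reproduced here; the sketch below only illustrates the simpler "rewiring" step, pulling a decontextualized type-level vector out of a PLM with Hugging Face transformers by encoding the word in isolation.

```python
import torch
from transformers import AutoModel, AutoTokenizer

def static_word_vector(word, model, tokenizer):
    """Decontextualized word vector: encode the word on its own and
    mean-pool its subword states, skipping special tokens."""
    tok = tokenizer(word, return_tensors="pt", return_special_tokens_mask=True)
    special = tok.pop("special_tokens_mask")[0].bool()
    with torch.no_grad():
        hidden = model(**tok).last_hidden_state[0]  # (seq_len, dim)
    return hidden[~special].mean(dim=0)

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased").eval()
vec = static_word_vector("house", model, tokenizer)  # (768,) tensor
```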

Isomorphic Cross-lingual Embeddings for Low-Resource Languages

no code yet • RepL4NLP (ACL) 2022

Following this, we use joint training methods to develop CLWEs for the related language and the target embedding space.

Unsupervised Alignment of Distributional Word Embeddings

no code yet • 9 Mar 2022

Cross-domain alignment plays a key role in tasks ranging from machine translation to transfer learning.

Towards a Broad Coverage Named Entity Resource: A Data-Efficient Approach for Many Diverse Languages

no code yet • LREC 2022

Prior work on extracting multilingual named entity (MNE) datasets from parallel corpora required resources such as large monolingual corpora or word aligners that are unavailable or perform poorly for under-resourced languages.