Bilingual Lexicon Induction
34 papers with code • 0 benchmarks • 0 datasets
Translate words from one language to another.
Most implemented papers
Hubless Nearest Neighbor Search for Bilingual Lexicon Induction
Recent advances in BLI work by aligning the word embedding spaces of the two languages.
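Hubness is the tendency of a few target words to become the nearest neighbor of disproportionately many source words, which degrades retrieval accuracy. This paper introduces its own hubless criterion (HNN); as background only, here is a minimal NumPy sketch of Cross-domain Similarity Local Scaling (CSLS), a widely used hubness correction from Conneau et al. (2018), not the paper's HNN method. The function name and embedding matrices are illustrative, and rows are assumed L2-normalised.

```python
import numpy as np

def csls_scores(src_emb, tgt_emb, k=10):
    """CSLS retrieval scores: penalise target words that are near neighbors
    of many source words ("hubs"). Rows of src_emb (n_src, d) and
    tgt_emb (n_tgt, d) are assumed L2-normalised, so dot products are cosines."""
    sims = src_emb @ tgt_emb.T                              # (n_src, n_tgt)
    # mean similarity of each source word to its k nearest target words
    r_src = np.sort(sims, axis=1)[:, -k:].mean(axis=1, keepdims=True)
    # mean similarity of each target word to its k nearest source words
    r_tgt = np.sort(sims, axis=0)[-k:, :].mean(axis=0, keepdims=True)
    return 2 * sims - r_src - r_tgt                         # CSLS(x, y)

# translation of source word i: csls_scores(src_emb, tgt_emb).argmax(axis=1)[i]
```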
Bilingual Lexicon Induction through Unsupervised Machine Translation
A recent line of research has obtained strong results on bilingual lexicon induction by aligning independently trained word embeddings in two languages, then using the resulting cross-lingual embeddings to induce word translation pairs through nearest neighbor or related retrieval methods.
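Concretely, once the two spaces are aligned, the baseline retrieval step named here reduces to a cosine nearest-neighbor lookup. A minimal sketch, where the embedding matrices and vocabulary lists are hypothetical placeholders:

```python
import numpy as np

def induce_lexicon(src_emb, tgt_emb, src_words, tgt_words):
    """Induce a word-translation lexicon by cosine nearest neighbor.

    src_emb: (n_src, d) source embeddings already mapped into the target space.
    tgt_emb: (n_tgt, d) target embeddings.
    """
    # L2-normalise rows so dot products equal cosine similarities
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    nn = (src @ tgt.T).argmax(axis=1)       # index of nearest target word
    return {src_words[i]: tgt_words[j] for i, j in enumerate(nn)}
```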
Bilingual Lexicon Induction with Semi-supervision in Non-Isometric Embedding Spaces
We then propose Bilingual Lexicon Induction with Semi-Supervision (BLISS), a semi-supervised approach that relaxes the isometric assumption while leveraging both limited aligned bilingual lexicons and a larger set of unaligned word embeddings, as well as a novel hubness filtering technique.
Do We Really Need Fully Unsupervised Cross-Lingual Embeddings?
A series of bilingual lexicon induction (BLI) experiments with 15 diverse languages (210 language pairs) shows that fully unsupervised CLWE methods still fail for a large number of language pairs (e.g., they yield zero BLI performance for 87/210 pairs).
Refinement of Unsupervised Cross-Lingual Word Embeddings
In this paper, we propose a self-supervised method to refine the alignment of unsupervised bilingual word embeddings.
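One standard way to refine such an alignment is iterative Procrustes self-learning: induce a provisional dictionary from the current mapping, re-fit an orthogonal map on those pairs, and repeat. The sketch below shows one such iteration under that assumption; it illustrates the generic refinement loop, not necessarily this paper's exact method, and the function names are hypothetical.

```python
import numpy as np

def procrustes(X, Y):
    """Orthogonal W minimising ||XW - Y||_F, solved in closed form via SVD."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def refine_once(src_emb, tgt_emb, W):
    """One self-learning step: induce a dictionary from the current mapping W,
    then re-estimate W by Procrustes on the induced pairs."""
    sims = (src_emb @ W) @ tgt_emb.T
    nn = sims.argmax(axis=1)                  # provisional translations
    return procrustes(src_emb, tgt_emb[nn])   # refit on induced pairs
```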
Semi-Supervised Bilingual Lexicon Induction with Two-way Interaction
In this paper, we propose a new semi-supervised BLI framework that encourages interaction between the supervised signal and the unsupervised alignment.
Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora
We propose a new approach for learning contextualised cross-lingual word embeddings based on a small parallel corpus (e.g., a few hundred sentence pairs).
Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation
Successful methods for unsupervised neural machine translation (UNMT) employ cross-lingual pretraining via self-supervision, often in the form of a masked language modeling or sequence generation task, which requires the model to align the lexical-level and high-level representations of the two languages.
Cross-Lingual Word Embedding Refinement by $\ell_{1}$ Norm Optimisation
It is therefore recommended that this $\ell_1$ refinement strategy be adopted as a standard for CLWE methods.
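The strategy in question swaps the usual squared ($\ell_2$) Procrustes loss for an $\ell_1$ reconstruction loss, which is less sensitive to outlier translation pairs. Below is a minimal subgradient-descent sketch of that objective, assuming paired seed matrices X and Y and a re-projection onto the orthogonal group after each step; it illustrates the loss, not the authors' optimisation algorithm.

```python
import numpy as np

def l1_refine(X, Y, W, steps=100, lr=0.1):
    """Refine a mapping W by descending the L1 loss ||XW - Y||_1,
    re-projecting onto the orthogonal group after each step."""
    for _ in range(steps):
        grad = X.T @ np.sign(X @ W - Y)     # subgradient of the L1 loss
        W = W - lr * grad / len(X)
        U, _, Vt = np.linalg.svd(W)         # nearest orthogonal matrix to W
        W = U @ Vt
    return W
```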