Cross-Lingual Word Embeddings
31 papers with code • 0 benchmarks • 0 datasets
Libraries
Use these libraries to find Cross-Lingual Word Embeddings models and implementations.
Most implemented papers
A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings Based on Graph Modularity
Cross-lingual word embeddings encode the meaning of words from different languages into a shared low-dimensional space.
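A common way to obtain such a shared space is to map one language's pretrained vectors onto another's with an orthogonal transformation learned from a seed dictionary (orthogonal Procrustes). The sketch below illustrates the idea on synthetic data; the dimensions, vectors, and "hidden rotation" are illustrative assumptions, not taken from any of the papers listed here.

```python
# Illustrative sketch: aligning two monolingual embedding spaces via
# orthogonal Procrustes. All data here is synthetic.
import numpy as np

rng = np.random.default_rng(0)
d = 5                                          # embedding dimension
X = rng.normal(size=(20, d))                   # "source-language" vectors
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # hidden ground-truth rotation
Y = X @ Q                                      # "target-language" vectors

# Closed-form solution of  W = argmin ||XW - Y||_F  s.t.  W^T W = I,
# obtained from the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

print(np.allclose(X @ W, Y))   # True: the mapping recovers the rotation
```

Because the mapping is constrained to be orthogonal, it preserves distances and angles within each language's space while placing translation pairs close together in the shared space.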
Interactive Refinement of Cross-Lingual Word Embeddings
Cross-lingual word embeddings transfer knowledge between languages: models trained on high-resource languages can predict in low-resource languages.
Should All Cross-Lingual Embeddings Speak English?
Most recent work on cross-lingual word embeddings is severely Anglocentric.
Machine Translation with Cross-lingual Word Embeddings
Learning word embeddings from distributional information is a well-studied task, with many approaches reported in the literature.
Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation
Breaking down the structure of long texts into semantically coherent segments makes the texts more readable and supports downstream applications like summarization and retrieval.
Refinement of Unsupervised Cross-Lingual Word Embeddings
In this paper, we propose a self-supervised method to refine the alignment of unsupervised bilingual word embeddings.
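Refinement methods of this kind typically alternate between inducing a bilingual dictionary from the current mapping and re-fitting the mapping on the induced pairs. The following is a minimal sketch of that self-learning loop on synthetic data, using nearest-neighbour dictionary induction and a Procrustes re-fit; it is an assumption-laden illustration of the general scheme, not this paper's exact method.

```python
# Sketch of iterative refinement of a bilingual mapping: induce a
# dictionary by nearest neighbours under the current map, then re-fit
# an orthogonal map on the induced pairs. Synthetic, illustrative data.
import numpy as np

rng = np.random.default_rng(1)
d, n = 4, 30
X = rng.normal(size=(n, d))                   # source vectors
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # hidden ground-truth rotation
Y = X @ Q                                     # target vectors (same order)

W = np.eye(d)                                 # start from a rough (identity) map
for _ in range(3):
    # 1) induce a dictionary: each mapped source vector -> nearest target
    sims = (X @ W) @ Y.T
    nn = sims.argmax(axis=1)
    # 2) re-fit the map on the induced pairs (orthogonal Procrustes)
    U, _, Vt = np.linalg.svd(X.T @ Y[nn])
    W = U @ Vt

acc = (nn == np.arange(n)).mean()             # fraction of correct pairs induced
print(f"induced-dictionary accuracy: {acc:.2f}")
```

Each re-fit keeps the mapping orthogonal, and in practice the loop is run until the induced dictionary stops changing.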
MultiSeg: Parallel Data and Subword Information for Learning Bilingual Embeddings in Low Resource Scenarios
Our results show that our method that leverages subword information outperforms the model without subword information, both in intrinsic and extrinsic evaluations of the learned embeddings.
Cross-Lingual Word Embeddings for Turkic Languages
Our experiments confirm that the obtained bilingual dictionaries outperform previously-available ones, and that word embeddings from a low-resource language can benefit from resource-rich closely-related languages when they are aligned together.
Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora
We propose a new approach for learning contextualised cross-lingual word embeddings based on a small parallel corpus (e.g., a few hundred sentence pairs).
Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval
In this work, we present a systematic empirical study of the suitability of state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval across a large number of language pairs.