Cross-Lingual Bitext Mining

5 papers with code • 4 benchmarks • 1 dataset

Cross-lingual bitext mining is the task of mining sentence pairs that are translations of each other from large text corpora.
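As a rough illustration of the task, one common approach embeds sentences from both languages in a shared vector space and pairs each source sentence with its nearest target sentence. The sketch below assumes that setup and is not the method of any specific system listed on this page. The toy three-dimensional vectors, the `mine_bitext` helper, and the 0.9 threshold are all hypothetical stand-ins; a real pipeline would take embeddings from a multilingual encoder and would typically use a more robust scoring scheme than a raw cosine cutoff.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mine_bitext(src, tgt, threshold=0.9):
    """Pair each source sentence with its most similar target sentence.

    src and tgt map sentence text to an embedding vector (toy values here).
    A pair is kept only if its cosine similarity reaches the threshold,
    so sentences without a translation on the other side are dropped.
    """
    pairs = []
    for s_text, s_vec in src.items():
        best_text, best_score = None, -1.0
        for t_text, t_vec in tgt.items():
            score = cosine(s_vec, t_vec)
            if score > best_score:
                best_text, best_score = t_text, score
        if best_text is not None and best_score >= threshold:
            pairs.append((s_text, best_text))
    return pairs


# Hypothetical toy embeddings standing in for a multilingual encoder's output.
german = {
    "Guten Morgen.": [0.9, 0.1, 0.0],
    "Ich mag Tee.": [0.1, 0.9, 0.1],
    "Es regnet.": [0.0, 0.1, 0.9],  # no translation in the English set
}
english = {
    "Good morning.": [0.88, 0.12, 0.02],
    "I like tea.": [0.12, 0.85, 0.1],
    "The train is late.": [0.6, 0.0, 0.6],  # no translation in the German set
}

pairs = mine_bitext(german, english)
for de, en in pairs:
    print(f"{de}\t{en}")
```

With these toy vectors the two true translation pairs are kept, while "Es regnet." is discarded because its best match falls below the threshold. The exhaustive comparison is quadratic in corpus size, so mining at scale would need an approximate nearest-neighbor index instead.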

Benchmarks

These leaderboards track progress in cross-lingual bitext mining.

Dataset	Best Model
BUCC German-to-English	Massively Multilingual Sentence Embeddings
BUCC French-to-English	Massively Multilingual Sentence Embeddings
BUCC Russian-to-English	Massively Multilingual Sentence Embeddings
BUCC Chinese-to-English	Massively Multilingual Sentence Embeddings

Libraries

Use these libraries to find cross-lingual bitext mining models and implementations.

facebookresearch/LASER • 2 papers • 3,519 stars

Datasets

BUCC

Latest papers with no code

Low-Resource Machine Translation Training Curriculum Fit for Low-Resource Languages

no code yet • 24 Mar 2021

We conduct an empirical study of neural machine translation (NMT) for truly low-resource languages, and propose a training curriculum fit for cases when both parallel training data and compute resources are lacking, reflecting the reality of most of the world's languages and the researchers working on these languages.
