The ELITR ECA Corpus

15 Sep 2021 · Philip Williams, Barry Haddow

We present the ELITR ECA corpus, a multilingual corpus derived from publications of the European Court of Auditors. We use automatic translation together with Bleualign to identify parallel sentence pairs in all 506 translation directions. The result is a corpus comprising 264k document pairs and 41.9M sentence pairs.
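
The figure of 506 translation directions can be checked with a little arithmetic. The sketch below assumes that a "translation direction" is an ordered pair of two distinct languages, so that n languages give n(n-1) directions; under that assumption 506 directions correspond to 23 languages. The function names and the three-language example list are illustrative only and are not taken from the paper.

```python
from itertools import permutations


def translation_directions(languages):
    """Return every ordered (source, target) pair of distinct languages."""
    return list(permutations(languages, 2))


def languages_for_directions(n_directions):
    """Solve n * (n - 1) == n_directions for a positive integer n.

    Returns None when no whole number of languages yields that count.
    """
    n = 1
    while n * (n - 1) < n_directions:
        n += 1
    return n if n * (n - 1) == n_directions else None


# Hypothetical example: three language codes yield 3 * 2 = 6 directions.
example_pairs = translation_directions(["en", "de", "fr"])

# Under the ordered-pair assumption, 506 directions imply 23 languages.
implied_language_count = languages_for_directions(506)
```

Counting ordered pairs rather than unordered ones treats, for example, English to German and German to English as separate directions, which is the reading consistent with 506 = 23 × 22.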


Datasets


Introduced in the Paper:

ELITR ECA
