WikiMatrix is a dataset of parallel sentences mined from the textual content of Wikipedia for all possible language pairs. The mined data consists of: