…It contains over 10M segments of multilingual open data. The data has been collected from sites allowing free use and reuse of its content, as well as from Public Sector web sites.
2 PAPERS • NO BENCHMARKS YET
…This dataset consists in 4,000 English segments (4,500 tokens) that have been translated into each of 26 low-resource languages, as well as three higher-resource pivot languages (es, fr, hi).
1 PAPER • NO BENCHMARKS YET