XTREME (Cross-Lingual Transfer Evaluation of Multilingual Encoders)

Introduced by Hu et al. in XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization

The Cross-lingual TRansfer Evaluation of Multilingual Encoders (XTREME) benchmark was introduced to encourage more research on multilingual transfer learning. XTREME covers 40 typologically diverse languages spanning 12 language families and includes 9 tasks that require reasoning about different levels of syntax or semantics.

The languages in XTREME are selected to maximize language diversity, coverage in existing tasks, and availability of training data. Among these are many under-studied languages, such as the Dravidian languages Tamil (spoken in southern India, Sri Lanka, and Singapore), Telugu and Malayalam (spoken mainly in southern India), and the Niger-Congo languages Swahili and Yoruba, spoken in Africa.

License


  • Unknown
