The KLEJ benchmark (Kompleksowa Lista Ewaluacji Językowych) is a set of nine evaluation tasks for Polish language understanding.
Key benchmark features:
It contains a diverse set of tasks from different domains and with different objectives.
Most tasks are created from existing datasets, but the authors also released a new sentiment analysis dataset from the e-commerce domain.
It includes tasks that have relatively small datasets and require extensive external knowledge to solve. As a result, it promotes the use of transfer learning instead of training separate models from scratch.
The name KLEJ (Polish for "glue") is an acronym of Kompleksowa Lista Ewaluacji Językowych (English: Comprehensive List of Language Evaluations) and refers to the English-language GLUE benchmark.