KLUE (Korean Language Understanding Evaluation)

Introduced by Park et al. in KLUE: Korean Language Understanding Evaluation

The Korean Language Understanding Evaluation (KLUE) benchmark is a collection of datasets for evaluating the natural language understanding capability of Korean language models. KLUE consists of 8 diverse and representative tasks, which are accessible to anyone without restrictions. With ethical considerations in mind, we deliberately design annotation guidelines to obtain unambiguous annotations for all datasets. Furthermore, we build an evaluation system and carefully choose evaluation metrics for every task, thus establishing fair comparison across Korean language models.

The KLUE benchmark is composed of 8 tasks:

  • Topic Classification (TC)
  • Semantic Textual Similarity (STS)
  • Natural Language Inference (NLI)
  • Named Entity Recognition (NER)
  • Relation Extraction (RE)
  • (Part-Of-Speech) + Dependency Parsing (DP)
  • Machine Reading Comprehension (MRC)
  • Dialogue State Tracking (DST)
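
As a rough illustration of how the eight tasks might be accessed programmatically, the sketch below maps each task abbreviation to a dataset configuration name. The configuration names (`ynat`, `wos`, and so on) and the `klue` dataset identifier are assumptions based on the copy hosted on the Hugging Face Hub, not something stated on this page, and the `load_klue_task` helper is hypothetical. Treat it as a starting point rather than an official loader.

```python
# Assumed mapping from KLUE task abbreviations to configuration names of the
# "klue" dataset on the Hugging Face Hub (not stated in the text above).
KLUE_CONFIGS = {
    "TC": "ynat",   # Topic Classification
    "STS": "sts",   # Semantic Textual Similarity
    "NLI": "nli",   # Natural Language Inference
    "NER": "ner",   # Named Entity Recognition
    "RE": "re",     # Relation Extraction
    "DP": "dp",     # Dependency Parsing
    "MRC": "mrc",   # Machine Reading Comprehension
    "DST": "wos",   # Dialogue State Tracking
}


def load_klue_task(task: str, split: str = "validation"):
    """Hypothetical helper: load one KLUE task by its abbreviation.

    Requires the `datasets` package and network access.
    """
    if task not in KLUE_CONFIGS:
        raise KeyError(f"unknown KLUE task: {task!r}")
    # Imported lazily so the mapping is usable without the library installed.
    from datasets import load_dataset
    return load_dataset("klue", KLUE_CONFIGS[task], split=split)
```

Calling `load_klue_task("NLI")` would then fetch the validation split of the NLI task, assuming the configuration names above match the hosted dataset.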



