LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

Laws and their interpretations, legal arguments and agreements are typically expressed in writing, leading to the production of vast corpora of legal text. Their analysis, which is at the center of legal practice, becomes increasingly elaborate as these collections grow in size. Natural language understanding (NLU) technologies can be a valuable tool to support legal practitioners in these endeavors. Their usefulness, however, largely depends on whether current state-of-the-art models can generalize across various tasks in the legal domain. To answer this currently open question, we introduce the Legal General Language Understanding Evaluation (LexGLUE) benchmark, a collection of datasets for evaluating model performance across a diverse set of legal NLU tasks in a standardized way. We also provide an evaluation and analysis of several generic and legal-oriented models demonstrating that the latter consistently offer performance improvements across multiple tasks.

ACL 2022 · PDF · Abstract

Datasets


Introduced in the Paper:

LexGLUE

Used in the Paper:

GLUE, SuperGLUE, ECHR, ECtHR, CaseHOLD

Results from the Paper


Task: Natural Language Understanding · Dataset: LexGLUE
Metric values are micro-F1 / macro-F1, as reported in the LexGLUE paper; CaseHOLD is scored with a single F1 value.

Model          Benchmark       Metric Value   Global Rank
BERT           ECtHR Task A    71.4 / 64.0    # 1
BERT           ECtHR Task B    87.6 / 77.8    # 4
BERT           SCOTUS          70.5 / 60.9    # 6
BERT           EUR-LEX         71.6 / 55.6    # 5
BERT           LEDGAR          87.7 / 82.2    # 5
BERT           UNFAIR-ToS      87.5 / 81.0    # 5
BERT           CaseHOLD        70.7           # 6
CaseLaw-BERT   ECtHR Task A    71.2 / 64.2    # 2
CaseLaw-BERT   ECtHR Task B    88.0 / 77.5    # 1
CaseLaw-BERT   SCOTUS          76.4 / 66.2    # 8
CaseLaw-BERT   EUR-LEX         71.0 / 55.9    # 6
CaseLaw-BERT   LEDGAR          88.0 / 82.3    # 1
CaseLaw-BERT   UNFAIR-ToS      88.3 / 81.0    # 1
CaseLaw-BERT   CaseHOLD        75.6           # 1
Legal-BERT     ECtHR Task A    71.2 / 64.6    # 2
Legal-BERT     ECtHR Task B    88.0 / 77.2    # 1
Legal-BERT     SCOTUS          76.2 / 65.8    # 1
Legal-BERT     EUR-LEX         72.2 / 56.2    # 1
Legal-BERT     LEDGAR          88.1 / 82.7    # 8
Legal-BERT     UNFAIR-ToS      88.6 / 82.3    # 7
Legal-BERT     CaseHOLD        75.1           # 2
BigBird        ECtHR Task A    70.5 / 63.8    # 4
BigBird        ECtHR Task B    88.1 / 76.6    # 8
BigBird        SCOTUS          71.7 / 61.4    # 4
BigBird        EUR-LEX         71.8 / 56.6    # 3
BigBird        LEDGAR          87.7 / 82.1    # 5
BigBird        UNFAIR-ToS      87.7 / 80.2    # 2
BigBird        CaseHOLD        70.4           # 7
Longformer     ECtHR Task A    69.6 / 62.4    # 5
Longformer     ECtHR Task B    88.0 / 77.8    # 1
Longformer     SCOTUS          72.2 / 62.5    # 3
Longformer     EUR-LEX         71.9 / 56.7    # 2
Longformer     LEDGAR          87.7 / 82.3    # 5
Longformer     UNFAIR-ToS      87.7 / 80.1    # 2
Longformer     CaseHOLD        72.0           # 4
DeBERTa        ECtHR Task A    69.1 / 61.2    # 7
DeBERTa        ECtHR Task B    87.4 / 77.3    # 5
DeBERTa        SCOTUS          70.0 / 60.0    # 7
DeBERTa        EUR-LEX         72.3 / 57.2    # 8
DeBERTa        LEDGAR          87.9 / 82.0    # 3
DeBERTa        UNFAIR-ToS      87.2 / 78.8    # 6
DeBERTa        CaseHOLD        72.1           # 3
RoBERTa        ECtHR Task A    69.5 / 60.7    # 6
RoBERTa        ECtHR Task B    87.2 / 77.3    # 6
RoBERTa        SCOTUS          70.8 / 61.2    # 5
RoBERTa        EUR-LEX         71.8 / 57.5    # 3
RoBERTa        LEDGAR          87.9 / 82.1    # 3
RoBERTa        UNFAIR-ToS      87.7 / 81.5    # 2
RoBERTa        CaseHOLD        71.7           # 5
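The paired scores above combine two averaging schemes over per-label F1. A minimal pure-Python sketch of the difference between micro- and macro-averaging for multi-label classification (the function name and toy label sets are illustrative, not from the benchmark code):

```python
from collections import defaultdict

def micro_macro_f1(gold, pred, labels):
    """Compute micro- and macro-averaged F1 for multi-label predictions.

    `gold` and `pred` are parallel lists of label sets; `labels` is the
    full label space of the task.
    """
    tp = defaultdict(int)
    fp = defaultdict(int)
    fn = defaultdict(int)
    for g, p in zip(gold, pred):
        for lab in labels:
            if lab in p and lab in g:
                tp[lab] += 1      # correctly predicted label
            elif lab in p:
                fp[lab] += 1      # predicted but not in gold
            elif lab in g:
                fn[lab] += 1      # in gold but missed
    # Micro-F1 pools the counts over all labels before computing F1,
    # so frequent labels dominate the score.
    tps, fps, fns = sum(tp.values()), sum(fp.values()), sum(fn.values())
    denom = 2 * tps + fps + fns
    micro = 2 * tps / denom if denom else 0.0
    # Macro-F1 averages per-label F1, weighting rare labels equally.
    per_label = []
    for lab in labels:
        d = 2 * tp[lab] + fp[lab] + fn[lab]
        per_label.append(2 * tp[lab] / d if d else 0.0)
    macro = sum(per_label) / len(per_label)
    return micro, macro
```

Because macro-F1 weights every label equally, it is typically the lower of the two numbers on tasks with skewed label distributions, which matches the pattern in the table.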

Methods


No methods listed for this paper.