TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Information Retrieval	BSARD	Two-tower Bi-Encoder (RoBERTa)	Recall@100	74.78	# 1
Information Retrieval	BSARD	Two-tower Bi-Encoder (RoBERTa)	Recall@200	78.04	# 2
Information Retrieval	BSARD	Two-tower Bi-Encoder (RoBERTa)	Recall@500	83.39	# 2
Information Retrieval	BSARD	Siamese Bi-Encoder (RoBERTa)	Recall@100	71.63	# 2
Information Retrieval	BSARD	Siamese Bi-Encoder (RoBERTa)	Recall@200	78.38	# 1
Information Retrieval	BSARD	Siamese Bi-Encoder (RoBERTa)	Recall@500	83.77	# 1
Information Retrieval	BSARD	BM25	Recall@100	51.33	# 3
Information Retrieval	BSARD	BM25	Recall@200	56.78	# 3
Information Retrieval	BSARD	BM25	Recall@500	64.71	# 3


A Statutory Article Retrieval Dataset in French

ACL 2022 · Antoine Louis, Gerasimos Spanakis

Statutory article retrieval is the task of automatically retrieving law articles relevant to a legal question. While recent advances in natural language processing have sparked considerable interest in many legal tasks, statutory article retrieval remains primarily untouched due to the scarcity of large-scale and high-quality annotated datasets. To address this bottleneck, we introduce the Belgian Statutory Article Retrieval Dataset (BSARD), which consists of 1,100+ French native legal questions labeled by experienced jurists with relevant articles from a corpus of 22,600+ Belgian law articles. Using BSARD, we benchmark several state-of-the-art retrieval approaches, including lexical and dense architectures, both in zero-shot and supervised setups. We find that fine-tuned dense retrieval models significantly outperform other systems. Our best performing baseline achieves 74.8% R@100, which is promising for the feasibility of the task and indicates there is still room for improvement. By the specificity of the domain and addressed task, BSARD presents a unique challenge problem for future research on legal information retrieval. Our dataset and source code are publicly available.
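
The R@k figures quoted above and in the results table are recall at cutoff k. As a minimal sketch, assuming the usual per-question definition (the share of a question's relevant articles found among the top k retrieved, averaged over questions and reported as a percentage), it could be computed as follows. The function names and the toy article IDs are hypothetical and do not come from the paper's code.

```python
from typing import Dict, List, Set


def recall_at_k(ranked_ids: List[int], relevant_ids: Set[int], k: int) -> float:
    """Fraction of a question's relevant articles that appear in the top-k results."""
    if not relevant_ids:
        raise ValueError("each question needs at least one relevant article")
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


def mean_recall_at_k(
    rankings: Dict[str, List[int]],
    relevance: Dict[str, Set[int]],
    k: int,
) -> float:
    """Average recall@k over all questions, as a percentage (assumed convention)."""
    scores = [recall_at_k(rankings[q], relevance[q], k) for q in relevance]
    return 100.0 * sum(scores) / len(scores)


# Toy data (hypothetical): ranked article IDs per question, and gold labels.
rankings = {"q1": [5, 3, 9, 1], "q2": [2, 7, 4, 8]}
relevance = {"q1": {3, 1}, "q2": {8, 6}}

print(mean_recall_at_k(rankings, relevance, k=2))
print(mean_recall_at_k(rankings, relevance, k=4))
```

Because recall can only grow as k increases, each model's scores in the table rise from Recall@100 to Recall@500.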

Code

maastrichtlawtech/bsard official

Tasks

Information Retrieval

Retrieval

Specificity

Datasets

Introduced in the Paper:

BSARD

Results from the Paper

Ranked #1 on Information Retrieval on BSARD

Methods

CBoW Word2Vec • fastText • RoBERTa
