Search Results for author: Joel Niklaus

Found 16 papers, 12 papers with code

Towards Explainability and Fairness in Swiss Judgement Prediction: Benchmarking on a Multilingual Dataset

no code implementations 26 Feb 2024 Santosh T. Y. S. S, Nina Baumgartner, Matthias Stürmer, Matthias Grabmair, Joel Niklaus

The assessment of explainability in Legal Judgement Prediction (LJP) systems is of paramount importance in building trustworthy and transparent systems, particularly considering the reliance of these systems on factors that may lack legal relevance or involve sensitive attributes.

Benchmarking, Cross-Lingual Transfer

LegalLens: Leveraging LLMs for Legal Violation Identification in Unstructured Text

1 code implementation 6 Feb 2024 Dor Bernsohn, Gil Semo, Yaron Vazana, Gila Hayat, Ben Hagag, Joel Niklaus, Rohit Saha, Kyryl Truskovskyi

In this study, we focus on two main tasks, the first for detecting legal violations within unstructured textual data, and the second for associating these violations with potentially affected individuals.

Experimental Design

Automatic Anonymization of Swiss Federal Supreme Court Rulings

no code implementations 7 Oct 2023 Joel Niklaus, Robin Mamié, Matthias Stürmer, Daniel Brunner, Marcel Gygli

Releasing court decisions to the public relies on proper anonymization to protect all involved parties, where necessary.

Anonymity at Risk? Assessing Re-Identification Capabilities of Large Language Models

1 code implementation 22 Aug 2023 Alex Nyffenegger, Matthias Stürmer, Joel Niklaus

Anonymity of both natural and legal persons in court rulings is a critical aspect of privacy protection in the European Union and Switzerland.

SCALE: Scaling up the Complexity for Advanced Language Model Evaluation

2 code implementations 15 Jun 2023 Vishvaksenan Rasiah, Ronja Stern, Veton Matoshi, Matthias Stürmer, Ilias Chalkidis, Daniel E. Ho, Joel Niklaus

In this paper, we introduce a novel NLP benchmark that poses challenges to current LLMs across four key dimensions: processing long documents (up to 50K tokens), utilizing domain specific knowledge (embodied in legal texts), multilingual understanding (covering five languages), and multitasking (comprising legal document to document Information Retrieval, Court View Generation, Leading Decision Summarization, Citation Extraction, and eight challenging Text Classification tasks).

Information Retrieval, Language Modelling

MultiLegalPile: A 689GB Multilingual Legal Corpus

no code implementations 3 Jun 2023 Joel Niklaus, Veton Matoshi, Matthias Stürmer, Ilias Chalkidis, Daniel E. Ho

Large, high-quality datasets are crucial for training Large Language Models (LLMs).

MultiLegalSBD: A Multilingual Legal Sentence Boundary Detection Dataset

1 code implementation 2 May 2023 Tobias Brugger, Matthias Stürmer, Joel Niklaus

Sentence Boundary Detection (SBD) is one of the foundational building blocks of Natural Language Processing (NLP), with incorrectly split sentences heavily influencing the output quality of downstream tasks.

Boundary Detection, Sentence

LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain

1 code implementation 30 Jan 2023 Joel Niklaus, Veton Matoshi, Pooja Rani, Andrea Galassi, Matthias Stürmer, Ilias Chalkidis

To provide a fair comparison, we propose two aggregate scores, one based on the datasets and one on the languages.

XLM-R

BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch?

no code implementations 30 Nov 2022 Joel Niklaus, Daniele Giofré

In specialized domains though (such as legal, scientific or biomedical), models often need to process very long text (sometimes well above 10000 tokens).

Language Modelling

ClassActionPrediction: A Challenging Benchmark for Legal Judgment Prediction of Class Action Cases in the US

1 code implementation 1 Nov 2022 Gil Semo, Dor Bernsohn, Ben Hagag, Gila Hayat, Joel Niklaus

The research field of Legal Natural Language Processing (NLP) has been very active recently, with Legal Judgment Prediction (LJP) becoming one of the most extensively studied tasks.

An Empirical Study on Cross-X Transfer for Legal Judgment Prediction

2 code implementations 25 Sep 2022 Joel Niklaus, Matthias Stürmer, Ilias Chalkidis

We find that in both settings (legal areas, origin regions), models trained across all groups perform overall better, while they also have improved results in the worst-case scenarios.

Cross-Lingual Transfer, Transfer Learning

Swiss-Judgment-Prediction: A Multilingual Legal Judgment Prediction Benchmark

1 code implementation EMNLP (NLLP) 2021 Joel Niklaus, Ilias Chalkidis, Matthias Stürmer

We evaluate state-of-the-art BERT-based methods including two variants of BERT that overcome the BERT input (text) length limitation (up to 512 tokens).
