no code implementations • 13 Mar 2024 • Benjamin Roth, Pedro Henrique Luz de Araujo, Yuxi Xia, Saskia Kaltenbrunner, Christoph Korab
Machine learning (ML) and artificial intelligence (AI) approaches are often criticized for their inherent bias and for their lack of control, accountability, and transparency.
no code implementations • 14 Nov 2023 • Pedro Henrique Luz de Araujo, Benjamin Roth
A core aspect of our analysis is to measure the effect that including a set of specifications has on a held-out set of unseen, qualitatively different specifications.
1 code implementation • 22 May 2023 • Pedro Henrique Luz de Araujo, Benjamin Roth
In behavioural testing, system functionalities underrepresented in the standard evaluation setting (with a held-out test set) are validated through controlled input-output pairs.
1 code implementation • nlppower (ACL) 2022 • Pedro Henrique Luz de Araujo, Benjamin Roth
Behavioural testing -- verifying system capabilities by validating human-designed input-output pairs -- is an alternative evaluation method of natural language processing systems proposed to address the shortcomings of the standard approach: computing metrics on held-out data.
1 code implementation • 1 Dec 2020 • Pedro Henrique Luz de Araujo, Teófilo Emidio de Campos
The data consist of a corpus of 45,532 lawsuits manually annotated by the Court’s experts with theme labels, a multi-class and multi-label classification task.
1 code implementation • LREC 2020 • Pedro Henrique Luz de Araujo, Teófilo Emídio de Campos, Fabricio Ataides Braz, Nilton Correia da Silva
This paper describes VICTOR, a novel dataset built from Brazil's Supreme Court digitalized legal documents, composed of more than 45 thousand appeals, which includes roughly 692 thousand documents (about 4.6 million pages).
Ranked #1 on Multi-Label Text Classification on MVICTOR (theme)
1 code implementation • International Conference on Computational Processing of the Portuguese Language 2020 • Pedro Henrique Luz de Araujo, Teófilo Emidio de Campos, Marcelo Magalhães Silva de Sousa
Official Gazettes are a rich source of relevant information to the public.
Ranked #1 on Text Classification on DODF Data