Search Results for author: Luiz Bonifacio

Found 8 papers, 8 papers with code

NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation

1 code implementation • 18 Dec 2023 • Nandan Thakur, Luiz Bonifacio, Xinyu Zhang, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Boxing Chen, Mehdi Rezagholizadeh, Jimmy Lin

We measure LLM robustness using two metrics: (i) hallucination rate, measuring model tendency to hallucinate an answer, when the answer is not present in passages in the non-relevant subset, and (ii) error rate, measuring model inaccuracy to recognize relevant passages in the relevant subset.

Hallucination, Language Modelling, +2 more

InPars Toolkit: A Unified and Reproducible Synthetic Data Generation Pipeline for Neural Information Retrieval

1 code implementation • 10 Jul 2023 • Hugo Abonizio, Luiz Bonifacio, Vitor Jeronymo, Roberto Lotufo, Jakub Zavrel, Rodrigo Nogueira

Our toolkit not only reproduces the InPars method and partially reproduces Promptagator, but also provides a plug-and-play functionality allowing the use of different LLMs, exploring filtering methods and finetuning various reranker models on the generated data.

Information Retrieval, Retrieval, +1 more

InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval

1 code implementation • 4 Jan 2023 • Vitor Jeronymo, Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Roberto Lotufo, Jakub Zavrel, Rodrigo Nogueira

Recently, InPars introduced a method to efficiently use large language models (LLMs) in information retrieval tasks: via few-shot examples, an LLM is induced to generate relevant queries for documents.

Information Retrieval, Retrieval

In Defense of Cross-Encoders for Zero-Shot Retrieval

1 code implementation • 12 Dec 2022 • Guilherme Rosa, Luiz Bonifacio, Vitor Jeronymo, Hugo Abonizio, Marzieh Fadaee, Roberto Lotufo, Rodrigo Nogueira

We find that the number of parameters and early query-document interactions of cross-encoders play a significant role in the generalization ability of retrieval models.

Retrieval

No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval

1 code implementation • 6 Jun 2022 • Guilherme Moraes Rosa, Luiz Bonifacio, Vitor Jeronymo, Hugo Abonizio, Marzieh Fadaee, Roberto Lotufo, Rodrigo Nogueira

This has made distilled and dense models, due to latency constraints, the go-to choice for deployment in real-world retrieval applications.

Ranked #1 on Citation Prediction on SciDocs (BEIR)

Argument Retrieval, Biomedical Information Retrieval, +9 more

Billions of Parameters Are Worth More Than In-domain Training Data: A case study in the Legal Case Entailment Task

1 code implementation • 30 May 2022 • Guilherme Moraes Rosa, Luiz Bonifacio, Vitor Jeronymo, Hugo Abonizio, Roberto Lotufo, Rodrigo Nogueira

Recent work has shown that language models scaled to billions of parameters, such as GPT-3, perform remarkably well in zero-shot and few-shot scenarios.

Language Modelling

InPars: Data Augmentation for Information Retrieval using Large Language Models

1 code implementation • 10 Feb 2022 • Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira

In this work, we harness the few-shot capabilities of large pretrained language models as synthetic data generators for IR tasks.

Data Augmentation, Information Retrieval, +2 more

mMARCO: A Multilingual Version of the MS MARCO Passage Ranking Dataset

1 code implementation • 31 Aug 2021 • Luiz Bonifacio, Vitor Jeronymo, Hugo Queiroz Abonizio, Israel Campiotti, Marzieh Fadaee, Roberto Lotufo, Rodrigo Nogueira

In this work, we present mMARCO, a multilingual version of the MS MARCO passage ranking dataset comprising 13 languages that was created using machine translation.

Information Retrieval, Machine Translation, +4 more
