Search Results for author: Thales Sales Almeida

Found 6 papers, 3 papers with code

Measuring Cross-lingual Transfer in Bytes

2 code implementations • 12 Apr 2024 • Leandro Rodrigues de Souza, Thales Sales Almeida, Roberto Lotufo, Rodrigo Nogueira

We also found evidence that this transfer is not related to language contamination or language proximity, which strengthens the hypothesis that the model also relies on language-agnostic knowledge.

Cross-Lingual Transfer

Sabiá-2: A New Generation of Portuguese Large Language Models

no code implementations • 14 Mar 2024 • Thales Sales Almeida, Hugo Abonizio, Rodrigo Nogueira, Ramon Pires

We introduce Sabiá-2, a family of large language models trained on Portuguese texts.

Math

Evaluating GPT-4's Vision Capabilities on Brazilian University Admission Exams

1 code implementation • 23 Nov 2023 • Ramon Pires, Thales Sales Almeida, Hugo Abonizio, Rodrigo Nogueira

Recent advancements in language models have showcased human-comparable performance in academic entrance exams.

BLUEX: A benchmark based on Brazilian Leading Universities Entrance eXams

1 code implementation • 11 Jul 2023 • Thales Sales Almeida, Thiago Laitz, Giovana K. Bonás, Rodrigo Nogueira

One common trend in recent studies of language models (LMs) is the use of standardized tests for evaluation.

Natural Language Understanding

Sabiá: Portuguese Large Language Models

no code implementations • 16 Apr 2023 • Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira

By evaluating on datasets originally conceived in the target language as well as translated ones, we study the contributions of language-specific pretraining in terms of 1) capturing linguistic nuances and structures inherent to the target language, and 2) enriching the model's knowledge about a domain or culture.

Cultural Vocal Bursts Intensity Prediction

NeuralSearchX: Serving a Multi-billion-parameter Reranker for Multilingual Metasearch at a Low Cost

no code implementations • 26 Oct 2022 • Thales Sales Almeida, Thiago Laitz, João Seródio, Luiz Henrique Bonifacio, Roberto Lotufo, Rodrigo Nogueira

We compare our system with Microsoft's Biomedical Search and show that our design choices led to a much more cost-effective system with competitive QPS while achieving close to state-of-the-art results on a wide range of public benchmarks.

Retrieval

