no code implementations • 3 Jan 2025 • Roseval Malaquias Junior, Ramon Pires, Thales Sales Almeida, Kenzo Sakiyama, Roseli Romero, Rodrigo Nogueira
To compare general and specialized training, we filtered a web-based dataset to extract legal domain data.
no code implementations • 15 Oct 2024 • Hugo Abonizio, Thales Sales Almeida, Thiago Laitz, Roseval Malaquias Junior, Giovana Kerche Bonás, Rodrigo Nogueira, Ramon Pires
This report presents Sabiá-3, our new flagship language model, and Sabiazinho-3, a more cost-effective sibling.
no code implementations • 26 Mar 2024 • Roseval Malaquias Junior, Ramon Pires, Roseli Romero, Rodrigo Nogueira
This study contributes to the growing body of scientific evidence showing that pretraining data selection may enhance the performance of large language models, enabling the exploration of these models at a lower cost.
no code implementations • 14 Mar 2024 • Thales Sales Almeida, Hugo Abonizio, Rodrigo Nogueira, Ramon Pires
We introduce Sabiá-2, a family of large language models trained on Portuguese texts.
1 code implementation • 23 Nov 2023 • Ramon Pires, Thales Sales Almeida, Hugo Abonizio, Rodrigo Nogueira
Recent advancements in language models have showcased human-comparable performance in academic entrance exams.
no code implementations • 16 Apr 2023 • Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira
By evaluating on datasets originally conceived in the target language as well as translated ones, we study the contributions of language-specific pretraining in terms of 1) capturing linguistic nuances and structures inherent to the target language, and 2) enriching the model's knowledge about a domain or culture.
1 code implementation • 29 Mar 2023 • Desnes Nunes, Ricardo Primi, Ramon Pires, Roberto Lotufo, Rodrigo Nogueira
The present study aims to explore the capabilities of Language Models (LMs) in tackling high-stakes multiple-choice tests, represented here by the Exame Nacional do Ensino Médio (ENEM), a multidisciplinary entrance examination widely adopted by Brazilian universities.
1 code implementation • 14 Jan 2022 • Ramon Pires, Fábio C. de Souza, Guilherme Rosa, Roberto A. Lotufo, Rodrigo Nogueira
A typical information extraction pipeline consists of token- or span-level classification models coupled with a series of pre- and post-processing scripts.
2 code implementations • 22 Mar 2017 • Afonso Menegola, Michel Fornaciali, Ramon Pires, Flávia Vasques Bittencourt, Sandra Avila, Eduardo Valle
Knowledge transfer impacts the performance of deep learning, the state of the art for image classification tasks, including automated melanoma screening.
no code implementations • 5 Sep 2016 • Afonso Menegola, Michel Fornaciali, Ramon Pires, Sandra Avila, Eduardo Valle
Deep learning is the current bet for image classification.