Search Results for author: Carlos Rodriguez-Penagos

Found 5 papers, 1 paper with code

Overview of BioASQ 2020: The eighth BioASQ challenge on Large-Scale Biomedical Semantic Indexing and Question Answering

no code implementations • 28 Jun 2021 • Anastasios Nentidis, Anastasia Krithara, Konstantinos Bougiatiotis, Martin Krallinger, Carlos Rodriguez-Penagos, Marta Villegas, Georgios Paliouras

In this paper, we present an overview of the eighth edition of the BioASQ challenge, which ran as a lab in the Conference and Labs of the Evaluation Forum (CLEF) 2020.

Question Answering

Are Multilingual Models the Best Choice for Moderately Under-resourced Languages? A Comprehensive Assessment for Catalan

no code implementations • Findings (ACL) 2021 • Jordi Armengol-Estapé, Casimiro Pio Carrino, Carlos Rodriguez-Penagos, Ona de Gibert Bonet, Carme Armentano-Oller, Aitor Gonzalez-Agirre, Maite Melero, Marta Villegas

For this, we: (1) build a clean, high-quality textual Catalan corpus (CaText), the largest to date (but only a fraction of the usual size of the previous work in monolingual language models), (2) train a Transformer-based language model for Catalan (BERTa), and (3) devise a thorough evaluation in a diversity of settings, comprising a complete array of downstream tasks, namely, Part of Speech Tagging, Named Entity Recognition and Classification, Text Classification, Question Answering, and Semantic Textual Similarity, with most of the corresponding datasets being created ex novo.

Language Modelling • Named Entity Recognition • +7 more

The Catalan Language CLUB

no code implementations • 3 Dec 2021 • Carlos Rodriguez-Penagos, Carme Armentano-Oller, Marta Villegas, Maite Melero, Aitor Gonzalez, Ona de Gibert Bonet, Casimiro Carrino Pio

The Catalan Language Understanding Benchmark (CLUB) encompasses various datasets representative of different NLU tasks that enable accurate evaluations of language models, following the General Language Understanding Evaluation (GLUE) example.

ParlamentParla: A Speech Corpus of Catalan Parliamentary Sessions

no code implementations • ParlaCLARIN (LREC) 2022 • Baybars Kulebi, Carme Armentano-Oller, Carlos Rodriguez-Penagos, Marta Villegas

This corpus has already been used to train state-of-the-art ASR systems and proof-of-concept text-to-speech (TTS) models.

Automatic Speech Recognition (ASR) • +1 more

MarIA: Spanish Language Models

2 code implementations • 15 Jul 2021 • Asier Gutiérrez-Fandiño, Jordi Armengol-Estapé, Marc Pàmies, Joan Llop-Palao, Joaquín Silveira-Ocampo, Casimiro Pio Carrino, Aitor Gonzalez-Agirre, Carme Armentano-Oller, Carlos Rodriguez-Penagos, Marta Villegas

This work presents MarIA, a family of Spanish language models and associated resources made available to the industry and the research community.

Extractive Question-Answering • Question Answering
