1 code implementation • LREC 2016 • Marta Villegas, Maite Melero, Núria Bel, Jorge Gracia
The experiments presented here exploit the properties of the Apertium RDF Graph, principally cycle density and node degree, to automatically generate new translation relations between words, thereby enriching existing bilingual dictionaries with new entries.
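The graph-based enrichment idea can be sketched roughly as follows. This is a hedged illustration, not the actual Apertium RDF pipeline: a toy translation graph in which an unconnected cross-lingual word pair is proposed as a new translation when it shares enough neighbours, i.e. closes several short cycles, a crude proxy for cycle density. All words, languages, and the threshold below are made up for illustration.

```python
# Illustrative sketch (not the Apertium RDF method): propose new translation
# pairs between words that close short cycles in a translation graph.
from itertools import combinations

# Hypothetical toy graph: undirected translation edges between (lang, word) nodes.
edges = [
    (("en", "house"), ("es", "casa")),
    (("es", "casa"), ("ca", "casa")),
    (("ca", "casa"), ("fr", "maison")),
    (("en", "house"), ("ca", "casa")),
    (("en", "house"), ("pt", "casa")),
    (("pt", "casa"), ("fr", "maison")),
]

adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

def inferred_pairs(min_shared=2):
    """Propose unconnected cross-lingual node pairs sharing at least
    min_shared neighbours (each shared neighbour closes a length-2 path,
    a crude stand-in for the cycle-density criterion)."""
    out = []
    for u, v in combinations(adj, 2):
        if u[0] != v[0] and v not in adj[u] and len(adj[u] & adj[v]) >= min_shared:
            out.append((u, v))
    return out

print(inferred_pairs())
```

With this toy graph, ("en", "house") and ("fr", "maison") are never directly linked but share two intermediate translations, so the sketch proposes them as a new bilingual entry.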
2 code implementations • LREC 2022 • Jordi Armengol-Estapé, Ona de Gibert Bonet, Maite Melero
Generative Pre-trained Transformers (GPTs) have recently been scaled to unprecedented sizes in the history of machine learning.
1 code implementation • WMT (EMNLP) 2021 • Ksenia Kharitonova, Ona de Gibert Bonet, Jordi Armengol-Estapé, Mar Rodriguez i Alvarez, Maite Melero
This paper describes the participation of the BSC team in the WMT2021 Multilingual Low-Resource Translation for Indo-European Languages Shared Task.
1 code implementation • 14 Feb 2022 • Ona de Gibert, Ksenia Kharitonova, Blanca Calvo Figueras, Jordi Armengol-Estapé, Maite Melero
In this work, we introduce sequence-to-sequence language resources for Catalan, a moderately under-resourced language, towards two tasks, namely: Summarization and Machine Translation (MT).
no code implementations • 19 Mar 2018 • Marta R. Costa-jussà, Noe Casas, Maite Melero
This paper describes the methodology followed to build a neural machine translation system in the biomedical domain for the English-Catalan language pair.
no code implementations • LREC 2014 • Georg Rehm, Hans Uszkoreit, Sophia Ananiadou, Núria Bel, Audronė Bielevičienė, Lars Borin, António Branco, Gerhard Budin, Nicoletta Calzolari, Walter Daelemans, Radovan Garabík, Marko Grobelnik, Carmen García-Mateo, Josef van Genabith, Jan Hajič, Inma Hernáez, John Judge, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Joseph Mariani, John McNaught, Maite Melero, Monica Monachini, Asunción Moreno, Jan Odijk, Maciej Ogrodniczuk, Piotr Pęzik, Stelios Piperidis, Adam Przepiórkowski, Eiríkur Rögnvaldsson, Michael Rosner, Bolette Pedersen, Inguna Skadiņa, Koenraad De Smedt, Marko Tadić, Paul Thompson, Dan Tufiş, Tamás Váradi, Andrejs Vasiļjevs, Kadri Vider, Jolanta Zabarskaite
This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014; we describe its impact at the regional, national and international levels, mainly with regard to politics and the funding situation for LT topics.
no code implementations • LREC 2014 • Marta Villegas, Maite Melero, Núria Bel
The proliferation of different metadata schemas and models poses serious problems of interoperability.
no code implementations • LREC 2012 • Eleftherios Avramidis, Marta R. Costa-jussà, Christian Federmann, Josef van Genabith, Maite Melero, Pavel Pecina
This corpus aims to serve as a basic resource for further research on whether hybrid machine translation algorithms and system combination techniques can benefit from additional (linguistically motivated, decoding, and runtime) information provided by the different systems involved.
no code implementations • LREC 2012 • Maite Melero, Marta R. Costa-jussà, Judith Domingo, Montse Marquina, Martí Quixal
We present work in progress aimed at building tools for the normalization of User-Generated Content (UGC).
no code implementations • LREC 2012 • Christian Federmann, Eleftherios Avramidis, Marta R. Costa-jussà, Josef van Genabith, Maite Melero, Pavel Pecina
We describe the Shared Task on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid Machine Translation (ML4HMT) which aims to foster research on improved system combination approaches for machine translation (MT).
no code implementations • LREC 2020 • Thierry Etchegoyhen, Borja Anza Porras, Andoni Azpeitia, Eva Martínez Garcia, José Luis Fonseca, Patricia Fonseca, Paulo Vale, Jane Dunne, Federico Gaspari, Teresa Lynn, Helen McHugh, Andy Way, Victoria Arranz, Khalid Choukri, Hervé Pusset, Alexandre Sicard, Rui Neto, Maite Melero, David Perez, António Branco, Ruben Branco, Luís Gomes
We describe the European Language Resource Infrastructure (ELRI), a decentralised network to help collect, prepare and share language resources.
no code implementations • Findings (ACL) 2021 • Jordi Armengol-Estapé, Casimiro Pio Carrino, Carlos Rodriguez-Penagos, Ona de Gibert Bonet, Carme Armentano-Oller, Aitor Gonzalez-Agirre, Maite Melero, Marta Villegas
For this, we: (1) build a clean, high-quality textual Catalan corpus (CaText), the largest to date (though only a fraction of the size typically used in previous monolingual language modelling work), (2) train a Transformer-based language model for Catalan (BERTa), and (3) devise a thorough evaluation in a diversity of settings, comprising a complete array of downstream tasks, namely Part-of-Speech Tagging, Named Entity Recognition and Classification, Text Classification, Question Answering, and Semantic Textual Similarity, with most of the corresponding datasets created ex novo.
no code implementations • EAMT 2020 • Ēriks Ajausks, Victoria Arranz, Laurent Bié, Aleix Cerdà-i-Cucó, Khalid Choukri, Montse Cuadros, Hans Degroote, Amando Estela, Thierry Etchegoyhen, Mercedes García-Martínez, Aitor García-Pablos, Manuel Herranz, Alejandro Kohan, Maite Melero, Mike Rosner, Roberts Rozis, Patrick Paroubek, Artūrs Vasiļevskis, Pierre Zweigenbaum
We describe the MAPA project, funded under the Connecting Europe Facility programme, whose goal is the development of an open-source de-identification toolkit for all official European Union languages.
no code implementations • EAMT 2020 • Laurent Bié, Aleix Cerdà-i-Cucó, Hans Degroote, Amando Estela, Mercedes García-Martínez, Manuel Herranz, Alejandro Kohan, Maite Melero, Tony O’Dowd, Sinéad O’Gorman, Mārcis Pinnis, Roberts Rozis, Riccardo Superbo, Artūrs Vasiļevskis
The Neural Translation for the European Union (NTEU) project aims to build a neural engine farm with all European official language combinations for eTranslation, without the necessity to use a high-resourced language as a pivot.
no code implementations • 3 Dec 2021 • Carlos Rodriguez-Penagos, Carme Armentano-Oller, Marta Villegas, Maite Melero, Aitor Gonzalez, Ona de Gibert Bonet, Casimiro Pio Carrino
The Catalan Language Understanding Benchmark (CLUB) encompasses various datasets representative of different NLU tasks that enable accurate evaluations of language models, following the General Language Understanding Evaluation (GLUE) example.
no code implementations • LREC 2022 • Ona de Gibert Bonet, Iakes Goenaga, Jordi Armengol-Estapé, Olatz Perez-de-Viñaspre, Carla Parra Escartín, Marina Sanchez, Mārcis Pinnis, Gorka Labaka, Maite Melero
In this work, we present the work carried out in the MT4All CEF project and the resources it has generated by leveraging recent research in the field of unsupervised learning.
no code implementations • LREC 2022 • Ona de Gibert Bonet, Aitor García Pablos, Montse Cuadros, Maite Melero
In order to assess the quality of the generated datasets, we have used them to fine-tune a battery of entity-detection models, using as foundation different pre-trained language models: one multilingual, two general-domain monolingual and one in-domain monolingual.
no code implementations • SIGUL (LREC) 2022 • Ona de Gibert Bonet, Ksenia Kharitonova, Blanca Calvo Figueras, Jordi Armengol-Estapé, Maite Melero
In this work, we make the case for quality over quantity when training an MT system for a medium-to-low-resource language pair, namely Catalan-English.