no code implementations • LREC 2022 • Ona de Gibert Bonet, Aitor García Pablos, Montse Cuadros, Maite Melero
In order to assess the quality of the generated datasets, we have used them to fine-tune a battery of entity-detection models, using as foundation different pre-trained language models: one multilingual, two general-domain monolingual and one in-domain monolingual.
no code implementations • LREC 2022 • Elena Zotova, Montse Cuadros, German Rigau
For instance, spans manually annotated with IDs from UMLS can be annotated with Semantic Types and Groups, and with their corresponding SNOMED CT and ICD-10 IDs.
no code implementations • EAMT 2020 • Ēriks Ajausks, Victoria Arranz, Laurent Bié, Aleix Cerdà-i-Cucó, Khalid Choukri, Montse Cuadros, Hans Degroote, Amando Estela, Thierry Etchegoyhen, Mercedes García-Martínez, Aitor García-Pablos, Manuel Herranz, Alejandro Kohan, Maite Melero, Mike Rosner, Roberts Rozis, Patrick Paroubek, Artūrs Vasiļevskis, Pierre Zweigenbaum
We describe the MAPA project, funded under the Connecting Europe Facility programme, whose goal is the development of an open-source de-identification toolkit for all official European Union languages.
no code implementations • LEGAL (LREC) 2022 • Victoria Arranz, Khalid Choukri, Montse Cuadros, Aitor García Pablos, Lucie Gianola, Cyril Grouin, Manuel Herranz, Patrick Paroubek, Pierre Zweigenbaum
This paper presents the outcomes of the MAPA project, a set of annotated corpora for 24 languages of the European Union and an open-source customisable toolkit able to detect and substitute sensitive information in text documents from any domain, using state-of-the-art, deep learning-based named entity recognition techniques.
no code implementations • LREC 2020 • Salvador Lima Lopez, Naiara Perez, Laura García-Sardiña, Montse Cuadros
HitzalMed is a web-framed tool that performs automatic detection of sensitive information in clinical texts using machine learning algorithms reported to be competitive for the task.
1 code implementation • LREC 2020 • Salvador Lima, Naiara Perez, Montse Cuadros, German Rigau
This paper introduces the first version of the NUBes corpus (Negation and Uncertainty annotations in Biomedical texts in Spanish).
no code implementations • LREC 2020 • Aitor García-Pablos, Naiara Perez, Montse Cuadros
Massive digital data processing provides a wide range of opportunities and benefits, but at the cost of endangering personal data privacy.
2 code implementations • WS 2018 • Ona de Gibert, Naiara Perez, Aitor García-Pablos, Montse Cuadros
Hate speech is commonly defined as any communication that disparages a target group of people based on some characteristic such as race, colour, ethnicity, gender, sexual orientation, nationality, religion, or other characteristic.
no code implementations • LREC 2018 • Naiara Perez, Montse Cuadros, German Rigau
This paper presents a novel prototype for biomedical term normalization of electronic health record excerpts with the Unified Medical Language System (UMLS) Metathesaurus.
1 code implementation • 22 May 2017 • Aitor García-Pablos, Montse Cuadros, German Rigau
With the increase of online customer opinions in specialised websites and social networks, the necessity of automatic systems to help to organise and classify customer reviews by domain-specific aspect/categories and sentiment polarity is more important than ever.
Aspect-Based Sentiment Analysis (ABSA)
no code implementations • EACL 2017 • Naiara Perez, Montse Cuadros
This paper describes a web-based application to design and answer exercises for language learning.
no code implementations • LREC 2016 • Aitor García Pablos, Montse Cuadros, German Rigau
In basic Sentiment Analysis systems, the sentiment polarity of the words is accounted for and weighted in different ways to provide a degree of positivity/negativity.
no code implementations • LREC 2012 • Montse Cuadros, Lluís Padró, German Rigau
Basically, the method applies a knowledge-based Word Sense Disambiguation algorithm to assign the most appropriate WordNet sense to large sets of topically related words acquired from the web, named TSWEB.