Search Results for author: Montse Cuadros

Found 14 papers, 3 papers with code

Spanish Datasets for Sensitive Entity Detection in the Legal Domain

no code implementations LREC 2022 Ona de Gibert Bonet, Aitor García Pablos, Montse Cuadros, Maite Melero

In order to assess the quality of the generated datasets, we have used them to fine-tune a battery of entity-detection models, using as foundation different pre-trained language models: one multilingual, two general-domain monolingual and one in-domain monolingual.

De-identification

ClinIDMap: Towards a Clinical IDs Mapping for Data Interoperability

no code implementations LREC 2022 Elena Zotova, Montse Cuadros, German Rigau

For instance, spans manually annotated with IDs from UMLS can be annotated with Semantic Types and Groups, and their corresponding SNOMED CT and ICD-10 IDs.

MAPA Project: Ready-to-Go Open-Source Datasets and Deep Learning Technology to Remove Identifying Information from Text Documents

no code implementations LEGAL (LREC) 2022 Victoria Arranz, Khalid Choukri, Montse Cuadros, Aitor García Pablos, Lucie Gianola, Cyril Grouin, Manuel Herranz, Patrick Paroubek, Pierre Zweigenbaum

This paper presents the outcomes of the MAPA project, a set of annotated corpora for 24 languages of the European Union and an open-source customisable toolkit able to detect and substitute sensitive information in text documents from any domain, using state-of-the-art, deep learning-based named entity recognition techniques.

De-identification named-entity-recognition +2

HitzalMed: Anonymisation of Clinical Text in Spanish

no code implementations LREC 2020 Salvador Lima Lopez, Naiara Perez, Laura García-Sardiña, Montse Cuadros

HitzalMed is a web-framed tool that performs automatic detection of sensitive information in clinical texts using machine learning algorithms reported to be competitive for the task.

BIG-bench Machine Learning

NUBES: A Corpus of Negation and Uncertainty in Spanish Clinical Texts

1 code implementation LREC 2020 Salvador Lima, Naiara Perez, Montse Cuadros, German Rigau

This paper introduces the first version of the NUBes corpus (Negation and Uncertainty annotations in Biomedical texts in Spanish).

Negation

Hate Speech Dataset from a White Supremacy Forum

2 code implementations WS 2018 Ona de Gibert, Naiara Perez, Aitor García-Pablos, Montse Cuadros

Hate speech is commonly defined as any communication that disparages a target group of people based on some characteristic such as race, colour, ethnicity, gender, sexual orientation, nationality, religion, or other characteristic.

Hate Speech Detection Sentence

Biomedical term normalization of EHRs with UMLS

no code implementations LREC 2018 Naiara Perez, Montse Cuadros, German Rigau

This paper presents a novel prototype for biomedical term normalization of electronic health record excerpts with the Unified Medical Language System (UMLS) Metathesaurus.

W2VLDA: Almost Unsupervised System for Aspect Based Sentiment Analysis

1 code implementation 22 May 2017 Aitor García-Pablos, Montse Cuadros, German Rigau

With the increase of online customer opinions in specialised websites and social networks, the necessity of automatic systems to help to organise and classify customer reviews by domain-specific aspect/categories and sentiment polarity is more important than ever.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +2

A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings

no code implementations LREC 2016 Aitor García Pablos, Montse Cuadros, German Rigau

In basic Sentiment Analysis systems, the sentiment polarity of the words is accounted for and weighted in different ways to provide a degree of positivity/negativity.

Sentiment Analysis Word Embeddings

Highlighting relevant concepts from Topic Signatures

no code implementations LREC 2012 Montse Cuadros, Lluís Padró, German Rigau

Basically, the method applies a knowledge-based Word Sense Disambiguation algorithm to assign the most appropriate WordNet sense to large sets of topically related words acquired from the web, named TSWEB.

Word Sense Disambiguation
