Search Results for author: Asier Gutiérrez-Fandiño

Found 12 papers, 8 papers with code

Pretrained Biomedical Language Models for Clinical NLP in Spanish

1 code implementation BioNLP (ACL) 2022 Casimiro Pio Carrino, Joan Llop, Marc Pàmies, Asier Gutiérrez-Fandiño, Jordi Armengol-Estapé, Joaquín Silveira-Ocampo, Alfonso Valencia, Aitor Gonzalez-Agirre, Marta Villegas

This work presents the first large-scale biomedical Spanish language models trained from scratch, using large biomedical corpora consisting of a total of 1.1B tokens and an EHR corpus of 95M tokens.

NER

esCorpius: A Massive Spanish Crawling Corpus

no code implementations 30 Jun 2022 Asier Gutiérrez-Fandiño, David Pérez-Fernández, Jordi Armengol-Estapé, David Griol, Zoraida Callejas

However, the results in Spanish present important shortcomings, as they are either too small in comparison with other languages, or present a low quality derived from sub-optimal cleaning and deduplication.

Language Modelling

FinEAS: Financial Embedding Analysis of Sentiment

1 code implementation 31 Oct 2021 Asier Gutiérrez-Fandiño, Miquel Noguer i Alonso, Petter Kolm, Jordi Armengol-Estapé

We introduce a new language representation model in finance called Financial Embedding Analysis of Sentiment (FinEAS).

Sentence, Sentence Embeddings, +4 more

Spanish Legalese Language Model and Corpora

1 code implementation 23 Oct 2021 Asier Gutiérrez-Fandiño, Jordi Armengol-Estapé, Aitor Gonzalez-Agirre, Marta Villegas

There are many Language Models for the English language according to its worldwide relevance.

Language Modelling

Spanish Biomedical Crawled Corpus: A Large, Diverse Dataset for Spanish Biomedical Language Models

no code implementations 16 Sep 2021 Casimiro Pio Carrino, Jordi Armengol-Estapé, Ona de Gibert Bonet, Asier Gutiérrez-Fandiño, Aitor Gonzalez-Agirre, Martin Krallinger, Marta Villegas

We introduce CoWeSe (the Corpus Web Salud Español), the largest Spanish biomedical corpus to date, consisting of 4.5GB (about 750M tokens) of clean plain text.

Characterizing and Measuring the Similarity of Neural Networks with Persistent Homology

1 code implementation NeurIPS 2021 David Pérez-Fernández, Asier Gutiérrez-Fandiño, Jordi Armengol-Estapé, Marta Villegas

Characterizing the structural properties of neural networks is crucial yet poorly understood, and there are no well-established similarity measures between networks.

Topological Data Analysis

A Vulnerability Study on Academic Collaboration Networks Based on Network Dynamics

1 code implementation 21 Dec 2020 Asier Gutiérrez-Fandiño, Jordi Armengol-Estapé, Marta Villegas

Email can be one of the most fruitful attack vectors of research institutions as they also contain access to all accounts and thus to all private information.

Cryptography and Security, Social and Information Networks
