no code implementations • TDLE (LREC) 2022 • Iria de-Dios-Flores, Carmen Magariños, Adina Ioana Vladu, John E. Ortega, José Ramom Pichel, Marcos García, Pablo Gamallo, Elisa Fernández Rei, Alberto Bugarín-Diz, Manuel González González, Senén Barro, Xosé Luis Regueira
The development of language technologies (LTs) such as machine translation, text analytics, and dialogue systems is essential in the current digital society, culture and economy.
no code implementations • 1 Dec 2023 • Pablo Gamallo
In this paper we propose a transparent, interpretable, and linguistically motivated strategy for encoding the contextual sense of words by modeling semantic compositionality.
no code implementations • 1 Apr 2023 • Miguel Cavadas, Pablo Gamallo
Automatic Authorship Attribution (AAA) is the result of applying tools and techniques from Digital Humanities to authorship attribution studies.
no code implementations • SEMEVAL 2020 • Pablo Gamallo
This article describes some unsupervised strategies submitted to SemEval 2020 Task 3, a task which consists of considering the effect of context to compute word similarity.
no code implementations • CL 2019 • Pablo Gamallo, Susana Sotelo, José Ramom Pichel, Mikel Artetxe
The contextualization of meaning is carried out by means of distributional composition within a structured vector space with syntactic dependencies, and the bilingual space is created by means of transfer rules and a bilingual dictionary.
no code implementations • WS 2019 • Pablo Gamallo, Marcos Garcia
This article describes a dependency-based strategy that uses compositional distributional semantics and cross-lingual word embeddings to translate multiword expressions (MWEs).
no code implementations • SEMEVAL 2019 • Sattam Almatarneh, Pablo Gamallo, Francisco J. Ribadas Pena
This article describes the strategy submitted by the CiTIUS-COLE team to SemEval 2019 Task 5, a binary classification task in which the system predicts whether a tweet in English or Spanish is hateful against women or immigrants.
1 code implementation • COLING 2018 • Jose Ramom Pichel Campos, Pablo Gamallo, Iñaki Alegria
In our approach, we used a perplexity-based measure to calculate language distance between all the historical periods of a specific language: European Portuguese.
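As a rough illustration of what a perplexity-based distance can look like, the sketch below trains a character trigram model with add-one smoothing on one text and measures the perplexity of another text under it. The n-gram order, the smoothing, the function names, and the symmetric averaging in `distance` are all assumptions made for this example; the snippet above does not specify the measure actually used in the paper.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Return one character n-gram per position, left-padded with spaces."""
    padded = " " * (n - 1) + text
    return [padded[i:i + n] for i in range(len(text))]


def train(text, n=3):
    """Count n-grams and their (n-1)-character contexts in a training text."""
    grams = char_ngrams(text, n)
    counts = Counter(grams)
    contexts = Counter(g[:-1] for g in grams)
    vocab_size = len(set(text)) + 1  # one extra slot for unseen characters
    return counts, contexts, vocab_size, n


def perplexity(model, text):
    """Perplexity of a non-empty text under an add-one smoothed model."""
    counts, contexts, vocab_size, n = model
    grams = char_ngrams(text, n)
    log_sum = 0.0
    for g in grams:
        prob = (counts[g] + 1) / (contexts[g[:-1]] + vocab_size)
        log_sum += math.log(prob)
    return math.exp(-log_sum / len(grams))


def distance(text_a, text_b, n=3):
    """Hypothetical symmetric distance: mean of the two cross-perplexities."""
    pp_ab = perplexity(train(text_a, n), text_b)
    pp_ba = perplexity(train(text_b, n), text_a)
    return (pp_ab + pp_ba) / 2


if __name__ == "__main__":
    period_a = "o rato roeu a roupa do rei de roma " * 5
    period_b = "the quick brown fox jumps over the lazy dog " * 5
    print(distance(period_a, period_a), distance(period_a, period_b))
```

Under these assumptions, a text scores a lower perplexity against a model trained on similar material than against one trained on dissimilar material, so larger values indicate more distant varieties or periods.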
no code implementations • SEMEVAL 2018 • Pablo Gamallo
This article describes the unsupervised strategy submitted by the CitiusNLP team to SemEval 2018 Task 10, a task which consists of predicting whether a word is a discriminative attribute between two other words.
no code implementations • SEMEVAL 2017 • Pablo Gamallo
This article describes the distributional strategy submitted by the Citius team to the SemEval 2017 Task 2.
no code implementations • CONLL 2017 • Marcos Garcia, Pablo Gamallo
We also compare our system with a delexicalized parser for Romance languages, and take advantage of the harmonized annotation of Universal Dependencies to propose a language ranking based on the syntactic distance each variety has from Romance languages.
no code implementations • WS 2017 • Pablo Gamallo
The main contribution of the article is to expand their model to a fully compositional framework in which syntactic dependencies are put at the core of semantic composition.
no code implementations • WS 2017 • Pablo Gamallo, Martín Pereira-Fariña
This article describes a method to build semantic representations of composite expressions in a compositional way by using WordNet relations to represent the meaning of words.
no code implementations • WS 2017 • Pablo Gamallo, Jose Ramom Pichel, Iñaki Alegria
This article describes the system submitted by the Citius_Ixa_Imaxin team to the VarDial 2017 (DSL and GDI tasks).
no code implementations • EACL 2017 • Pablo Gamallo, Iván Rodríguez-Torres, Marcos Garcia
This article describes a semantic system which is based on distributional models obtained from a chronologically structured language resource, namely Google Books Syntactic Ngrams. The models were created using dependency-based contexts and a strategy for reducing the vector space, which consists of selecting the most informative and relevant word contexts.
no code implementations • WS 2016 • Pablo Gamallo, Iñaki Alegria, José Ramom Pichel, Manex Agirrezabal
This article describes the systems submitted by the Citius_Ixa_Imaxin team to the Discriminating Similar Languages Shared Task 2016.
no code implementations • LREC 2016 • Iñaki San Vicente, Iñaki Alegría, Cristina España-Bonet, Pablo Gamallo, Hugo Gonçalo Oliveira, Eva Martínez Garcia, Antonio Toral, Arkaitz Zubiaga, Nora Aranberri
We introduce TweetMT, a parallel corpus of tweets in four language pairs that combine five languages (Spanish from/to Basque, Catalan, Galician and Portuguese), all of which have an official status in the Iberian Peninsula.
no code implementations • LREC 2014 • Iñaki Alegria, Nora Aranberri, Pere Comas, Víctor Fresno, Pablo Gamallo, Lluis Padró, Iñaki San Vicente, Jordi Turmo, Arkaitz Zubiaga
It was created for Tweet-Norm, a tweet normalization workshop and shared task, and is the result of a joint annotation effort from different research groups.
no code implementations • LREC 2014 • Marcos Garcia, Pablo Gamallo
This paper presents three corpora with coreferential annotation of person entities for Portuguese, Galician and Spanish.