1 code implementation • SemEval (NAACL) 2022 • Harish Tayyar Madabushi, Edward Gow-Smith, Marcos Garcia, Carolina Scarton, Marco Idiart, Aline Villavicencio
This paper presents the shared task on Multilingual Idiomaticity Detection and Sentence Embedding, which consists of two subtasks: (a) a binary classification task aimed at identifying whether a sentence contains an idiomatic expression, and (b) a task based on semantic text similarity which requires the model to adequately represent potentially idiomatic expressions in context.
1 code implementation • ACL 2021 • Marcos Garcia, Tiago Kramer Vieira, Carolina Scarton, Marco Idiart, Aline Villavicencio
This paper presents the Noun Compound Type and Token Idiomaticity (NCTTI) dataset, with human annotations for 280 noun compounds in English and 180 in Portuguese at both type and token level.
1 code implementation • EACL 2021 • Marcos Garcia, Tiago Kramer Vieira, Carolina Scarton, Marco Idiart, Aline Villavicencio
Contextualised word representation models have been successfully used for capturing different word usages and they may be an attractive alternative for representing idiomaticity in language.
no code implementations • 29 Mar 2021 • Gustavo Soroka, Marco Idiart
Brain oscillations are believed to be involved in the different operations necessary to manipulate information during working memory tasks.
no code implementations • CL 2019 • Silvio Cordeiro, Aline Villavicencio, Marco Idiart, Carlos Ramisch
General crosslingual analyses reveal the impact of morphological variation and corpus size on the ability of the model to predict compositionality, and of a uniform combination of the components for best results.
no code implementations • NAACL 2018 • Felipe Paula, Rodrigo Wilkens, Marco Idiart, Aline Villavicencio
Semantic Verbal Fluency tests have been used in the detection of certain clinical conditions, such as dementia.
1 code implementation • 3 Jun 2016 • Alexandre Salle, Marco Idiart, Aline Villavicencio
The effectiveness of both modifications is shown using word similarity and analogy tasks.
1 code implementation • ACL 2016 • Alexandre Salle, Marco Idiart, Aline Villavicencio
In this paper, we propose LexVec, a new method for generating distributed word representations that uses low-rank, weighted factorization of the Positive Point-wise Mutual Information matrix via stochastic gradient descent, employing a weighting scheme that assigns heavier penalties for errors on frequent co-occurrences while still accounting for negative co-occurrence.
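As a rough illustration of the matrix that the abstract above says is factorized, the sketch below computes Positive Point-wise Mutual Information from a small co-occurrence count matrix. It covers only the PPMI step, not the weighted low-rank factorization or the stochastic gradient descent training, and it is not the authors' implementation. The function name `ppmi_matrix` and the toy counts are assumptions made for this example.

```python
import math


def ppmi_matrix(counts):
    """Return the PPMI matrix for a dense word-by-context count matrix.

    Hypothetical helper for illustration; not part of LexVec itself.
    """
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]         # word marginals
    col_sums = [sum(col) for col in zip(*counts)]   # context marginals

    ppmi = []
    for i, row in enumerate(counts):
        out = []
        for j, n in enumerate(row):
            if n == 0:
                # Unseen pair: PMI is minus infinity, clipped to 0.
                out.append(0.0)
            else:
                # PMI = log( P(w, c) / (P(w) * P(c)) )
                pmi = math.log(n * total / (row_sums[i] * col_sums[j]))
                out.append(max(0.0, pmi))
        ppmi.append(out)
    return ppmi


# Toy co-occurrence counts (made up): rows are words, columns are contexts.
toy_counts = [
    [4, 0, 1],
    [0, 3, 1],
    [1, 1, 0],
]
ppmi = ppmi_matrix(toy_counts)
```

Clipping at zero is what makes the measure "positive": pairs that co-occur less often than chance, or never, all map to 0 in the resulting matrix.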
no code implementations • LREC 2016 • Rodrigo Wilkens, Marco Idiart, Aline Villavicencio
Focusing on compound nouns (CN), we then verify in a longitudinal study if there are differences in the distribution and compositionality of CNs in child-directed and child-produced sentences across ages.
no code implementations • LREC 2014 • Muntsa Padró, Marco Idiart, Aline Villavicencio, Carlos Ramisch
Distributional thesauri have been applied for a variety of tasks involving semantic relatedness.
no code implementations • LREC 2012 • Aline Villavicencio, Beracah Yankama, Marco Idiart, Robert Berwick
This paper describes such an initiative for combining information from various sources to extend the annotation of the English CHILDES corpora with linguistic, psycholinguistic and distributional information, along with an example illustrating an application of this approach to the extraction of verb alternation information.