no code implementations • LREC 2022 • Nicolas Zampieri, Carlos Ramisch, Irina Illina, Dominique Fohr
In this article, we present joint experiments on these two related tasks on English Twitter data: first we focus on the MWE identification task, and then we observe the influence of MWE-based features on the HSD task.
no code implementations • LREC (MWE) 2022 • Fernando Zagatti, Paulo Augusto de Lima Medeiros, Esther da Cunha Soares, Lucas Nildaimon dos Santos Silva, Carlos Ramisch, Livy Real
One of the contributions of our work is the adaptation of the MWE extraction pipeline from the mwetoolkit, allowing its usage in Python development environments and integration in larger pipelines.
no code implementations • COLING (MWE) 2020 • Caroline Pasquer, Agata Savary, Carlos Ramisch, Jean-Yves Antoine
We describe the Seen2Unseen system that participated in edition 1.2 of the PARSEME shared task on automatic identification of verbal multiword expressions (VMWEs).
no code implementations • COLING (MWE) 2020 • Carlos Ramisch, Agata Savary, Bruno Guillaume, Jakub Waszczuk, Marie Candito, Ashwini Vaidya, Verginica Barbu Mititelu, Archna Bhatia, Uxoa Iñurrieta, Voula Giouli, Tunga Güngör, Menghan Jiang, Timm Lichte, Chaya Liebeskind, Johanna Monti, Renata Ramisch, Sara Stymne, Abigail Walsh, Hongzhi Xu
We present edition 1.2 of the PARSEME shared task on identification of verbal multiword expressions (VMWEs).
no code implementations • JEP/TALN/RECITAL 2022 • Nicolas Zampieri, Carlos Ramisch, Irina Illina, Dominique Fohr
Identifying multiword expressions (MWEs) in tweets is a difficult task because of the complex linguistic nature of MWEs combined with the use of non-standard language.
1 code implementation • ACL (CASE) 2021 • Léo Bouscarrat, Antoine Bonnefoy, Cécile Capponi, Carlos Ramisch
This paper explains our participation in task 1 of the CASE 2021 shared task.
no code implementations • COLING 2020 • Caroline Pasquer, Agata Savary, Carlos Ramisch, Jean-Yves Antoine
Automatic identification of multiword expressions (MWEs), like "to cut corners" (to do an incomplete job), is a pre-requisite for semantically-oriented downstream applications.
no code implementations • COLING 2020 • Cindy Aloui, Carlos Ramisch, Alexis Nasr, Lucie Barque
Contextualised embeddings such as BERT have become de facto state-of-the-art references in many NLP applications, thanks to their impressive performance.
no code implementations • 22 Jul 2020 • Caroline Pasquer, Agata Savary, Jean-Yves Antoine, Carlos Ramisch, Nicolas Labroche, Arnaud Giacometti
We use this fact to determine the optimal set of features which could be used in a supervised classification setting to solve a subproblem of VMWE identification: the identification of occurrences of previously seen VMWEs.
2 code implementations • LREC 2020 • Léo Bouscarrat, Antoine Bonnefoy, Cécile Capponi, Carlos Ramisch
Translating biomedical ontologies is an important challenge, but doing it manually requires much time and money.
no code implementations • WS 2019 • Agata Savary, Silvio Cordeiro, Carlos Ramisch
Because most multiword expressions (MWEs), especially verbal ones, are semantically non-compositional, their automatic identification in running text is a prerequisite for semantically-oriented downstream applications.
no code implementations • WS 2019 • Nicolas Zampieri, Carlos Ramisch, Geraldine Damnati
Recent initiatives such as the PARSEME shared task allowed the rapid development of MWE identification systems.
no code implementations • NAACL 2019 • Manon Scholivet, Franck Dary, Alexis Nasr, Benoit Favre, Carlos Ramisch
The existence of universal models to describe the syntax of languages has been debated for decades.
no code implementations • CL 2019 • Silvio Cordeiro, Aline Villavicencio, Marco Idiart, Carlos Ramisch
General crosslingual analyses reveal the impact of morphological variation and corpus size in the ability of the model to predict compositionality, and of a uniform combination of the components for best results.
1 code implementation • COLING 2018 • Nicolas Zampieri, Manon Scholivet, Carlos Ramisch, Benoit Favre
This paper describes the Veyn system, submitted to the closed track of the PARSEME Shared Task 2018 on automatic identification of verbal multiword expressions (VMWEs).
no code implementations • COLING 2018 • Carlos Ramisch, Silvio Ricardo Cordeiro, Agata Savary, Veronika Vincze, Verginica Barbu Mititelu, Archna Bhatia, Maja Buljan, Marie Candito, Polona Gantar, Voula Giouli, Tunga Güngör, Abdelati Hawwari, Uxoa Iñurrieta, Jolanta Kovalevskaitė, Simon Krek, Timm Lichte, Chaya Liebeskind, Johanna Monti, Carla Parra Escartín, Behrang Qasemizadeh, Renata Ramisch, Nathan Schneider, Ivelina Stoyanova, Ashwini Vaidya, Abigail Walsh
Corpora were created for 20 languages, which are also briefly discussed.
no code implementations • COLING 2018 • Caroline Pasquer, Carlos Ramisch, Agata Savary, Jean-Yves Antoine
We describe the VarIDE system (standing for Variant IDEntification), which participated in edition 1.1 of the PARSEME shared task on automatic identification of verbal multiword expressions (VMWEs).
no code implementations • COLING 2018 • Caroline Pasquer, Agata Savary, Carlos Ramisch, Jean-Yves Antoine
Multiword expressions, especially verbal ones (VMWEs), show idiosyncratic variability, which is challenging for NLP applications, hence the need for VMWE identification.
no code implementations • NAACL 2018 • Caroline Pasquer, Agata Savary, Jean-Yves Antoine, Carlos Ramisch
One of the most outstanding properties of multiword expressions (MWEs), especially verbal ones (VMWEs), important both in theoretical models and applications, is their idiosyncratic variability.
no code implementations • CL 2017 • Mathieu Constant, Gülşen Eryiğit, Johanna Monti, Lonneke van der Plas, Carlos Ramisch, Michael Rosner, Amalia Todirascu
The structure of linguistic processing that depends on the clear distinction between words and phrases has to be re-thought to accommodate MWEs.
no code implementations • JEP/TALN/RECITAL 2017 • Marie Candito, Mathieu Constant, Carlos Ramisch, Agata Savary, Yannick Parmentier, Caroline Pasquer, Jean-Yves Antoine
We describe the French part of the data produced for the multilingual PARSEME campaign on the identification of verbal multiword expressions (Savary et al., 2017).
no code implementations • WS 2017 • Agata Savary, Carlos Ramisch, Silvio Cordeiro, Federico Sangati, Veronika Vincze, Behrang Qasemizadeh, Marie Candito, Fabienne Cap, Voula Giouli, Ivelina Stoyanova, Antoine Doucet
This paper presents the corpus annotation methodology and outcome, the shared task organisation and the results of the participating systems.
no code implementations • WS 2017 • Natalie Vargas, Carlos Ramisch, Helena Caseli
We propose a method for joint unsupervised discovery of multiword expressions (MWEs) and their translations from parallel corpora.
no code implementations • WS 2017 • Manon Scholivet, Carlos Ramisch
We present a simple and efficient tagger capable of identifying highly ambiguous multiword expressions (MWEs) in French texts.
no code implementations • LREC 2016 • Carlos Ramisch, Alexis Nasr, André Valli, José Deulofeu
We introduce DeQue, a lexicon covering French complex prepositions (CPRE) like "à partir de" (from) and complex conjunctions (CCONJ) like "bien que" (although).
no code implementations • LREC 2016 • Silvio Cordeiro, Carlos Ramisch, Aline Villavicencio
This paper presents mwetoolkit+sem: an extension of the mwetoolkit that estimates semantic compositionality scores for multiword expressions (MWEs) based on word embeddings.
no code implementations • LREC 2014 • Muntsa Padró, Marco Idiart, Aline Villavicencio, Carlos Ramisch
Distributional thesauri have been applied for a variety of tasks involving semantic relatedness.
no code implementations • LREC 2014 • Bruno Laranjeira, Viviane Moreira, Aline Villavicencio, Carlos Ramisch, Maria José Finatto
Comparable corpora have been used as an alternative to parallel corpora as resources for computational tasks that involve domain-specific natural language processing.