Enhancing the PARSEME Turkish Corpus of Verbal Multiword Expressions

no code implementations LREC (MWE) 2022 Yagmur Ozturk, Najet Hadj Mohamed, Adam Lion-Bouton, Agata Savary

We provide an overview of the problems observed in the morphosyntactic annotation of the Turkish PARSEME corpus.

Polish corpus of verbal multiword expressions

1 code implementation COLING (MWE) 2020 Agata Savary, Jakub Waszczuk

This paper describes a manually annotated corpus of verbal multi-word expressions in Polish.

Evaluating Diversity of Multiword Expressions in Annotated Text

no code implementations COLING 2022 Adam Lion-Bouton, Yagmur Ozturk, Agata Savary, Jean-Yves Antoine

We apply the validated measures to annotations in 14 languages produced by systems during the PARSEME shared task on automatic identification of multiword expressions and on the gold versions of the corpora.

Diversity Lemmatization

Seen2Unseen at PARSEME Shared Task 2020: All Roads do not Lead to Unseen Verb-Noun VMWEs

no code implementations COLING (MWE) 2020 Caroline Pasquer, Agata Savary, Carlos Ramisch, Jean-Yves Antoine

We describe the Seen2Unseen system that participated in edition 1.2 of the PARSEME shared task on automatic identification of verbal multiword expressions (VMWEs).


Formalising lexical and syntactic diversity for data sampling in French

no code implementations 14 Jan 2025 Louis Estève, Manon Scholivet, Agata Savary

Diversity is an important property of datasets and sampling data for diversity is useful in dataset creation.


Verbal Multiword Expression Identification: Do We Need a Sledgehammer to Crack a Nut?

no code implementations COLING 2020 Caroline Pasquer, Agata Savary, Carlos Ramisch, Jean-Yves Antoine

Automatic identification of multiword expressions (MWEs), like "to cut corners" (to do an incomplete job), is a prerequisite for semantically-oriented downstream applications.

To Be or Not To Be a Verbal Multiword Expression: A Quest for Discriminating Features

no code implementations 22 Jul 2020 Caroline Pasquer, Agata Savary, Jean-Yves Antoine, Carlos Ramisch, Nicolas Labroche, Arnaud Giacometti

We use this fact to determine the optimal set of features which could be used in a supervised classification setting to solve a subproblem of VMWE identification: the identification of occurrences of previously seen VMWEs.

feature selection

DOING@DEFT : cascade de CRF pour l'annotation d'entités cliniques imbriquées (DOING@DEFT: cascade of CRF for the annotation of nested clinical entities)

no code implementations JEPTALNRECITAL 2020 Anne-Lyse Minard, Andréane Roques, Nicolas Hiot, Mirian Halfeld Ferrari Alves, Agata Savary

This paper presents the system developed by the DOING team for the DEFT 2020 evaluation campaign on semantic similarity and fine-grained information extraction.

Without lexicons, multiword expression identification will never fly: A position statement

no code implementations WS 2019 Agata Savary, Silvio Cordeiro, Carlos Ramisch

Because most multiword expressions (MWEs), especially verbal ones, are semantically non-compositional, their automatic identification in running text is a prerequisite for semantically-oriented downstream applications.


VarIDE at PARSEME Shared Task 2018: Are Variants Really as Alike as Two Peas in a Pod?

no code implementations COLING 2018 Caroline Pasquer, Carlos Ramisch, Agata Savary, Jean-Yves Antoine

We describe the VarIDE system (standing for Variant IDEntification), which participated in edition 1.1 of the PARSEME shared task on automatic identification of verbal multiword expressions (VMWEs).

If you've seen some, you've seen them all: Identifying variants of multiword expressions

no code implementations COLING 2018 Caroline Pasquer, Agata Savary, Carlos Ramisch, Jean-Yves Antoine

Multiword expressions, especially verbal ones (VMWEs), show idiosyncratic variability, which is challenging for NLP applications, hence the need for VMWE identification.

General Classification

Towards a Variability Measure for Multiword Expressions

no code implementations NAACL 2018 Caroline Pasquer, Agata Savary, Jean-Yves Antoine, Carlos Ramisch

One of the most outstanding properties of multiword expressions (MWEs), especially verbal ones (VMWEs), important both in theoretical models and applications, is their idiosyncratic variability.

Annotation d'expressions polylexicales verbales en français (Annotation of verbal multiword expressions in French)

no code implementations JEPTALNRECITAL 2017 Marie Candito, Mathieu Constant, Carlos Ramisch, Agata Savary, Yannick Parmentier, Caroline Pasquer, Jean-Yves Antoine

We describe the French part of the data produced within the multilingual PARSEME campaign on the identification of verbal multiword expressions (Savary et al., 2017).

Projecting Multiword Expression Resources on a Polish Treebank

no code implementations WS 2017 Agata Savary, Jakub Waszczuk

Multiword expressions (MWEs) are linguistic objects containing two or more words and showing idiosyncratic behavior at different levels.

Promoting multiword expressions in A* TAG parsing

no code implementations COLING 2016 Jakub Waszczuk, Agata Savary, Yannick Parmentier

Multiword expressions (MWEs) are pervasive in natural languages and often have both idiomatic and compositional readings, which leads to high syntactic ambiguity.


PARSEME Survey on MWE Resources

no code implementations LREC 2016 Gyri Smørdal Losnegaard, Federico Sangati, Carla Parra Escartín, Agata Savary, Sascha Bargmann, Johanna Monti

We also discuss the problems we have detected upon examination of the data as well as possible ways of enhancing the survey.


Towards Lexical Encoding of Multi-Word Expressions in Spanish Dialects

no code implementations LREC 2016 Diana Bogantes, Eric Rodríguez, Alejandro Arauco, Alejandro Rodríguez, Agata Savary

This paper describes a pilot study in lexical encoding of multi-word expressions (MWEs) in 4 Latin American dialects of Spanish: Costa Rican, Colombian, Mexican and Peruvian.

Polish Coreference Corpus in Numbers

no code implementations LREC 2014 Maciej Ogrodniczuk, Mateusz Kopeć, Agata Savary

Correlation between cluster and mention count within a text is investigated, with short characteristics of outlier cases.

Clustering coreference-resolution

