Search Results for author: Djamé Seddah

Found 28 papers, 3 papers with code

Ubiquitous Usage of a Broad Coverage French Corpus: Processing the Est Republicain corpus

no code implementations LREC 2012 Djamé Seddah, Marie Candito, Benoit Crabbé, Enrique Henestroza Anguiano

In this paper, we introduce a set of resources that we have derived from the Est Républicain corpus, a large, freely available collection of regional newspaper articles in French, totaling 150 million words.

Dependency Parsing, Lemmatization (+1)

Deep Syntax Annotation of the Sequoia French Treebank

no code implementations LREC 2014 C, Marie ito, Guy Perrier, Bruno Guillaume, Corentin Ribeyre, Kar{\"e}n Fort, Djam{\'e} Seddah, {\'E}ric de la Clergerie

We define a deep syntactic representation scheme for French, which abstracts away from surface syntactic variation and diathesis alternations, and describe the annotation of deep syntactic representations on top of the surface dependency trees of the Sequoia corpus.

Dependency Parsing

Accurate Deep Syntactic Parsing of Graphs: The Case of French

no code implementations LREC 2016 Corentin Ribeyre, Éric Villemonte de la Clergerie, Djamé Seddah

Parsing predicate-argument structures in a deep syntax framework requires graphs to be predicted.

ELMoLex: Connecting ELMo and Lexicon Features for Dependency Parsing

no code implementations CoNLL 2018 Ganesh Jawahar, Benjamin Muller, Amal Fethi, Louis Martin, Éric Villemonte de la Clergerie, Benoît Sagot, Djamé Seddah

We augment the deep Biaffine (BiAF) parser (Dozat and Manning, 2016) with novel features to perform competitively: we utilize an in-domain version of ELMo features (Peters et al., 2018), which provide context-dependent word representations, and disambiguated, embedded morphosyntactic features from lexicons (Sagot, 2018), which complement the existing feature set.

Dependency Parsing, Language Modelling
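To make the architecture concrete, here is a minimal PyTorch sketch of the feature-concatenation idea: contextual ELMo vectors and lexicon-derived morphosyntactic embeddings are concatenated before a biaffine arc scorer. The dimensions (ELMO_DIM, LEX_DIM, HID) are illustrative assumptions, and this is a reconstruction of the general biaffine recipe, not the authors' ELMoLex code:

```python
# Hedged sketch: biaffine arc scoring (Dozat & Manning, 2016) over
# contextual ELMo vectors concatenated with lexicon-based morphosyntactic
# feature embeddings. All sizes below are illustrative assumptions.
import torch
import torch.nn as nn

ELMO_DIM, LEX_DIM, HID = 1024, 64, 256  # assumed dimensions

class BiaffineArcScorer(nn.Module):
    def __init__(self):
        super().__init__()
        in_dim = ELMO_DIM + LEX_DIM                   # concatenated features
        self.head_mlp = nn.Sequential(nn.Linear(in_dim, HID), nn.ReLU())
        self.dep_mlp = nn.Sequential(nn.Linear(in_dim, HID), nn.ReLU())
        self.W = nn.Parameter(torch.randn(HID, HID) * 0.01)  # biaffine term
        self.head_bias = nn.Linear(HID, 1, bias=False)       # linear term

    def forward(self, elmo_vecs, lex_vecs):
        # elmo_vecs: (batch, seq, ELMO_DIM); lex_vecs: (batch, seq, LEX_DIM)
        x = torch.cat([elmo_vecs, lex_vecs], dim=-1)
        h, d = self.head_mlp(x), self.dep_mlp(x)
        # scores[b, i, j]: score of token j being the head of token i
        return d @ self.W @ h.transpose(1, 2) + self.head_bias(h).transpose(1, 2)

scorer = BiaffineArcScorer()
scores = scorer(torch.randn(2, 10, ELMO_DIM), torch.randn(2, 10, LEX_DIM))
print(scores.shape)  # torch.Size([2, 10, 10])
```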

What Does BERT Learn about the Structure of Language?

1 code implementation ACL 2019 Ganesh Jawahar, Benoît Sagot, Djamé Seddah

BERT is a recent language representation model that has performed surprisingly well across diverse language understanding benchmarks.
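The paper's question is which layers encode which kinds of structure. The following hedged sketch shows the general layer-wise probing recipe with Hugging Face transformers and scikit-learn; the two-sentence "dataset" and the probe task are placeholders, not the paper's phrasal and syntactic probes:

```python
# Layer-wise probing sketch: mean-pool each BERT layer's hidden states
# and fit a simple classifier per layer. Probe data here is a placeholder.
import torch
from transformers import BertModel, BertTokenizer
from sklearn.linear_model import LogisticRegression

tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def layer_reps(sentence):
    """One mean-pooled vector per layer (embeddings + 12 layers)."""
    with torch.no_grad():
        out = model(**tok(sentence, return_tensors="pt"))
    return [h.mean(dim=1).squeeze(0).numpy() for h in out.hidden_states]

# Toy probe: train a classifier on each layer's representation.
sentences = ["the cat sat on the mat", "colorless green ideas sleep furiously"]
labels = [0, 1]  # placeholder classes
reps = [layer_reps(s) for s in sentences]
for layer in range(len(reps[0])):
    X = [r[layer] for r in reps]
    probe = LogisticRegression(max_iter=1000).fit(X, labels)
    print(f"layer {layer}: train accuracy {probe.score(X, labels):.2f}")
```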

Contextualized Diachronic Word Representations

1 code implementation WS 2019 Ganesh Jawahar, Djamé Seddah

We devise a novel attentional model based on Bernoulli word embeddings that are conditioned on contextual extra-linguistic (social) features, such as the network, spatial, and socio-economic variables associated with Twitter users, as well as on topic-based features.

Diachronic Word Embeddings, Inductive Bias (+1)
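As a rough sketch only (the conditioning scheme, dimensions, and names below are assumptions, not the paper's exact model), one way to condition Bernoulli word embeddings on user-level social covariates is to shift each target word vector by a learned linear function of those features:

```python
# Hypothetical sketch of covariate-conditioned Bernoulli embeddings:
# p(word appears in context) = sigmoid(<rho_w + f(feats), sum alpha_ctx>).
import torch
import torch.nn as nn

VOCAB, DIM, N_FEATS = 10_000, 100, 8  # assumed sizes

class ConditionalBernoulliEmbeddings(nn.Module):
    def __init__(self):
        super().__init__()
        self.rho = nn.Embedding(VOCAB, DIM)    # target-word vectors
        self.alpha = nn.Embedding(VOCAB, DIM)  # context-word vectors
        self.cond = nn.Linear(N_FEATS, DIM)    # social features -> offset

    def forward(self, word, context, feats):
        # word: (batch,); context: (batch, window); feats: (batch, N_FEATS)
        rho_w = self.rho(word) + self.cond(feats)  # feature-shifted target
        ctx = self.alpha(context).sum(dim=1)       # summed context vectors
        return torch.sigmoid((rho_w * ctx).sum(dim=-1))  # Bernoulli prob.

model = ConditionalBernoulliEmbeddings()
p = model(torch.tensor([42]), torch.randint(VOCAB, (1, 4)), torch.randn(1, N_FEATS))
print(p)  # probability the word occurs in this (user-conditioned) context
```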

Enhancing BERT for Lexical Normalization

no code implementations WS 2019 Benjamin Muller, Benoît Sagot, Djamé Seddah

In this article, focusing on User Generated Content (UGC), we study the ability of BERT to perform lexical normalisation.

Language Modelling, Lexical Normalization
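For intuition only, here is a much simpler masked-LM baseline, not the paper's fine-tuned normalization model: replace a noisy token with [MASK] and let BERT's language-modelling head propose in-vocabulary candidates:

```python
# Masked-LM baseline sketch for lexical normalization (illustrative only).
import torch
from transformers import BertForMaskedLM, BertTokenizer

tok = BertTokenizer.from_pretrained("bert-base-uncased")
mlm = BertForMaskedLM.from_pretrained("bert-base-uncased")
mlm.eval()

def suggest(sentence_with_mask, k=5):
    """Top-k candidate replacements for the single [MASK] token."""
    enc = tok(sentence_with_mask, return_tensors="pt")
    mask_pos = (enc.input_ids == tok.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = mlm(**enc).logits[0, mask_pos]
    return tok.convert_ids_to_tokens(logits.topk(k).indices.tolist())

# e.g. mask a noisy token such as "tmrw" before querying:
print(suggest("see you [MASK] at the station"))
```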

Phonetic Normalization for Machine Translation of User Generated Content

no code implementations WS 2019 José Carlos Rosales Núñez, Djamé Seddah, Guillaume Wisniewski

We present an approach to correcting noisy User Generated Content (UGC) in French, aiming to produce a preprocessing pipeline that improves Machine Translation for this kind of non-canonical corpus.

Language Modelling, Machine Translation (+1)
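As a toy, language-agnostic illustration of the general idea (the paper's pipeline targets French UGC and is substantially more sophisticated), noisy tokens can be matched to lexicon entries through a crude phonetic key; the key function and mini-lexicon below are stand-ins:

```python
# Toy phonetic normalization: map a noisy token to a lexicon word that
# shares its (very crude) phonetic key; otherwise leave it unchanged.
import re

def phonetic_key(word):
    """Keep the first letter, drop later vowels, collapse repeats."""
    w = word.lower()
    key = w[0] + re.sub(r"[aeiouy]", "", w[1:])
    return re.sub(r"(.)\1+", r"\1", key)

LEXICON = ["tomorrow", "tonight", "because", "people"]  # stand-in lexicon
KEYS = {phonetic_key(w): w for w in LEXICON}

def normalize(token):
    return KEYS.get(phonetic_key(token), token)  # fall back to the input

print(normalize("tmrw"), normalize("ppl"), normalize("hello"))
# -> tomorrow people hello
```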

Treebanking User-Generated Content: A Proposal for a Unified Representation in Universal Dependencies

no code implementations LREC 2020 Manuela Sanguinetti, Cristina Bosco, Lauren Cassidy, Özlem Çetinoğlu, Alessandra Teresa Cignarella, Teresa Lynn, Ines Rehbein, Josef Ruppenhofer, Djamé Seddah, Amir Zeldes

The paper presents a discussion on the main linguistic phenomena of user-generated texts found in web and social media, and proposes a set of annotation guidelines for their treatment within the Universal Dependencies (UD) framework.

Les modèles de langue contextuels CamemBERT pour le français : impact de la taille et de l'hétérogénéité des données d'entraînement (CamemBERT Contextual Language Models for French: Impact of Training Data Size and Heterogeneity)

no code implementations JEPTALNRECITAL 2020 Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suárez, Yoann Dupont, Laurent Romary, Éric Villemonte de la Clergerie, Benoît Sagot, Djamé Seddah

The practical use of these models, in all languages other than English, was therefore limited. The recent release of several monolingual models based on BERT (Devlin et al., 2019), notably for French, has demonstrated the value of such models by improving the state of the art on all evaluated tasks.

SENTS

Overview of the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies

no code implementations WS 2020 Gosse Bouma, Djamé Seddah, Daniel Zeman

This overview introduces the task of parsing into enhanced Universal Dependencies, describes the datasets used for training and evaluation, and presents the evaluation metrics.
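Because enhanced dependencies form graphs rather than strict trees, systems in this setting are typically scored by precision and recall over labelled edges. The snippet below sketches that flavor of metric on toy edge sets; it is a simplification for illustration, not the official ELAS scorer:

```python
# Simplified edge-level F1 over enhanced-dependency graphs.
def edge_f1(gold_edges, pred_edges):
    """gold_edges, pred_edges: sets of (head, dependent, label) triples."""
    tp = len(gold_edges & pred_edges)
    precision = tp / len(pred_edges) if pred_edges else 0.0
    recall = tp / len(gold_edges) if gold_edges else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

gold = {(0, 2, "root"), (2, 1, "nsubj"), (2, 3, "obj"), (3, 4, "conj")}
pred = {(0, 2, "root"), (2, 1, "nsubj"), (2, 4, "obj")}
print(f"edge F1: {edge_f1(gold, pred):.2f}")  # 0.57
```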
