Search Results for author: Djamé Seddah

Found 19 papers, 8 papers with code

Can Character-based Language Models Improve Downstream Task Performances In Low-Resource And Noisy Language Scenarios?

no code implementations WNUT (ACL) 2021 Arij Riabi, Benoît Sagot, Djamé Seddah

Recent impressive improvements in NLP, largely based on the success of contextual neural language models, have been mostly demonstrated on at most a couple dozen high-resource languages.

Dependency Parsing Language Modelling +1

Comparison between NMT and PBSMT Performance for Translating Noisy User-Generated Content

no code implementations WS (NoDaLiDa) 2019 José Carlos Rosales Núñez, Djamé Seddah, Guillaume Wisniewski

This work compares the performance achieved by Phrase-Based Statistical Machine Translation systems (PB-SMT) and attention-based Neural Machine Translation systems (NMT) when translating User-Generated Content (UGC), as encountered in social media, from French to English.

Machine Translation Translation

From Raw Text to Enhanced Universal Dependencies: The Parsing Shared Task at IWPT 2021

no code implementations ACL (IWPT) 2021 Gosse Bouma, Djamé Seddah, Daniel Zeman

We describe the second IWPT task on end-to-end parsing from raw text to Enhanced Universal Dependencies.

Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?

no code implementations 26 Oct 2021 Arij Riabi, Benoît Sagot, Djamé Seddah

Recent impressive improvements in NLP, largely based on the success of contextual neural language models, have been mostly demonstrated on at most a couple dozen high-resource languages.

Dependency Parsing Language Modelling +1

Understanding the Impact of UGC Specificities on Translation Quality

no code implementations WNUT (ACL) 2021 José Carlos Rosales Núñez, Djamé Seddah, Guillaume Wisniewski

This work takes a critical look at the evaluation of user-generated content automatic translation, the well-known specificities of which raise many challenges for MT.

Translation

Noisy UGC Translation at the Character Level: Revisiting Open-Vocabulary Capabilities and Robustness of Char-Based Models

1 code implementation WNUT (ACL) 2021 José Carlos Rosales Núñez, Guillaume Wisniewski, Djamé Seddah

This work explores the capacities of character-based Neural Machine Translation to translate noisy User-Generated Content (UGC) with a strong focus on exploring the limits of such approaches to handle productive UGC phenomena, which almost by definition, cannot be seen at training time.

Machine Translation Translation

PAGnol: An Extra-Large French Generative Model

no code implementations 16 Oct 2021 Julien Launay, E. L. Tommasone, Baptiste Pannier, François Boniface, Amélie Chatelain, Alessandro Cappelli, Iacopo Poli, Djamé Seddah

We fit a scaling law for compute for the French language, and compare it with its English counterpart.

First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT

no code implementations EACL 2021 Benjamin Muller, Yanai Elazar, Benoît Sagot, Djamé Seddah

Such transfer emerges by fine-tuning on a task of interest in one language and evaluating on a distinct language, not seen during the fine-tuning.

Fine-tuning Language Modelling +1

Disentangling semantics in language through VAEs and a certain architectural choice

1 code implementation 24 Dec 2020 Ghazi Felhi, Joseph Le Roux, Djamé Seddah

We present an unsupervised method to obtain disentangled representations of sentences that single out semantic content.

Open Information Extraction

Sentence-Based Model Agnostic NLP Interpretability

1 code implementation 24 Dec 2020 Yves Rychener, Xavier Renard, Djamé Seddah, Pascal Frossard, Marcin Detyniecki

Today, interpretability of Black-Box Natural Language Processing (NLP) models based on surrogates, like LIME or SHAP, uses word-based sampling to build the explanations.

Treebanking User-Generated Content: a UD Based Overview of Guidelines, Corpora and Unified Recommendations

no code implementations 3 Nov 2020 Manuela Sanguinetti, Lauren Cassidy, Cristina Bosco, Özlem Çetinoğlu, Alessandra Teresa Cignarella, Teresa Lynn, Ines Rehbein, Josef Ruppenhofer, Djamé Seddah, Amir Zeldes

This article presents a discussion on the main linguistic phenomena which cause difficulties in the analysis of user-generated texts found on the web and in social media, and proposes a set of annotation guidelines for their treatment within the Universal Dependencies (UD) framework of syntactic analysis.
