Search Results for author: Gaël Lejeune

Found 20 papers, 0 papers with code

Vers un diagnostic d'ambiguïté des termes candidats d'un texte

no code implementations JEPTALNRECITAL 2015 Gaël Lejeune, Béatrice Daille

In this article, we examine the ambiguity of a term in a specialised domain.

Évaluation intrinsèque et extrinsèque du nettoyage de pages Web

no code implementations JEPTALNRECITAL 2015 Gaël Lejeune, Romain Brixtel, Charlotte Lecluze

We propose two types of evaluation for this web page cleaning task: (I) an intrinsic evaluation based on content in words, tags and characters; (II) an extrinsic, task-based evaluation that examines the effect of document cleaning on the system placed downstream in the processing chain.

Ambiguity Diagnosis for Terms in Digital Humanities

no code implementations LREC 2016 Béatrice Daille, Evelyne Jacquey, Gaël Lejeune, Luis Felipe Melo, Yannick Toussaint

If a lexical unit is indeed a term of the domain, it is not true, even in a specialised corpus, that all its occurrences are terminological.

Word Sense Disambiguation

Character Based Pattern Mining for Neology Detection

no code implementations WS 2017 Gaël Lejeune, Emmanuel Cartier

In this paper, neology detection is considered as a classification task where a system has to assess whether a given lexical item is an actual neologism or not.

General Classification

Modèles en Caractères pour la Détection de Polarité dans les Tweets (Character-level Models for Polarity Detection in Tweets)

no code implementations JEPTALNRECITAL 2018 Davide Buscaldi, Joseph Le Roux, Gaël Lejeune

Our first method is based on lexicons (words and emojis), character n-grams and a large-margin classifier (SVM).

Indexation et appariements de documents cliniques pour le Deft 2019 (Indexing and pairing texts of the medical domain)

no code implementations JEPTALNRECITAL 2019 Davide Buscaldi, Dhaou Ghoul, Joseph Le Roux, Gaël Lejeune

For the indexing task we tested two methods: one based on first pairing the documents of the test set with the documents of the training set, and another based on terminological annotation.

MICHAEL: Mining Character-level Patterns for Arabic Dialect Identification (MADAR Challenge)

no code implementations WS 2019 Dhaou Ghoul, Gaël Lejeune

We present MICHAEL, a simple lightweight method for automatic Arabic Dialect Identification on the MADAR travel domain Dialect Identification (DID).

Dialect Identification General Classification

Dating Ancient texts: an Approach for Noisy French Documents

no code implementations LREC 2020 Anaëlle Baledent, Nicolas Hiebel, Gaël Lejeune

The experiments presented in this article focused on documents written in French but we believe that the ability of character-level models to handle noise properly would help to achieve comparable results on other languages and more ancient languages in particular.

Document Dating POS

A Dataset for Multi-lingual Epidemiological Event Extraction

no code implementations LREC 2020 Stephen Mutuvi, Antoine Doucet, Gaël Lejeune, Moses Odeo

This paper proposes a corpus for the development and evaluation of tools and techniques for identifying emerging infectious disease threats in online news text.

Event Extraction text-classification +1

Out-of-the-Box and into the Ditch? Multilingual Evaluation of Generic Text Extraction Tools

no code implementations LREC 2020 Adrien Barbaresi, Gaël Lejeune

This article examines extraction methods designed to retain the main text content of web pages and discusses how the extraction could be oriented and evaluated: can and should it be as generic as possible to ensure opportunistic corpus construction?

Que recèlent les données textuelles issues du web ? (What do text data from the Web have to hide?)

no code implementations JEPTALNRECITAL 2020 Adrien Barbaresi, Gaël Lejeune

The opportunistic collection and use of text data taken from the web raise a series of ethical, methodological and epistemological problems that deserve the attention of the scientific community.

Calcul de similarité entre phrases : quelles mesures et quels descripteurs ? (Sentence Similarity: a study on similarity metrics with words and character strings)

no code implementations JEPTALNRECITAL 2020 Davide Buscaldi, Ghazi Felhi, Dhaou Ghoul, Joseph Le Roux, Gaël Lejeune, Xu-Dong Zhang

In our work we addressed two questions: the choice of the similarity measure on the one hand, and the choice of the operands to which the similarity measure is applied on the other.

Sentence Sentence Similarity

Multilingual Epidemiological Text Classification: A Comparative Study

no code implementations COLING 2020 Stephen Mutuvi, Emanuela Boros, Antoine Doucet, Adam Jatowt, Gaël Lejeune, Moses Odeo

We conduct a comparative study of different machine and deep learning text classification models using a dataset comprising news articles related to epidemic outbreaks from six languages, four low-resourced and two high-resourced, in order to analyze the influence of the nature of the language, the structure of the document, and the size of the data.

Multilingual text classification text-classification +1
