Search Results for author: Laurent Romary

Found 24 papers, 2 papers with code

Building A Corporate Corpus For Threads Constitution

no code implementations RANLP 2021 Lionel Tadonfouet Tadjou, Fabrice Bourge, Tiphaine Marie, Laurent Romary, Éric de la Clergerie

In this paper we describe the process of building a corporate corpus that will be used as a reference for modelling and computing threads from conversations generated using communication and collaboration tools.

BERTrade: Using Contextual Embeddings to Parse Old French

no code implementations LREC 2022 Loïc Grobol, Mathilde Regnault, Pedro Ortiz Suarez, Benoît Sagot, Laurent Romary, Benoit Crabbé

The successes of contextual word embeddings learned by training large-scale language models, while remarkable, have mostly occurred for languages where significant amounts of raw texts are available and where annotated data in downstream tasks have a relatively regular spelling.

Dependency Parsing POS +3

Conversational Grounding: Annotation and Analysis of Grounding Acts and Grounding Units

no code implementations 25 Mar 2024 Biswesh Mohapatra, Seemab Hassan, Laurent Romary, Justine Cassell

We discuss our key findings during the annotation and also provide a baseline model to test the performance of current Language Models in categorizing the grounding acts of the dialogs.

Towards a Cleaner Document-Oriented Multilingual Crawled Corpus

no code implementations LREC 2022 Julien Abadji, Pedro Ortiz Suarez, Laurent Romary, Benoît Sagot

The need for large raw corpora has dramatically increased in recent years with the introduction of transfer learning and semi-supervised learning methods to Natural Language Processing.

Transfer Learning

Les modèles de langue contextuels CamemBERT pour le français : impact de la taille et de l'hétérogénéité des données d'entrainement (CamemBERT Contextual Language Models for French: Impact of Training Data Size and Heterogeneity)

no code implementations JEPTALNRECITAL 2020 Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suárez, Yoann Dupont, Laurent Romary, Éric Villemonte de la Clergerie, Benoît Sagot, Djamé Seddah

The practical use of these models, in all languages except English, was therefore limited. The recent release of several monolingual models based on BERT (Devlin et al., 2019), notably for French, has demonstrated the value of these models by improving the state of the art for all evaluated tasks.

SENTS

Modelling Etymology in LMF/TEI: The Grande Dicionário Houaiss da Língua Portuguesa Dictionary as a Use Case

no code implementations LREC 2020 Fahad Khan, Laurent Romary, Ana Salgado, Jack Bowers, Mohamed Khemakhem, Toma Tasovac

In this article we will introduce two of the new parts of the new multi-part version of the Lexical Markup Framework (LMF) ISO standard, namely part 3 of the standard (ISO 24613-3), which deals with etymological and diachronic data, and Part 4 (ISO 24613-4), which consists of a TEI serialisation of all of the prior parts of the model.

Automatic Identification and Normalisation of Physical Measurements in Scientific Literature

1 code implementation Document Engineering 2019 Luca Foppiano, Laurent Romary, Masashi Ishii, Mikiko Tanifuji

Normalised materials characteristics (such as critical temperature, pressure) extracted from scientific literature are a key resource for materials informatics (MI) [9].

NER

LMF Reloaded

no code implementations 23 May 2019 Laurent Romary, Mohamed Khemakhem, Fahad Khan, Jack Bowers, Nicoletta Calzolari, Monte George, Mandy Pet, Piotr Bański

Lexical Markup Framework (LMF) or ISO 24613 [1] is a de jure standard that provides a framework for modelling and encoding lexical information in retrodigitised print dictionaries and NLP lexical databases.

Deep encoding of etymological information in TEI

no code implementations 30 Nov 2016 Jack Bowers, Laurent Romary

This paper aims to provide a comprehensive modeling and representation of etymological data in digital dictionaries.

TermITH-Eval: a French Standard-Based Resource for Keyphrase Extraction Evaluation

no code implementations LREC 2016 Adrien Bougouin, Sabine Barreaux, Laurent Romary, Florian Boudin, Béatrice Daille

The output keyphrases of automatic keyphrase extraction methods for test documents are typically evaluated by comparing them to manually assigned reference keyphrases.

Keyphrase Extraction

Data fluidity in DARIAH -- pushing the agenda forward

no code implementations 10 Mar 2016 Laurent Romary, Mike Mertens, Anne Baillot

This paper provides both an update concerning the setting up of the European DARIAH infrastructure and a series of strong action lines related to the development of a data centred strategy for the humanities in the coming years.

Management

Standards for language resources in ISO -- Looking back at 13 fruitful years

no code implementations 27 Oct 2015 Laurent Romary

This paper provides an overview of the various projects carried out within ISO committee TC 37/SC 4 dealing with the management of language (digital) resources.

Management

Méthodes pour la représentation informatisée de données lexicales / Methoden der Speicherung lexikalischer Daten

no code implementations 15 May 2014 Laurent Romary, Andreas Witt

In recent years, new developments in the area of lexicography have altered not only the management, processing and publishing of lexicographical data, but also created new types of products such as electronic dictionaries and thesauri.

Management Translation

TBX goes TEI -- Implementing a TBX basic extension for the Text Encoding Initiative guidelines

no code implementations 1 Mar 2014 Laurent Romary

This paper presents an attempt to customise the TEI (Text Encoding Initiative) guidelines in order to offer the possibility to incorporate TBX (TermBase eXchange) based terminological entries within any kind of TEI documents.

TEI and LMF crosswalks

no code implementations 11 Jan 2013 Laurent Romary

The present paper explores various arguments in favour of making the Text Encoding Initiative (TEI) guidelines an appropriate serialisation for ISO standard 24613:2008 (LMF, Lexical Mark-up Framework).

Serialising the ISO SynAF Syntactic Object Model

no code implementations 2 Aug 2011 Laurent Romary, Amir Zeldes, Florian Zipser

This paper introduces an XML format developed to serialise the object model defined by the ISO Syntactic Annotation Framework SynAF.

Object
