Search Results for author: Simon Gabay

Found 5 papers, 0 papers with code

Le projet FREEM : ressources, outils et enjeux pour l’étude du français d’Ancien Régime (The FREEM project: Resources, tools and challenges for the study of Ancien Régime French)

no code implementations JEP/TALN/RECITAL 2022 Simon Gabay, Pedro Ortiz Suarez, Rachel Bawden, Alexandre Bartz, Philippe Gambette, Benoît Sagot

Despite their undeniable quality, the resources and tools available for the analysis of Ancien Régime French are no longer able to meet the challenges of research in linguistics and literature for this period.

From FreEM to D'AlemBERT: a Large Corpus and a Language Model for Early Modern French

no code implementations 18 Feb 2022 Simon Gabay, Pedro Ortiz Suarez, Alexandre Bartz, Alix Chagué, Rachel Bawden, Philippe Gambette, Benoît Sagot

Because these historical states are at the same time more complex to process and more scarce in the corpora available, specific efforts are necessary to train natural language processing (NLP) tools adapted to the data.

Language Modelling, Natural Language Processing

Standardizing linguistic data: method and tools for annotating (pre-orthographic) French

no code implementations 22 Nov 2020 Simon Gabay, Thibault Clérice, Jean-Baptiste Camps, Jean-Baptiste Tanguy, Matthias Gille-Levenson

With the development of large corpora covering various periods, it becomes crucial to standardise linguistic annotation (e.g. lemmas, POS tags, morphological annotation) to increase the interoperability of the data produced, despite diachronic variations.


Corpus and Models for Lemmatisation and POS-tagging of Classical French Theatre

no code implementations 15 May 2020 Jean-Baptiste Camps, Simon Gabay, Paul Fièvre, Thibault Clérice, Florian Cafiero

This paper describes the process of building an annotated corpus and training models for classical French literature, with a focus on theatre, and particularly comedies in verse.

