Search Results for author: Jean-Baptiste Camps

Found 10 papers, 3 papers with code

Who could be behind QAnon? Authorship attribution with supervised machine-learning

no code implementations • 3 Mar 2023 • Florian Cafiero, Jean-Baptiste Camps

We conclude that two different individuals, Paul F. and Ron W., are the closest match to Q's linguistic signature, and they could have successively written Q's texts.

Authorship Attribution

Lost Manuscripts and Extinct Texts: A Dynamic Model of Cultural Transmission

no code implementations • 29 Oct 2022 • Jean-Baptiste Camps, Julien Randon-Furling

How did written works evolve, disappear or survive down through the ages?

Corpus and Models for Lemmatisation and POS-tagging of Old French

1 code implementation • 23 Sep 2021 • Jean-Baptiste Camps, Thibault Clérice, Frédéric Duval, Lucence Ing, Naomi Kanaoka, Ariane Pinche

Old French is a typical example of an under-resourced historical language, one that furthermore displays an important amount of linguistic variation.

POS POS Tagging

Handling Heavily Abbreviated Manuscripts: HTR engines vs text normalisation approaches

no code implementations • 7 Jul 2021 • Jean-Baptiste Camps, Chahan Vidal-Gorène, Marguerite Vernet

Although abbreviations are fairly common in handwritten sources, particularly in medieval and modern Western manuscripts, previous research dealing with computational approaches to their expansion is scarce.

Handwritten Text Recognition HTR

Stylometry for Noisy Medieval Data: Evaluating Paul Meyer's Hagiographic Hypothesis

1 code implementation • 7 Dec 2020 • Jean-Baptiste Camps, Thibault Clérice, Ariane Pinche

Stylometric analysis of medieval vernacular texts is still a significant challenge: the importance of scribal variation, be it spelling or more substantial, as well as the variants and errors introduced in the tradition, complicate the task of the would-be stylometrist.

Handwritten Text Recognition

Standardizing linguistic data: method and tools for annotating (pre-orthographic) French

no code implementations • 22 Nov 2020 • Simon Gabay, Thibault Clérice, Jean-Baptiste Camps, Jean-Baptiste Tanguy, Matthias Gille-Levenson

With the development of big corpora of various periods, it becomes crucial to standardise linguistic annotation (e.g. lemmas, POS tags, morphological annotation) to increase the interoperability of the data produced, despite diachronic variations.

POS

Corpus and Models for Lemmatisation and POS-tagging of Classical French Theatre

no code implementations • 15 May 2020 • Jean-Baptiste Camps, Simon Gabay, Paul Fièvre, Thibault Clérice, Florian Cafiero

This paper describes the process of building an annotated corpus and training models for classical French literature, with a focus on theatre, and particularly comedies in verse.

POS POS Tagging

Why Molière most likely did write his plays

2 code implementations • 2 Jan 2020 • Florian Cafiero, Jean-Baptiste Camps

As for Shakespeare, a hard-fought debate has emerged about Molière, a supposedly uneducated actor who, according to some, could not have written the masterpieces attributed to him.

Producing Corpora of Medieval and Premodern Occitan

no code implementations • 26 Apr 2019 • Jean-Baptiste Camps, Gilles Guilhem Couffignal

At a time when the quantity of - more or less freely - available data is increasing significantly, thanks to digital corpora, editions or libraries, the development of data mining tools or deep learning methods allows researchers to build a corpus of study tailored for their research, to enrich their data and to exploit them. Open optical character recognition (OCR) tools can be adapted to old prints, incunabula or even manuscripts, with usable results, allowing the rapid creation of textual corpora.

Lemmatization Optical Character Recognition

Manuscripts in Time and Space: Experiments in Scriptometrics on an Old French Corpus

no code implementations • 30 Jan 2018 • Jean-Baptiste Camps

This results in multiple, hard-to-distinguish linguistic strata -- the author's scripta interacting with the scriptae of the various scribes -- in a context where literary written language is already a dialectal hybrid.

Clustering
