no code implementations • JEP/TALN/RECITAL 2022 • Thibault Bañeras Roux, Mickaël Rouvier, Jane Wottawa, Richard Dufour
Evaluating transcriptions produced by Automatic Speech Recognition (ASR) systems is a difficult and still open problem, which generally comes down to considering only the WER.
Automatic Speech Recognition (ASR)
no code implementations • JEP/TALN/RECITAL 2022 • Yanis Labrak, Philippe Turcotte, Richard Dufour, Mickael Rouvier
We propose three classification systems based on features extracted from contextual word embeddings produced by a BERT model (CamemBERT).
no code implementations • JEP/TALN/RECITAL 2022 • Arthur Amalvy, Vincent Labatut, Richard Dufour
Named entity recognition is a well-studied natural language processing task that is useful in many applications.
no code implementations • 18 Jan 2025 • Antoine Tholly, Jane Wottawa, Mickael Rouvier, Richard Dufour
Automatic Speech Recognition (ASR) transcription errors are commonly assessed using metrics that compare them with a reference transcription, such as Word Error Rate (WER), which measures spelling deviations from the reference, or semantic score-based metrics.
Automatic Speech Recognition (ASR)
no code implementations • 9 Jan 2025 • Léane Jourdan, Nicolas Hernandez, Richard Dufour, Florian Boudin, Akiko Aizawa
Revision is a crucial step in scientific writing, where authors refine their work to improve clarity, structure, and academic quality.
no code implementations • 30 Dec 2024 • Julien Aubert-Béduchaud, Florian Boudin, Béatrice Daille, Richard Dufour
Familiarizing oneself with a new scientific field and its existing literature can be daunting due to the large amount of available articles.
1 code implementation • 16 Dec 2024 • Arthur Amalvy, Vincent Labatut, Richard Dufour
The automatic extraction of character networks from literary texts is generally carried out using natural language processing (NLP) cascading pipelines.
1 code implementation • 30 Sep 2024 • Noé Cecillon, Vincent Labatut, Richard Dufour, Nejat Arinik
In this article, we tackle this issue by proposing two approaches to learning whole-graph representations of general signed graphs.
1 code implementation • 2 Jul 2024 • Arthur Amalvy, Vincent Labatut, Richard Dufour
Renard (Relationships Extraction from NARrative Documents) is a Python library that allows users to define custom natural language processing (NLP) pipelines to extract character networks from narrative texts.
no code implementations • 9 Jun 2024 • Yanis Labrak, Adel Moumen, Richard Dufour, Mickael Rouvier
In the rapidly evolving landscape of spoken question-answering (SQA), the integration of large language models (LLMs) has emerged as a transformative development.
no code implementations • 1 Mar 2024 • Leane Jourdan, Florian Boudin, Nicolas Hernandez, Richard Dufour
Writing a scientific article is a challenging task as it is a highly codified and specific genre; consequently, proficiency in written communication is essential for effectively conveying research findings and ideas.
no code implementations • 29 Feb 2024 • Quentin Raymondaud, Mickael Rouvier, Richard Dufour
Following extensive research on neural network interpretability, we propose in this article a protocol that aims to determine which information is encoded in an ASR acoustic model (AM) and where it is located.
Automatic Speech Recognition (ASR)
no code implementations • 22 Feb 2024 • Yanis Labrak, Adrien Bazoge, Beatrice Daille, Mickael Rouvier, Richard Dufour
Subword tokenization has become the prevailing standard in the field of natural language processing (NLP) over recent years, primarily due to the widespread utilization of pre-trained language models.
1 code implementation • 20 Feb 2024 • Yanis Labrak, Adrien Bazoge, Oumaima El Khettari, Mickael Rouvier, Pacome Constant dit Beaufils, Natalia Grabar, Beatrice Daille, Solen Quiniou, Emmanuel Morin, Pierre-Antoine Gourraud, Richard Dufour
This limitation hampers the evaluation of the latest French biomedical models, as they are either assessed on a minimal number of tasks with non-standardized protocols or evaluated using general downstream tasks.
1 code implementation • 19 Feb 2024 • Anas Belfathi, Ygor Gallina, Nicolas Hernandez, Richard Dufour, Laura Monceaux
Recent advances in pre-trained language modeling have facilitated significant progress across various natural language processing (NLP) tasks.
1 code implementation • 15 Feb 2024 • Yanis Labrak, Adrien Bazoge, Emmanuel Morin, Pierre-Antoine Gourraud, Mickael Rouvier, Richard Dufour
This marks the first large-scale multilingual evaluation of LLMs in the medical domain.
Ranked #8 on Few-Shot Learning on MedConceptsQA
1 code implementation • 16 Oct 2023 • Arthur Amalvy, Vincent Labatut, Richard Dufour
Using this dataset, we train a neural context retriever based on a BERT model that is able to find relevant context for NER.
no code implementations • 22 Jul 2023 • Yanis Labrak, Mickael Rouvier, Richard Dufour
We evaluate four state-of-the-art instruction-tuned large language models (LLMs) -- ChatGPT, Flan-T5 UL2, Tk-Instruct, and Alpaca -- on a set of 13 real-world clinical and biomedical natural language processing (NLP) tasks in English, such as named-entity recognition (NER), question-answering (QA), and relation extraction (RE).
1 code implementation • 4 May 2023 • Arthur Amalvy, Vincent Labatut, Richard Dufour
Pre-trained transformer-based models have recently shown great performance when applied to Named Entity Recognition (NER).
1 code implementation • LOUHI 2022 • Yanis Labrak, Adrien Bazoge, Richard Dufour, Mickael Rouvier, Emmanuel Morin, Béatrice Daille, Pierre-Antoine Gourraud
This paper introduces FrenchMedMCQA, the first publicly available Multiple-Choice Question Answering (MCQA) dataset in French for the medical domain.
no code implementations • 3 Apr 2023 • Yanis Labrak, Adrien Bazoge, Richard Dufour, Mickael Rouvier, Emmanuel Morin, Béatrice Daille, Pierre-Antoine Gourraud
In recent years, pre-trained language models (PLMs) have achieved the best performance on a wide range of natural language processing (NLP) tasks.
1 code implementation • 29 Mar 2023 • Léane Jourdan, Florian Boudin, Richard Dufour, Nicolas Hernandez
Writing a scientific article is a challenging task as it is a highly codified genre.
1 code implementation • 9 Feb 2023 • Arthur Amalvy, Vincent Labatut, Richard Dufour
Named Entity Recognition (NER) is a low-level task often used as a foundation for solving higher level NLP problems.
1 code implementation • International Conference on Text, Speech and Dialogue (TSD) 2022 • Yanis Labrak, Richard Dufour
Part-of-speech (POS) tagging is a classical natural language processing (NLP) task.
Ranked #1 on Part-Of-Speech Tagging on ANTILLES
no code implementations • 20 Apr 2022 • Qingyu Chen, Alexis Allot, Robert Leaman, Rezarta Islamaj Doğan, Jingcheng Du, Li Fang, Kai Wang, Shuo Xu, Yuefu Zhang, Parsa Bagherzadeh, Sabine Bergler, Aakash Bhatnagar, Nidhir Bhavsar, Yung-Chun Chang, Sheng-Jie Lin, Wentai Tang, Hongtong Zhang, Ilija Tavchioski, Senja Pollak, Shubo Tian, Jinfeng Zhang, Yulia Otmakhova, Antonio Jimeno Yepes, Hang Dong, Honghan Wu, Richard Dufour, Yanis Labrak, Niladri Chatterjee, Kushagri Tandon, Fréjus Laleye, Loïc Rakotoson, Emmanuele Chersoni, Jinghang Gu, Annemarie Friedrich, Subhash Chandra Pujari, Mariia Chizhikova, Naveen Sivadasan, Zhiyong Lu
To close the gap, we organized the BioCreative LitCovid track to call for a community effort to tackle automated topic annotation for COVID-19 literature.
no code implementations • 29 Sep 2021 • Mathias Quillot, Richard Dufour, Jean-François Bonastre
To address this problem, we propose a new semi-supervised learning method called Label Refining, which consists of extracting refined labels (e.g. vocal characteristics) from known initial labels (e.g. the character played in a recording).
no code implementations • JEPTALNRECITAL 2020 • Adrien Gresse, Mathias Quillot, Richard Dufour, Jean-François Bonastre
Experiments conducted on voice excerpts from video games show a significant improvement of the p-vector approach, with knowledge distillation, over an x-vector representation, the state of the art in speaker recognition.
no code implementations • JEPTALNRECITAL 2020 • Mathias Quillot, Lauriane Guillou, Adrien Gresse, Rafaël Ferro, Raphaël Röth, Damien Malinas, Richard Dufour, Axel Roebel, Nicolas Obin, Jean-François Bonastre, Emmanuel Ethis
Acted voice represents a major challenge for future voice interfaces, with extremely significant application potential for the digital transformation of the culture and communication sectors, such as voice production or post-production for series or cinema.
no code implementations • LREC 2020 • Salima Mdhaffar, Yannick Estève, Antoine Laurent, Nicolas Hernandez, Richard Dufour, Delphine Charlet, Geraldine Damnati, Solen Quiniou, Nathalie Camelin
The use cases concern scientific fields from both speech and text processing, with language model adaptation, thematic segmentation and transcription to slide alignment.
1 code implementation • LREC 2020 • Noé Cecillon, Vincent Labatut, Richard Dufour, Georges Linares
This large corpus of more than 380k annotated messages opens perspectives for online abuse detection and especially for context-based approaches.
1 code implementation • 20 May 2019 • Noé Cecillon, Vincent Labatut, Richard Dufour, Georges Linarès
In recent years, online social networks have allowed worldwide users to meet and discuss.
no code implementations • 31 Jan 2019 • Etienne Papegnies, Vincent Labatut, Richard Dufour, Georges Linares
We identify the most appropriate network extraction parameters and discuss the discriminative power of our features, relative to their topological and temporal nature.
no code implementations • 3 Aug 2017 • Etienne Papegnies, Vincent Labatut, Richard Dufour, Georges Linares
While online communities have become increasingly important over the years, the moderation of user-generated content is still performed mostly manually.
no code implementations • 20 Mar 2017 • Mohamed Morchid, Juan-Manuel Torres-Moreno, Richard Dufour, Javier Ramírez-Rodríguez, Georges Linarès
One of the main difficulties in using topic models on huge data collections is related to the material resources (CPU time and memory) required for model estimation.
no code implementations • 21 Feb 2017 • Xavier Bost, Ilaria Brunetti, Luis Adrián Cabrera-Diego, Jean-Valère Cossu, Andréa Linhares, Mohamed Morchid, Juan-Manuel Torres-Moreno, Marc El-Bèze, Richard Dufour
The 2013 Défi de Fouille de Textes (DEFT) campaign focuses on two types of language analysis tasks: document classification and information extraction in the specialized domain of cooking recipes.
no code implementations • 11 Feb 2017 • Mohamed Bouaziz, Mohamed Morchid, Richard Dufour, Georges Linarès, Renato de Mori
Nevertheless, these RNNs process a single input stream in one (LSTM) or two (Bidirectional LSTM) directions.
no code implementations • JEPTALNRECITAL 2016 • Mohamed Bouaziz, Mohamed Morchid, Richard Dufour, Georges Linarès, Prosper Correa
This article presents a method for predicting the genres of television programs, covering 2 days of broadcasting from 4 French TV channels, structured into programs annotated with genres.
no code implementations • JEPTALNRECITAL 2016 • Mohamed Bouaziz, Mohamed Morchid, Pierre-Michel Bousquet, Richard Dufour, Killian Janod, Waad Ben Kheder, Georges Linarès
Spoken language understanding applications perform worse when the automatically transcribed documents contain a high word error rate.
no code implementations • JEPTALNRECITAL 2016 • Killian Janod, Mohamed Morchid, Richard Dufour, Georges Linarès, Renato de Mori
Document representations based on neural network approaches have shown significant improvements in many natural language processing tasks.
no code implementations • JEPTALNRECITAL 2015 • Killian Janod, Mohamed Morchid, Richard Dufour, Georges Linares
These approaches are handled through a neural network: the CBOW architecture seeks to predict a word given its context, whereas the Skip-Gram architecture predicts a context given a word.
no code implementations • JEPTALNRECITAL 2015 • Mohamed Morchid, Richard Dufour, Georges Linarès
The proposed method consists of configuring the topology of an ANN and initializing its connections using the previously learned topic spaces.
no code implementations • LREC 2014 • Mohamed Morchid, Georges Linarès, Richard Dufour
The prediction of bursty events on the Internet is a challenging task.
no code implementations • LREC 2014 • Mohamed Morchid, Richard Dufour, Georges Linarès
Although current transcription systems can achieve high recognition performance, they still have considerable difficulty transcribing speech in very noisy environments.
no code implementations • JEPTALNRECITAL 2012 • Richard Dufour, Géraldine Damnati, Delphine Charlet