no code implementations • 29 Aug 2023 • Etienne Labbé, Thomas Pellegrini, Julien Pinquier
For ATR, we propose using the standard Cross-Entropy loss values obtained for any audio/caption pair.
1 code implementation • 14 Nov 2022 • Etienne Labbé, Thomas Pellegrini, Julien Pinquier
For this reason, several complementary metrics, such as BLEU, CIDEr, SPICE and SPIDEr, are used to compare a single automatic caption to one or several reference captions produced by human annotators.
no code implementations • 9 Jun 2022 • Lionel Pibre, Francisco Madrigal, Cyrille Equoy, Frédéric Lerasle, Thomas Pellegrini, Julien Pinquier, Isabelle Ferrané
In this paper, we propose two different types of fusion for the detection of the active speaker, combining two visual modalities and an audio modality through neural networks.
no code implementations • 4 Mar 2021 • Lucile Gelin, Morgane Daniel, Julien Pinquier, Thomas Pellegrini
Through transfer learning, a Transformer model complemented with a Connectionist Temporal Classification (CTC) objective function reaches a phone error rate of 28.1%, outperforming a state-of-the-art DNN-HMM model by 6.6% relative, as well as other end-to-end architectures by more than 8.5% relative.
no code implementations • JEPTALNRECITAL 2020 • Sébastien Ferreira, Jérôme Farinas, Julien Pinquier, Julie Mauclair, Stéphane Rabant
In this study, we explore the a priori prediction of automatic speech transcription quality for reverberant speech recorded with a single microphone.
no code implementations • JEPTALNRECITAL 2020 • Lucile Gelin, Morgane Daniel, Thomas Pellegrini, Julien Pinquier
Under equal conditions, current speech recognition performance for children is lower than that of systems for adults.
no code implementations • JEPTALNRECITAL 2020 • Estelle Randria, Lionel Fontan, Maxime Le Coz, Isabelle Ferrané, Julien Pinquier
The comprehensibility of audiovisual documents may depend on factors specific to the listener/viewer (e.g.
no code implementations • JEPTALNRECITAL 2020 • Sébastien Ferreira, Jérôme Farinas, Julien Pinquier, Julie Mauclair, Stéphane Rabant
Automatic Speech Recognition (ASR) performs worse when the speech signal is of poor quality.
no code implementations • LREC 2020 • Estelle Randria, Lionel Fontan, Maxime Le Coz, Isabelle Ferrané, Julien Pinquier
Various research works have dealt with the comprehensibility of textual, audio, or audiovisual documents, and showed that factors related to text (e.g. linguistic complexity), sound (e.g. speech intelligibility), image (e.g. presence of visual context), or even to cognition and emotion can play a major role in the ability of humans to understand the semantic and pragmatic contents of a given document.
no code implementations • 9 Mar 2020 • Vincent Roger, Jérôme Farinas, Julien Pinquier
In that sense we propose an overview of few-shot techniques and perspectives of using such techniques for the focused speech problems in this survey.
1 code implementation • 21 Oct 2019 • Geoffrey Roman-Jimenez, Patrice Guyot, Thierry Malon, Sylvie Chambon, Vincent Charvillat, Alain Crouzil, André Péninou, Julien Pinquier, Florence Sedes, Christine Sénac
We compared T2TP with I2TP using the same CNN models.
no code implementations • WS 2019 • Bruno Gaume, Lydia Mai Ho-Dac, Ludovic Tanguy, Cécile Fabre, Bénédicte Pierrejean, Nabil Hathout, Jérôme Farinas, Julien Pinquier, Lola Danet, Patrice Péran, Xavier De Boissezon, Mélanie Jucla
This paper presents the first results of a multidisciplinary project, the "Evolex" project, gathering researchers in Psycholinguistics, Neuropsychology, Computer Science, Natural Language Processing and Linguistics.
no code implementations • LREC 2018 • Corine Astésano, Mathieu Balaguer, Jérôme Farinas, Corinne Fredouille, Pascal Gaillard, Alain Ghio, Imed Laaridh, Muriel Lalain, Benoît Lepage, Julie Mauclair, Olivier Nocaudie, Julien Pinquier, Oriol Pont, Gilles Pouchoulin, Michèle Puech, Danièle Robert, Etienne Sicard, Virginie Woisard
no code implementations • JEPTALNRECITAL 2016 • Céline Manenti, Thomas Pellegrini, Julien Pinquier
In this article, we describe an experimental study of speech segmentation into sub-lexical acoustic units (phones) using neural networks.