Search Results for author: Benjamin Lecouteux

Found 41 papers, 8 papers with code

What has LeBenchmark Learnt about French Syntax?

no code implementations 4 Mar 2024 Zdravko Dugonjić, Adrien Pupier, Benjamin Lecouteux, Maximin Coavoux

They are trained on very low-level information (the raw speech signal) and do not have explicit lexical knowledge.

Automatic Speech Recognition speech-recognition +2

Pre-training for Speech Translation: CTC Meets Optimal Transport

1 code implementation 27 Jan 2023 Phuong-Hang Le, Hongyu Gong, Changhan Wang, Juan Pino, Benjamin Lecouteux, Didier Schwab

Nevertheless, CTC is only a partial solution and thus, in our second contribution, we propose a novel pre-training method combining CTC and optimal transport to further reduce this gap.

Multi-Task Learning Speech-to-Text Translation +1

ON-TRAC Consortium for End-to-End and Simultaneous Speech Translation Challenge Tasks at IWSLT 2020

no code implementations WS 2020 Maha Elbayad, Ha Nguyen, Fethi Bougares, Natalia Tomashenko, Antoine Caubrière, Benjamin Lecouteux, Yannick Estève, Laurent Besacier

This paper describes the ON-TRAC Consortium translation systems developed for two challenge tracks featured in the Evaluation Campaign of IWSLT 2020, offline speech translation and simultaneous speech translation.

Data Augmentation Translation

The LIG system for the English-Czech Text Translation Task of IWSLT 2019

no code implementations EMNLP (IWSLT) 2019 Loïc Vial, Benjamin Lecouteux, Didier Schwab, Hang Le, Laurent Besacier

Therefore, we implemented a Transformer-based encoder-decoder neural system which is able to use the output of a pre-trained language model as input embeddings, and we compared its performance under three configurations: 1) without any pre-trained language model (constrained), 2) using a language model trained on the monolingual parts of the allowed English-Czech data (constrained), and 3) using a language model trained on a large quantity of external monolingual data (unconstrained).

Language Modelling Machine Translation +1

Sense Vocabulary Compression through the Semantic Knowledge of WordNet for Neural Word Sense Disambiguation

2 code implementations GWC 2019 Loïc Vial, Benjamin Lecouteux, Didier Schwab

In this article, we tackle the issue of the limited quantity of manually sense annotated corpora for the task of word sense disambiguation, by exploiting the semantic relationships between senses such as synonymy, hypernymy and hyponymy, in order to compress the sense vocabulary of Princeton WordNet, and thus reduce the number of different sense tags that must be observed to disambiguate all words of the lexical database.

Word Sense Disambiguation

Improving the Coverage and the Generalization Ability of Neural Word Sense Disambiguation through Hypernymy and Hyponymy Relationships

no code implementations 2 Nov 2018 Loïc Vial, Benjamin Lecouteux, Didier Schwab

Our method leads to state-of-the-art results on most WSD evaluation tasks, while improving the coverage of supervised systems and reducing the training time and the size of the models, without additional training data.

Word Sense Disambiguation

Analyzing Learned Representations of a Deep ASR Performance Prediction Model

no code implementations WS 2018 Zied Elloumi, Laurent Besacier, Olivier Galibert, Benjamin Lecouteux

In a previous paper, we presented an ASR performance prediction system using CNNs that encode both text (ASR transcript) and speech, in order to predict word error rate.

Multi-Task Learning TAG

Disentangling ASR and MT Errors in Speech Translation

no code implementations MTSummit 2017 Ngoc-Tien Le, Benjamin Lecouteux, Laurent Besacier

This enables, as a by-product, qualitative analysis of the SLT errors and their origin (are they due to the transcription step or to the translation step?).

Translation

Uniformisation de corpus anglais annotés en sens (Unification of sense annotated English corpora for word sense disambiguation)

no code implementations JEPTALNRECITAL 2017 Loïc Vial, Benjamin Lecouteux, Didier Schwab

For English word sense disambiguation, there are currently about fifteen sense-annotated corpora, often in different formats and based on different versions of Princeton WordNet.

Word Sense Disambiguation

Traitement des Mots Hors Vocabulaire pour la Traduction Automatique de Document OCRisés en Arabe (This article presents a new system that automatically translates images of Arabic documents)

no code implementations JEPTALNRECITAL 2017 Kamel Bouzidi, Zied Elloumi, Laurent Besacier, Benjamin Lecouteux, Mohamed-Faouzi Benzeghiba

The experiments are carried out on a corpus of digitized Arabic newspapers and yield BLEU score improvements of 3.73 and 5.5 on the development and test corpora, respectively.

Optical Character Recognition (OCR)

CirdoX: an on/off-line multisource speech and sound analysis software

no code implementations LREC 2016 Frédéric Aman, Michel Vacher, François Portet, William Duclot, Benjamin Lecouteux

CirdoX, which is modular software, can analyse the audio environment in a home online, extract the uttered sentences, and then process them with an ASR module.

General Classification

Utilisation de mesures de confiance pour améliorer le décodage en traduction de parole (Using confidence measures to improve decoding in speech translation)

no code implementations JEPTALNRECITAL 2015 Laurent Besacier, Benjamin Lecouteux, Luong Ngoc Quang

Word-level confidence measures (Word Confidence Estimation, WCE) for machine translation (MT) or automatic speech recognition (ASR) assign a confidence score to each word in a transcription or translation hypothesis.
