Search Results for author: Laurent Besacier

Found 119 papers, 37 papers with code

Phone Based Keyword Spotting for Transcribing Very Low Resource Languages

no code implementations ALTA 2021 Eric Le Ferrand, Steven Bird, Laurent Besacier

We investigate the efficiency of two very different spoken term detection approaches for transcription when the available data is insufficient to train a robust speech recognition system.

Dynamic Time Warping Keyword Spotting +2

Investigating the Impact of Gender Representation in ASR Training Data: a Case Study on Librispeech

no code implementations ACL (GeBNLP) 2021 Mahault Garnerin, Solange Rossato, Laurent Besacier

In this paper we question the impact of gender representation in training data on the performance of an end-to-end ASR system.

Weakly Supervised Word Segmentation for Computational Language Documentation

1 code implementation ACL 2022 Shu Okabe, Laurent Besacier, François Yvon

Word and morpheme segmentation are fundamental steps of language documentation, as they make it possible to discover lexical units in a language for which the lexicon is unknown.

Incremental Learning Segmentation

Learning From Failure: Data Capture in an Australian Aboriginal Community

no code implementations ACL 2022 Eric Le Ferrand, Steven Bird, Laurent Besacier

Most low resource language technology development is premised on the need to collect data for training statistical models.

Contribution d’informations syntaxiques aux capacités de généralisation compositionelle des modèles seq2seq convolutifs (Assessing the Contribution of Syntactic Information for Compositional Generalization of seq2seq Convolutional Networks)

no code implementations JEP/TALN/RECITAL 2021 Diana Nicoleta Popa, William N. Havard, Maximin Coavoux, Eric Gaussier, Laurent Besacier

The SCAN dataset, which consists of a set of natural-language commands paired with action sequences, was specifically designed to evaluate the ability of neural networks to learn this type of compositional generalization.

Visualizing Cross‐Lingual Discourse Relations in Multilingual TED Corpora

1 code implementation CODI 2021 Zae Myung Kim, Vassilina Nikoulina, Dongyeop Kang, Didier Schwab, Laurent Besacier

This paper presents an interactive data dashboard that provides users with an overview of the preservation of discourse relations among 28 language pairs.

Relation

Controlling Prosody in End-to-End TTS: A Case Study on Contrastive Focus Generation

no code implementations CoNLL (EMNLP) 2021 Siddique Latif, Inyoung Kim, Ioan Calapodescu, Laurent Besacier

In this paper, we investigate whether we can control prosody directly from the input text, in order to code information related to contrastive focus which emphasizes a specific word that is contrary to the presuppositions of the interlocutor.

Encoding Sentence Position in Context-Aware Neural Machine Translation with Concatenation

1 code implementation 13 Feb 2023 Lorenzo Lupo, Marco Dinarelli, Laurent Besacier

Context-aware translation can be achieved by processing a concatenation of consecutive sentences with the standard Transformer architecture.

Machine Translation Position +2

Focused Concatenation for Context-Aware Neural Machine Translation

1 code implementation 24 Oct 2022 Lorenzo Lupo, Marco Dinarelli, Laurent Besacier

A straightforward approach to context-aware neural machine translation consists in feeding the standard encoder-decoder architecture with a window of consecutive sentences, formed by the current sentence and a number of sentences from its context concatenated to it.

Machine Translation Sentence +1

A Textless Metric for Speech-to-Speech Comparison

1 code implementation 21 Oct 2022 Laurent Besacier, Swen Ribeiro, Olivier Galibert, Ioan Calapodescu

In this paper, we introduce a new and simple method for comparing speech utterances without relying on text transcripts.

Sentence Speech-to-Speech Translation +1

SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages

3 code implementations 20 Oct 2022 Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier

In recent years, multilingual machine translation models have achieved promising performance on low-resource language pairs by sharing information between similar languages, thus enabling zero-shot translation.

Machine Translation Translation

BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model

no code implementations 4 Jul 2022 Brooke Stephenson, Laurent Besacier, Laurent Girin, Thomas Hueber

We collect a corpus of utterances containing contrastive focus and we evaluate the accuracy of a BERT model, finetuned to predict quantized acoustic prominence features, on these samples.

Language Modelling Speech Synthesis +1

What Do Compressed Multilingual Machine Translation Models Forget?

1 code implementation 22 May 2022 Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier

In this work, we assess the impact of compression methods on Multilingual Neural Machine Translation (MNMT) models for various language groups, gender, and semantic biases through extensive analysis of compressed models on different machine translation benchmarks, i.e., FLORES-101, MT-Gender, and DiBiMT.

Machine Translation Memorization +1

A Study of Gender Impact in Self-supervised Models for Speech-to-Text Systems

no code implementations 4 Apr 2022 Marcely Zanon Boito, Laurent Besacier, Natalia Tomashenko, Yannick Estève

Self-supervised speech models are pre-trained on unlabeled audio data and then used in downstream speech processing tasks such as automatic speech recognition (ASR) or speech translation (ST).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Multilingual Unsupervised Neural Machine Translation with Denoising Adapters

no code implementations EMNLP 2021 Ahmet Üstün, Alexandre Bérard, Laurent Besacier, Matthias Gallé

We consider the problem of multilingual unsupervised machine translation, translating to and from languages that only have monolingual data by using auxiliary parallel language pairs.

Denoising Translation +1

On the Evaluation of Machine Translation for Terminology Consistency

1 code implementation 22 Jun 2021 Md Mahfuz ibn Alam, Antonios Anastasopoulos, Laurent Besacier, James Cross, Matthias Gallé, Philipp Koehn, Vassilina Nikoulina

As neural machine translation (NMT) systems become an important part of professional translator pipelines, a growing body of work focuses on combining NMT with terminologies.

Domain Adaptation Machine Translation +2

Spoken Term Detection Methods for Sparse Transcription in Very Low-resource Settings

no code implementations 11 Jun 2021 Éric Le Ferrand, Steven Bird, Laurent Besacier

We investigate the efficiency of two very different spoken term detection approaches for transcription when the available data is insufficient to train a robust ASR system.

Dynamic Time Warping

Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings

no code implementations SIGUL (LREC) 2022 Marcely Zanon Boito, Bolaji Yusuf, Lucas Ondel, Aline Villavicencio, Laurent Besacier

Our results suggest that neural models for speech discretization are difficult to exploit in our setting, and that it might be necessary to adapt them to limit sequence length.

Do Multilingual Neural Machine Translation Models Contain Language Pair Specific Attention Heads?

no code implementations Findings (ACL) 2021 Zae Myung Kim, Laurent Besacier, Vassilina Nikoulina, Didier Schwab

Recent studies on the analysis of the multilingual representations focus on identifying whether there is an emergence of language-independent representations, or whether a multilingual model partitions its weights among different languages.

Machine Translation NMT +1

Impact of Encoding and Segmentation Strategies on End-to-End Simultaneous Speech Translation

no code implementations 29 Apr 2021 Ha Nguyen, Yannick Estève, Laurent Besacier

Boosted by the simultaneous translation shared task at IWSLT 2020, promising end-to-end online speech translation approaches were recently proposed.

Translation

Divide and Rule: Effective Pre-Training for Context-Aware Multi-Encoder Translation Models

1 code implementation ACL 2022 Lorenzo Lupo, Marco Dinarelli, Laurent Besacier

Multi-encoder models are a broad family of context-aware neural machine translation systems that aim to improve translation quality by encoding document-level contextual information alongside the current sentence.

Machine Translation Retrieval +2

Enabling Interactive Transcription in an Indigenous Community

no code implementations COLING 2020 Éric Le Ferrand, Steven Bird, Laurent Besacier

We propose a novel transcription workflow which combines spoken term detection and human-in-the-loop, together with a pilot experiment.

Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation

1 code implementation COLING 2020 Hang Le, Juan Pino, Changhan Wang, Jiatao Gu, Didier Schwab, Laurent Besacier

We propose two variants of these architectures corresponding to two different levels of dependencies between the decoders, called the parallel and cross dual-decoder Transformers, respectively.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

The Zero Resource Speech Challenge 2020: Discovering discrete subword and word units

no code implementations 12 Oct 2020 Ewan Dunbar, Julien Karadayi, Mathieu Bernard, Xuan-Nga Cao, Robin Algayres, Lucas Ondel, Laurent Besacier, Sakriani Sakti, Emmanuel Dupoux

We present the Zero Resource Speech Challenge 2020, which aims at learning speech representations from raw audio signals without any labels.

Speech Synthesis

What the Future Brings: Investigating the Impact of Lookahead for Incremental Neural TTS

no code implementations 4 Sep 2020 Brooke Stephenson, Laurent Besacier, Laurent Girin, Thomas Hueber

In this paper, we study the behavior of a neural sequence-to-sequence TTS system when used in an incremental mode, i.e., when generating speech output for token n, the system has access to n + k tokens from the text sequence.

Sentence Speech Synthesis +1

Catplayinginthesnow: Impact of Prior Segmentation on a Model of Visually Grounded Speech

no code implementations CONLL 2020 William N. Havard, Jean-Pierre Chevrot, Laurent Besacier

The language acquisition literature shows that children do not build their lexicon by segmenting the spoken input into phonemes and then building up words from them, but rather adopt a top-down approach and start by segmenting word-like units and then break them down into smaller units.

Image Retrieval Language Acquisition +1

ConfNet2Seq: Full Length Answer Generation from Spoken Questions

1 code implementation 9 Jun 2020 Vaishali Pal, Manish Shrivastava, Laurent Besacier

To the best of our knowledge, this is the first attempt at generating full-length natural answers from a graph input (confusion network).

Answer Generation Sentence +1

Représentation du genre dans des données open source de parole (Gender representation in open source speech resources)

no code implementations JEPTALNRECITAL 2020 Mahault Garnerin, Solange Rossato, Laurent Besacier

With the rise of artificial intelligence (AI) and the growing use of deep-learning architectures, the question of ethics and transparency in AI systems has become a central concern within the research community.

Ethics

Pratiques d'évaluation en ASR et biais de performance (Evaluation methodology in ASR and performance bias)

no code implementations JEPTALNRECITAL 2020 Mahault Garnerin, Solange Rossato, Laurent Besacier

We offer a reflection on evaluation practices for automatic speech recognition (ASR) systems.

ON-TRAC Consortium for End-to-End and Simultaneous Speech Translation Challenge Tasks at IWSLT 2020

no code implementations WS 2020 Maha Elbayad, Ha Nguyen, Fethi Bougares, Natalia Tomashenko, Antoine Caubrière, Benjamin Lecouteux, Yannick Estève, Laurent Besacier

This paper describes the ON-TRAC Consortium translation systems developed for two challenge tracks featured in the Evaluation Campaign of IWSLT 2020, offline speech translation and simultaneous speech translation.

Data Augmentation Translation

Efficient Wait-k Models for Simultaneous Machine Translation

1 code implementation 18 May 2020 Maha Elbayad, Laurent Besacier, Jakob Verbeek

We also show that the 2D-convolution architecture is competitive with Transformers for simultaneous translation of spoken language.

Machine Translation Translation

Investigating Language Impact in Bilingual Approaches for Computational Language Documentation

no code implementations LREC 2020 Marcely Zanon Boito, Aline Villavicencio, Laurent Besacier

For answering this question, we use the MaSS multilingual speech corpus (Boito et al., 2020) for creating 56 bilingual pairs that we apply to the task of low-resource unsupervised word segmentation and alignment.

Segmentation Translation

Gender Representation in Open Source Speech Resources

no code implementations LREC 2020 Mahault Garnerin, Solange Rossato, Laurent Besacier

With the rise of artificial intelligence (AI) and the growing use of deep-learning architectures, the question of ethics, transparency and fairness of AI systems has become a central concern within the research community.

Ethics Fairness

Modeling ASR Ambiguity for Dialogue State Tracking Using Word Confusion Networks

1 code implementation 3 Feb 2020 Vaishali Pal, Fabien Guillot, Manish Shrivastava, Jean-Michel Renders, Laurent Besacier

Spoken dialogue systems typically use a list of top-N ASR hypotheses for inferring the semantic meaning and tracking the state of the dialogue.

Dialogue State Tracking Spoken Dialogue Systems

Character-based NMT with Transformer

no code implementations 12 Nov 2019 Rohit Gupta, Laurent Besacier, Marc Dymetman, Matthias Gallé

Character-based translation has several appealing advantages, but its performance is in general worse than a carefully tuned BPE baseline.

NMT Translation

The LIG system for the English-Czech Text Translation Task of IWSLT 2019

no code implementations EMNLP (IWSLT) 2019 Loïc Vial, Benjamin Lecouteux, Didier Schwab, Hang Le, Laurent Besacier

Therefore, we implemented a Transformer-based encoder-decoder neural system which is able to use the output of a pre-trained language model as input embeddings, and we compared its performance under three configurations: 1) without any pre-trained language model (constrained), 2) using a language model trained on the monolingual parts of the allowed English-Czech data (constrained), and 3) using a language model trained on a large quantity of external monolingual data (unconstrained).

Language Modelling Machine Translation +1

Naver Labs Europe's Systems for the Document-Level Generation and Translation Task at WNGT 2019

no code implementations WS 2019 Fahimeh Saleh, Alexandre Bérard, Ioan Calapodescu, Laurent Besacier

To address these challenges, we propose to leverage data from both tasks and do transfer learning between MT, NLG, and MT with source-side metadata (MT+NLG).

Descriptive Machine Translation +4

ON-TRAC Consortium End-to-End Speech Translation Systems for the IWSLT 2019 Shared Task

no code implementations EMNLP (IWSLT) 2019 Ha Nguyen, Natalia Tomashenko, Marcely Zanon Boito, Antoine Caubriere, Fethi Bougares, Mickael Rouvier, Laurent Besacier, Yannick Esteve

This paper describes the ON-TRAC Consortium translation systems developed for the end-to-end model task of IWSLT Evaluation 2019 for the English-to-Portuguese language pair.

Translation

How Does Language Influence Documentation Workflow? Unsupervised Word Discovery Using Translations in Multiple Languages

1 code implementation 11 Oct 2019 Marcely Zanon Boito, Aline Villavicencio, Laurent Besacier

For language documentation initiatives, transcription is an expensive resource: one minute of audio is estimated to take, on average, an hour and a half of a linguist's work (Austin and Sallabank, 2013).

Improved Training Techniques for Online Neural Machine Translation

no code implementations 25 Sep 2019 Maha Elbayad, Laurent Besacier, Jakob Verbeek

We investigate the sensitivity of such models to the value of k that is used during training and when deploying the model, and the effect of updating the hidden states in transformer models as new source tokens are read.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Word Recognition, Competition, and Activation in a Model of Visually Grounded Speech

no code implementations CONLL 2019 William N. Havard, Jean-Pierre Chevrot, Laurent Besacier

In this paper, we study how word-like units are represented and activated in a recurrent neural model of visually grounded speech.

MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible

1 code implementation LREC 2020 Marcely Zanon Boito, William N. Havard, Mahault Garnerin, Éric Le Ferrand, Laurent Besacier

However, the fact that the source content (the Bible) is the same for all the languages has not been exploited to date. Therefore, this article proposes to add multilingual links between speech segments in different languages, and shares a large and clean dataset of 8,130 parallel spoken utterances across 8 languages (56 language pairs).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Empirical Evaluation of Sequence-to-Sequence Models for Word Discovery in Low-resource Settings

1 code implementation 29 Jun 2019 Marcely Zanon Boito, Aline Villavicencio, Laurent Besacier

This task consists in aligning word sequences in a source language with phoneme sequences in a target language, inferring from it word segmentation on the target side [5].

Machine Translation

The Zero Resource Speech Challenge 2019: TTS without T

no code implementations 25 Apr 2019 Ewan Dunbar, Robin Algayres, Julien Karadayi, Mathieu Bernard, Juan Benjumea, Xuan-Nga Cao, Lucie Miskic, Charlotte Dugrain, Lucas Ondel, Alan W. Black, Laurent Besacier, Sakriani Sakti, Emmanuel Dupoux

We present the Zero Resource Speech Challenge 2019, which proposes to build a speech synthesizer without any text or phonetic labels: hence, TTS without T (text-to-speech without text).

Models of Visually Grounded Speech Signal Pay Attention To Nouns: a Bilingual Experiment on English and Japanese

1 code implementation 8 Feb 2019 William N. Havard, Jean-Pierre Chevrot, Laurent Besacier

We investigate the behaviour of attention in neural models of visually grounded speech trained on two languages: English and Japanese.

Retrieval

Exploring Textual and Speech information in Dialogue Act Classification with Speaker Domain Adaptation

no code implementations ALTA 2018 Xuanli He, Quan Hung Tran, William Havard, Laurent Besacier, Ingrid Zukerman, Gholamreza Haffari

In spite of the recent success of Dialogue Act (DA) classification, the majority of prior work focuses on text-based classification with oracle transcriptions, i.e., human transcriptions, instead of Automatic Speech Recognition (ASR) transcriptions.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Analyzing Learned Representations of a Deep ASR Performance Prediction Model

no code implementations WS 2018 Zied Elloumi, Laurent Besacier, Olivier Galibert, Benjamin Lecouteux

In a previous paper, we presented an ASR performance prediction system using CNNs that encode both text (ASR transcript) and speech, in order to predict word error rate.

Multi-Task Learning TAG

Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction

3 code implementations CONLL 2018 Maha Elbayad, Laurent Besacier, Jakob Verbeek

Current state-of-the-art machine translation systems are based on encoder-decoder architectures that first encode the input sequence and then generate an output sequence based on the input encoding.

Machine Translation Translation

A small Griko-Italian speech translation corpus

no code implementations 27 Jul 2018 Marcely Zanon Boito, Antonios Anastasopoulos, Marika Lekakou, Aline Villavicencio, Laurent Besacier

This paper presents an extension to a very low-resource parallel corpus collected in an endangered language, Griko, making it useful for computational research.

Translation

Unsupervised Word Segmentation from Speech with Attention

no code implementations 18 Jun 2018 Pierre Godard, Marcely Zanon-Boito, Lucas Ondel, Alexandre Berard, François Yvon, Aline Villavicencio, Laurent Besacier

We present a first attempt to perform attentional word segmentation directly from the speech signal, with the final goal to automatically identify lexical units in a low-resource, unwritten language (UL).

Acoustic Unit Discovery Machine Translation +2

Token-level and sequence-level loss smoothing for RNN language models

1 code implementation ACL 2018 Maha Elbayad, Laurent Besacier, Jakob Verbeek

We extend this approach to token-level loss smoothing, and propose improvements to the sequence-level smoothing approach.

Image Captioning Machine Translation +1

End-to-End Automatic Speech Translation of Audiobooks

1 code implementation 12 Feb 2018 Alexandre Bérard, Laurent Besacier, Ali Can Kocabiyikoglu, Olivier Pietquin

We investigate end-to-end speech-to-text translation on a corpus of audiobooks specifically augmented for this task.

Speech-to-Text Translation Translation

Augmenting Librispeech with French Translations: A Multimodal Corpus for Direct Speech Translation Evaluation

1 code implementation LREC 2018 Ali Can Kocabiyikoglu, Laurent Besacier, Olivier Kraif

However, while large quantities of parallel texts (such as Europarl, OpenSubtitles) are available for training machine translation systems, there are no large (100h) and open source parallel corpora that include speech in a source language aligned to text in a target language.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Disentangling ASR and MT Errors in Speech Translation

no code implementations MTSummit 2017 Ngoc-Tien Le, Benjamin Lecouteux, Laurent Besacier

This enables, as a by-product, qualitative analysis of the SLT errors and their origin (are they due to the transcription step or to the translation step?).

Translation

Amharic-English Speech Translation in Tourism Domain

no code implementations WS 2017 Michael Melese, Laurent Besacier, Million Meshesha

This paper describes speech translation from Amharic-to-English, particularly Automatic Speech Recognition (ASR) with post-editing feature and Amharic-English Statistical Machine Translation (SMT).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

SPEECH-COCO: 600k Visually Grounded Spoken Captions Aligned to MSCOCO Data Set

1 code implementation 26 Jul 2017 William Havard, Laurent Besacier, Olivier Rosec

Disfluencies and speed perturbation are added to the signal in order to sound more natural.

LIG-CRIStAL System for the WMT17 Automatic Post-Editing Task

no code implementations 17 Jul 2017 Alexandre Berard, Olivier Pietquin, Laurent Besacier

This paper presents the LIG-CRIStAL submission to the shared Automatic Post-Editing task of WMT 2017.

Automatic Post-Editing Sentence

Traitement des Mots Hors Vocabulaire pour la Traduction Automatique de Document OCRisés en Arabe (This article presents a new system that automatically translates images of Arabic documents)

no code implementations JEPTALNRECITAL 2017 Kamel Bouzidi, Zied Elloumi, Laurent Besacier, Benjamin Lecouteux, Mohamed-Faouzi Benzeghiba

The experiments are carried out on a corpus of digitized Arabic newspapers and yield BLEU score improvements of 3.73 and 5.5 on the development and test sets, respectively.

Optical Character Recognition (OCR)

Machine Assisted Analysis of Vowel Length Contrasts in Wolof

no code implementations 1 Jun 2017 Elodie Gauthier, Laurent Besacier, Sylvie Voisin

Growing digital archives and improving algorithms for automatic analysis of text and speech create new research opportunities for fundamental research in phonetics.

Deep Investigation of Cross-Language Plagiarism Detection Methods

1 code implementation WS 2017 Jeremy Ferrero, Laurent Besacier, Didier Schwab, Frederic Agnes

This paper is a deep investigation of cross-language plagiarism detection methods on a new recently introduced open dataset, which contains parallel and comparable collections of documents with multiple characteristics (different genres, languages and sizes of texts).

Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation

1 code implementation 6 Dec 2016 Alexandre Berard, Olivier Pietquin, Christophe Servan, Laurent Besacier

This paper proposes a first attempt to build an end-to-end speech-to-text translation system, which does not use source language transcription during learning or decoding.

Speech-to-Text Translation Translation

Word2Vec vs DBnary: Augmenting METEOR using Vector Representations or Lexical Resources?

1 code implementation COLING 2016 Christophe Servan, Alexandre Berard, Zied Elloumi, Hervé Blanchon, Laurent Besacier

This paper presents an approach combining lexico-semantic resources and distributed representations of words applied to the evaluation in machine translation (MT).

Machine Translation Translation

Projection Interlingue d'Étiquettes pour l'Annotation Sémantique Non Supervisée (Cross-lingual Annotation Projection for Unsupervised Semantic Tagging)

no code implementations JEPTALNRECITAL 2016 Othman Zennaki, Nasredine Semmar, Laurent Besacier

In a previous contribution, we proposed a method for automatically building a morpho-syntactic analyzer via cross-lingual projection of linguistic annotations from parallel corpora (a method based on recurrent neural networks).

MultiVec: a Multilingual and Multilevel Representation Learning Toolkit for NLP

1 code implementation LREC 2016 Alexandre Bérard, Christophe Servan, Olivier Pietquin, Laurent Besacier

We present MultiVec, a new toolkit for computing continuous representations for text at different granularity levels (word-level or sequences of words).

Document Classification General Classification +2

Utilisation de mesures de confiance pour améliorer le décodage en traduction de parole

no code implementations JEPTALNRECITAL 2015 Laurent Besacier, Benjamin Lecouteux, Luong Ngoc Quang

Word-level confidence measures (Word Confidence Estimation, WCE) for machine translation (MT) or automatic speech recognition (ASR) assign a confidence score to each word in a transcription or translation hypothesis.

Utilisation des réseaux de neurones récurrents pour la projection interlingue d'étiquettes morpho-syntaxiques à partir d'un corpus parallèle

no code implementations JEPTALNRECITAL 2015 Othman Zennaki, Nasredine Semmar, Laurent Besacier

The construction of linguistic analysis tools for under-resourced languages is limited by, among other things, the lack of annotated corpora. In this article, we propose a method for automatically building analysis tools via cross-lingual projection of linguistic annotations using parallel corpora.

Collection of a Large Database of French-English SMT Output Corrections

no code implementations LREC 2012 Marion Potet, Emmanuelle Esperança-Rodier, Laurent Besacier, Hervé Blanchon

We also post-edited 1,500 gold-standard reference translations (from bilingual parallel corpora produced by professionals) and noticed that 72% of these translations needed to be corrected during post-editing.

Machine Translation Sentence +1
