Search Results for author: Emmanuel Dupoux

Found 87 papers, 35 papers with code

A comparison study on patient-psychologist voice diarization

no code implementations • SLPAT (ACL) 2022 • Rachid Riad, Hadrien Titeux, Laurie Lemoine, Justine Montillot, Agnes Sliwinski, Jennifer Bagnou, Xuan Cao, Anne-Catherine Bachoud-Levi, Emmanuel Dupoux

Conversations between a clinician and a patient, in natural conditions, are valuable sources of information for medical follow-up.

Paper
Add Code

Language Evolution with Deep Learning

no code implementations • 18 Mar 2024 • Mathieu Rita, Paul Michel, Rahma Chaabouni, Olivier Pietquin, Emmanuel Dupoux, Florian Strub

Computational modeling plays an essential role in the study of language emergence.

Paper
Add Code

SpiRit-LM: Interleaved Spoken and Written Language Model

no code implementations • 8 Feb 2024 • Tu Anh Nguyen, Benjamin Muller, Bokai Yu, Marta R. Costa-Jussa, Maha Elbayad, Sravya Popuri, Paul-Ambroise Duquenne, Robin Algayres, Ruslan Mavlyutov, Itai Gat, Gabriel Synnaeve, Juan Pino, Benoit Sagot, Emmanuel Dupoux

We introduce SPIRIT-LM, a foundation multimodal language model that freely mixes text and speech.

Language Modelling

Paper
Add Code

EmphAssess : a Prosodic Benchmark on Assessing Emphasis Transfer in Speech-to-Speech Models

1 code implementation • 21 Dec 2023 • Maureen de Seyssel, Antony D'Avirro, Adina Williams, Emmanuel Dupoux

We introduce EmphAssess, a prosodic benchmark designed to evaluate the capability of speech-to-speech models to encode and reproduce prosodic emphasis.

Resynthesis Speech-to-Speech Translation +1

Paper
Code

WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models

1 code implementation • 27 Nov 2023 • Youssef Benchekroun, Megi Dervishi, Mark Ibrahim, Jean-Baptiste Gaya, Xavier Martinet, Grégoire Mialon, Thomas Scialom, Emmanuel Dupoux, Dieuwke Hupkes, Pascal Vincent

We propose WorldSense, a benchmark designed to assess the extent to which LLMs are consistently able to sustain tacit world models, by testing how they draw simple inferences from descriptions of simple arrangements of entities.

In-Context Learning

Paper
Code

XLS-R fine-tuning on noisy word boundaries for unsupervised speech segmentation into words

no code implementations • 8 Oct 2023 • Robin Algayres, Pablo Diego-Simon, Benoit Sagot, Emmanuel Dupoux

Due to the absence of explicit word boundaries in the speech stream, the task of segmenting spoken sentences into word units without text supervision is particularly challenging.

Paper
Add Code

Generative Spoken Language Model based on continuous word-sized audio tokens

no code implementations • 8 Oct 2023 • Robin Algayres, Yossi Adi, Tu Anh Nguyen, Jade Copet, Gabriel Synnaeve, Benoit Sagot, Emmanuel Dupoux

In NLP, text language models based on words or subwords are known to outperform their character-based counterparts.

Language Modelling

Paper
Add Code

Low-Resource Self-Supervised Learning with SSL-Enhanced TTS

no code implementations • 29 Sep 2023 • Po-chun Hsu, Ali Elkahky, Wei-Ning Hsu, Yossi Adi, Tu Anh Nguyen, Jade Copet, Emmanuel Dupoux, Hung-Yi Lee, Abdelrahman Mohamed

Self-supervised learning (SSL) techniques have achieved remarkable results in various speech processing tasks.

Self-Supervised Learning

Paper
Add Code

EXPRESSO: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis

no code implementations • 10 Aug 2023 • Tu Anh Nguyen, Wei-Ning Hsu, Antony D'Avirro, Bowen Shi, Itai Gat, Maryam Fazel-Zarani, Tal Remez, Jade Copet, Gabriel Synnaeve, Michael Hassid, Felix Kreuk, Yossi Adi, Emmanuel Dupoux

Recent work has shown that it is possible to resynthesize high-quality speech based, not on text, but on low bitrate discrete units that have been learned in a self-supervised fashion and can therefore capture expressive aspects of speech that are hard to transcribe (prosody, voice styles, non-verbal vocalization).

Resynthesis Speech Synthesis

Paper
Add Code

BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models

1 code implementation • 2 Jun 2023 • Marvin Lavechin, Yaya Sy, Hadrien Titeux, María Andrea Cruz Blandón, Okko Räsänen, Hervé Bredin, Emmanuel Dupoux, Alejandrina Cristia

Self-supervised techniques for learning speech representations have been shown to develop linguistic competence from exposure to speech without the need for human labels.

Benchmarking Language Acquisition

Paper
Code

Textually Pretrained Speech Language Models

1 code implementation • NeurIPS 2023 • Michael Hassid, Tal Remez, Tu Anh Nguyen, Itai Gat, Alexis Conneau, Felix Kreuk, Jade Copet, Alexandre Defossez, Gabriel Synnaeve, Emmanuel Dupoux, Roy Schwartz, Yossi Adi

In this work, we propose TWIST, a method for training SpeechLMs using a warm-start from a pretrained textual language models.

Paper
Code

ProsAudit, a prosodic benchmark for self-supervised speech models

no code implementations • 23 Feb 2023 • Maureen de Seyssel, Marvin Lavechin, Hadrien Titeux, Arthur Thomas, Gwendal Virlet, Andrea Santos Revilla, Guillaume Wisniewski, Bogdan Ludusan, Emmanuel Dupoux

In the lexical task, the model needs to correctly distinguish between pauses inserted between words and within words.

Self-Supervised Learning

Paper
Add Code

Introducing topography in convolutional neural networks

1 code implementation • 28 Oct 2022 • Maxime Poli, Emmanuel Dupoux, Rachid Riad

Thus, in this work, inspired by the neuroscience literature, we proposed a new topographic inductive bias in Convolutional Neural Networks (CNNs).

Inductive Bias

Paper
Code

Self-supervised language learning from raw audio: Lessons from the Zero Resource Speech Challenge

no code implementations • 27 Oct 2022 • Ewan Dunbar, Nicolas Hamilakis, Emmanuel Dupoux

Recent progress in self-supervised or unsupervised machine learning has opened the possibility of building a full speech processing system from raw audio without using any textual representations or expert labels such as phonemes, dictionaries or parse trees.

Acoustic Unit Discovery Language Modelling +1

Paper
Add Code

Evaluating context-invariance in unsupervised speech representations

1 code implementation • 27 Oct 2022 • Mark Hallap, Emmanuel Dupoux, Ewan Dunbar

Unsupervised speech representations have taken off, with benchmarks (SUPERB, ZeroSpeech) demonstrating major progress on semi-supervised speech recognition, speech synthesis, and speech-only language modelling.

Language Modelling speech-recognition +2

Paper
Code

Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation

1 code implementation • 24 Oct 2022 • Marvin Lavechin, Marianne Métais, Hadrien Titeux, Alodie Boissonnet, Jade Copet, Morgane Rivière, Elika Bergelson, Alejandrina Cristia, Emmanuel Dupoux, Hervé Bredin

Most automatic speech processing systems register degraded performance when applied to noisy or reverberant speech.

Action Detection Activity Detection +3

108

Paper
Code

Are word boundaries useful for unsupervised language learning?

no code implementations • 6 Oct 2022 • Tu Anh Nguyen, Maureen de Seyssel, Robin Algayres, Patricia Roze, Ewan Dunbar, Emmanuel Dupoux

However, word boundary information may be absent or unreliable in the case of speech input (word boundaries are not marked explicitly in the speech stream).

Paper
Add Code

Emergent Communication: Generalization and Overfitting in Lewis Games

1 code implementation • 30 Sep 2022 • Mathieu Rita, Corentin Tallec, Paul Michel, Jean-bastien Grill, Olivier Pietquin, Emmanuel Dupoux, Florian Strub

Lewis signaling games are a class of simple communication games for simulating the emergence of language.

Paper
Code

Augmentation Invariant Discrete Representation for Generative Spoken Language Modeling

no code implementations • 30 Sep 2022 • Itai Gat, Felix Kreuk, Tu Anh Nguyen, Ann Lee, Jade Copet, Gabriel Synnaeve, Emmanuel Dupoux, Yossi Adi

This work focuses on improving the robustness of discrete input representations for generative spoken language modeling.

Language Modelling Speech-to-Speech Translation

Paper
Add Code

STOP: A dataset for Spoken Task Oriented Semantic Parsing

1 code implementation • 29 Jun 2022 • Paden Tomasello, Akshat Shrivastava, Daniel Lazar, Po-chun Hsu, Duc Le, Adithya Sagar, Ali Elkahky, Jade Copet, Wei-Ning Hsu, Yossi Adi, Robin Algayres, Tu Ahn Nguyen, Emmanuel Dupoux, Luke Zettlemoyer, Abdelrahman Mohamed

Furthermore, in addition to the human-recorded audio, we are releasing a TTS-generated version to benchmark the performance for low-resource domain adaptation of end-to-end SLU systems.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

29,237

Paper
Code

Is the Language Familiarity Effect gradual? A computational modelling approach

no code implementations • 27 Jun 2022 • Maureen de Seyssel, Guillaume Wisniewski, Emmanuel Dupoux

According to the Language Familiarity Effect (LFE), people are better at discriminating between speakers of their native language.

Paper
Add Code

DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon

no code implementations • 22 Jun 2022 • Robin Algayres, Tristan Ricoul, Julien Karadayi, Hugo Laurençon, Salah Zaiem, Abdelrahman Mohamed, Benoît Sagot, Emmanuel Dupoux

Finding word boundaries in continuous speech is challenging as there is little or no equivalent of a 'space' delimiter between words.

Language Modelling Segmentation +1

Paper
Add Code

Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning

no code implementations • 11 Apr 2022 • Robin Algayres, Adel Nabli, Benoit Sagot, Emmanuel Dupoux

We introduce a simple neural encoder architecture that can be trained using an unsupervised contrastive learning objective which gets its positive samples from data-augmented k-Nearest Neighbors search.

Contrastive Learning

Paper
Add Code

Probing phoneme, language and speaker information in unsupervised speech representations

no code implementations • 30 Mar 2022 • Maureen de Seyssel, Marvin Lavechin, Yossi Adi, Emmanuel Dupoux, Guillaume Wisniewski

Language information, however, is very salient in the bilingual model only, suggesting CPC models learn to discriminate languages when trained on multiple languages.

Language Modelling

Paper
Add Code

Generative Spoken Dialogue Language Modeling

no code implementations • 30 Mar 2022 • Tu Anh Nguyen, Eugene Kharitonov, Jade Copet, Yossi Adi, Wei-Ning Hsu, Ali Elkahky, Paden Tomasello, Robin Algayres, Benoit Sagot, Abdelrahman Mohamed, Emmanuel Dupoux

We introduce dGSLM, the first "textless" model able to generate audio samples of naturalistic spoken dialogues.

Language Modelling

Paper
Add Code

Are discrete units necessary for Spoken Language Modeling?

no code implementations • 11 Mar 2022 • Tu Anh Nguyen, Benoit Sagot, Emmanuel Dupoux

The approach relies first on transforming the audio into a sequence of discrete units (or pseudo-text) and then training a language model directly on such pseudo-text.

Language Modelling

Paper
Add Code

textless-lib: a Library for Textless Spoken Language Processing

1 code implementation • NAACL (ACL) 2022 • Eugene Kharitonov, Jade Copet, Kushal Lakhotia, Tu Anh Nguyen, Paden Tomasello, Ann Lee, Ali Elkahky, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux, Yossi Adi

Textless spoken language processing research aims to extend the applicability of standard NLP toolset onto spoken language and languages with few or no textual resources.

Resynthesis

497

Paper
Code

Towards Interactive Language Modeling

no code implementations • 14 Dec 2021 • Maartje ter Hoeve, Evgeny Kharitonov, Dieuwke Hupkes, Emmanuel Dupoux

As a first contribution we present a road map in which we detail the steps that need to be taken towards interactive language modeling.

Language Acquisition Language Modelling

Paper
Add Code

Shennong: a Python toolbox for audio speech features extraction

1 code implementation • 10 Dec 2021 • Mathieu Bernard, Maxime Poli, Julien Karadayi, Emmanuel Dupoux

After describing the Shennong software architecture, its core components and implemented algorithms, this paper illustrates its use on three applications: a comparison of speech features performances on a phones discrimination task, an analysis of a Vocal Tract Length Normalization model as a function of the speech duration used for training and a comparison of pitch estimation algorithms under various noise conditions.

153

Paper
Code

Textless Speech Emotion Conversion using Discrete and Decomposed Representations

no code implementations • 14 Nov 2021 • Felix Kreuk, Adam Polyak, Jade Copet, Eugene Kharitonov, Tu-Anh Nguyen, Morgane Rivière, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux, Yossi Adi

We use a decomposition of the speech signal into discrete learned representations, consisting of phonetic-content units, prosodic features, speaker, and emotion.

Paper
Add Code

Textless Speech Emotion Conversion using Decomposed & Discrete Representations

no code implementations • arXiv 2021 • Felix Kreuk, Adam Polyak, Jade Copet, Eugene Kharitonov, Tu-Anh Nguyen, Morgane Rivière, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux, Yossi Adi

We decompose speech into discrete and disentangled learned representations, consisting of content units, F0, speaker, and emotion.

Paper
Add Code

Text-Free Prosody-Aware Generative Spoken Language Modeling

1 code implementation • ACL 2022 • Eugene Kharitonov, Ann Lee, Adam Polyak, Yossi Adi, Jade Copet, Kushal Lakhotia, Tu-Anh Nguyen, Morgane Rivière, Abdelrahman Mohamed, Emmanuel Dupoux, Wei-Ning Hsu

Generative Spoken Language Modeling (GSLM) \cite{Lakhotia2021} is the only prior work addressing the generative aspects of speech pre-training, which replaces text with discovered phone-like units for language modeling and shows the ability to generate meaningful novel sentences.

Language Modelling

29,233

Paper
Code

ZR-2021VG: Zero-Resource Speech Challenge, Visually-Grounded Language Modelling track, 2021 edition

1 code implementation • 14 Jul 2021 • Afra Alishahia, Grzegorz Chrupała, Alejandrina Cristia, Emmanuel Dupoux, Bertrand Higy, Marvin Lavechin, Okko Räsänen, Chen Yu

We present the visually-grounded language modelling track that was introduced in the Zero-Resource Speech challenge, 2021 edition, 2nd round.

Language Modelling

Paper
Code

The Zero Resource Speech Challenge 2021: Spoken language modelling

no code implementations • 29 Apr 2021 • Ewan Dunbar, Mathieu Bernard, Nicolas Hamilakis, Tu Anh Nguyen, Maureen de Seyssel, Patricia Rozé, Morgane Rivière, Eugene Kharitonov, Emmanuel Dupoux

We present the Zero Resource Speech Challenge 2021, which asks participants to learn a language model directly from audio, without any text or labels.

Language Modelling

Paper
Add Code

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations

2 code implementations • 1 Apr 2021 • Adam Polyak, Yossi Adi, Jade Copet, Eugene Kharitonov, Kushal Lakhotia, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux

We propose using self-supervised discrete representations for the task of speech resynthesis.

Disentanglement Resynthesis +2

353

Paper
Code

Learning spectro-temporal representations of complex sounds with parameterized neural networks

1 code implementation • 12 Mar 2021 • Rachid Riad, Julien Karadayi, Anne-Catherine Bachoud-Lévi, Emmanuel Dupoux

We found out that models based on Learnable STRFs are on par for all tasks with different toplines, and obtain the best performance for Speech Activity Detection.

Action Detection Activity Detection +2

Paper
Code

Generative Spoken Language Modeling from Raw Audio

2 code implementations • 1 Feb 2021 • Kushal Lakhotia, Evgeny Kharitonov, Wei-Ning Hsu, Yossi Adi, Adam Polyak, Benjamin Bolte, Tu-Anh Nguyen, Jade Copet, Alexei Baevski, Adelrahman Mohamed, Emmanuel Dupoux

We introduce Generative Spoken Language Modeling, the task of learning the acoustic and linguistic characteristics of a language from raw audio (no text, no labels), and a set of metrics to automatically evaluate the learned representations at acoustic and linguistic levels for both encoding and generation.

Ranked #1 on Resynthesis on LibriSpeech

Language Modelling Resynthesis

29,233

Paper
Code

VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

1 code implementation • ACL 2021 • Changhan Wang, Morgane Rivière, Ann Lee, Anne Wu, Chaitanya Talnikar, Daniel Haziza, Mary Williamson, Juan Pino, Emmanuel Dupoux

We introduce VoxPopuli, a large-scale multilingual corpus providing 100K hours of unlabelled speech data in 23 languages.

Ranked #3 on Speech Recognition on Common Voice French (using extra training data)

Representation Learning speech-recognition +1

491

Paper
Code

The Zero Resource Speech Benchmark 2021: Metrics and baselines for unsupervised spoken language modeling

2 code implementations • 23 Nov 2020 • Tu Anh Nguyen, Maureen de Seyssel, Patricia Rozé, Morgane Rivière, Evgeny Kharitonov, Alexei Baevski, Ewan Dunbar, Emmanuel Dupoux

We introduce a new unsupervised task, spoken language modeling: the learning of linguistic representations from raw audio signals without any labels, along with the Zero Resource Speech Benchmark 2021: a suite of 4 black-box, zero-shot metrics probing for the quality of the learned models at 4 linguistic levels: phonetics, lexicon, syntax and semantics.

Clustering Language Modelling +1

Paper
Code

``LazImpa'': Lazy and Impatient neural agents learn to communicate efficiently

no code implementations • CONLL 2020 • Mathieu Rita, Rahma Chaabouni, Emmanuel Dupoux

Previous work has shown that artificial neural agents naturally develop surprisingly non-efficient codes.

Paper
Add Code

Comparison of Speaker Role Recognition and Speaker Enrollment Protocol for conversational Clinical Interviews

no code implementations • 30 Oct 2020 • Rachid Riad, Hadrien Titeux, Laurie Lemoine, Justine Montillot, Agnes Sliwinski, Jennifer Hamet Bagnou, Xuan Nga Cao, Anne-Catherine Bachoud-Lévi, Emmanuel Dupoux

Here, we proposed a split of the data that allows conducting a comparative evaluation of speaker role recognition and speaker enrollment methods to solve this task.

Paper
Add Code

The Zero Resource Speech Challenge 2020: Discovering discrete subword and word units

no code implementations • 12 Oct 2020 • Ewan Dunbar, Julien Karadayi, Mathieu Bernard, Xuan-Nga Cao, Robin Algayres, Lucas Ondel, Laurent Besacier, Sakriani Sakti, Emmanuel Dupoux

We present the Zero Resource Speech Challenge 2020, which aims at learning speech representations from raw audio signals without any labels.

Speech Synthesis

Paper
Add Code

Analogies minus analogy test: measuring regularities in word embeddings

1 code implementation • CONLL 2020 • Louis Fournier, Emmanuel Dupoux, Ewan Dunbar

Vector space models of words have long been claimed to capture linguistic regularities as simple vector translations, but problems have been raised with this claim.

Word Embeddings

Paper
Code

"LazImpa": Lazy and Impatient neural agents learn to communicate efficiently

1 code implementation • 5 Oct 2020 • Mathieu Rita, Rahma Chaabouni, Emmanuel Dupoux

Previous work has shown that artificial neural agents naturally develop surprisingly non-efficient codes.

Paper
Code

Evaluating the reliability of acoustic speech embeddings

no code implementations • 27 Jul 2020 • Robin Algayres, Mohamed Salah Zaiem, Benoit Sagot, Emmanuel Dupoux

However, there is currently no clear methodology to compare or optimise the quality of these embeddings in a task-neutral way.

Information Retrieval Retrieval

Paper
Add Code

Data Augmenting Contrastive Learning of Speech Representations in the Time Domain

1 code implementation • 2 Jul 2020 • Eugene Kharitonov, Morgane Rivière, Gabriel Synnaeve, Lior Wolf, Pierre-Emmanuel Mazaré, Matthijs Douze, Emmanuel Dupoux

Contrastive Predictive Coding (CPC), based on predicting future segments of speech based on past segments is emerging as a powerful algorithm for representation learning of speech signal.

Contrastive Learning Data Augmentation +1

626

Paper
Code

Vocal markers from sustained phonation in Huntington's Disease

1 code implementation • 9 Jun 2020 • Rachid Riad, Hadrien Titeux, Laurie Lemoine, Justine Montillot, Jennifer Hamet Bagnou, Xuan Nga Cao, Emmanuel Dupoux, Anne-Catherine Bachoud-Lévi

According to our regression results, Phonatory features are suitable for the predictions of clinical performance in Huntington's Disease.

regression

Paper
Code

An open-source voice type classifier for child-centered daylong recordings

1 code implementation • 26 May 2020 • Marvin Lavechin, Ruben Bousbib, Hervé Bredin, Emmanuel Dupoux, Alejandrina Cristia

Spontaneous conversations in real-world settings such as those found in child-centered recordings have been shown to be amongst the most challenging audio files to process.

Language Acquisition Vocal Bursts Type Prediction

Paper
Code

Occlusion resistant learning of intuitive physics from videos

no code implementations • 30 Apr 2020 • Ronan Riochet, Josef Sivic, Ivan Laptev, Emmanuel Dupoux

In this work we propose a probabilistic formulation of learning intuitive physics in 3D scenes with significant inter-object occlusions.

Object

Paper
Add Code

Compositionality and Generalization in Emergent Languages

1 code implementation • ACL 2020 • Rahma Chaabouni, Eugene Kharitonov, Diane Bouchacourt, Emmanuel Dupoux, Marco Baroni

Third, while compositionality is not necessary for generalization, it provides an advantage in terms of language transmission: The more compositional a language is, the more easily it will be picked up by new learners, even when the latter differ in architecture from the original agents.

Disentanglement

280

Paper
Code

Seshat: A tool for managing and verifying annotation campaigns of audio data

1 code implementation • LREC 2020 • Hadrien Titeux, Rachid Riad, Xuan-Nga Cao, Nicolas Hamilakis, Kris Madden, Alejandrina Cristia, Anne-Catherine Bachoud-Lévi, Emmanuel Dupoux

We introduce Seshat, a new, simple and open-source software to efficiently manage annotations of speech corpora.

Paper
Code

Identification of primary and collateral tracks in stuttered speech

no code implementations • LREC 2020 • Rachid Riad, Anne-Catherine Bachoud-Lévi, Frank Rudzicz, Emmanuel Dupoux

Here, we introduce a new evaluation framework for disfluency detection inspired by the clinical and NLP perspective together with the theory of performance from \cite{clark1996using} which distinguishes between primary and collateral tracks.

Paper
Add Code

Unsupervised pretraining transfers well across languages

3 code implementations • 7 Feb 2020 • Morgane Rivière, Armand Joulin, Pierre-Emmanuel Mazaré, Emmanuel Dupoux

Cross-lingual and multi-lingual training of Automatic Speech Recognition (ASR) has been extensively investigated in the supervised setting.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

341

Paper
Code

Libri-Light: A Benchmark for ASR with Limited or No Supervision

2 code implementations • 17 Dec 2019 • Jacob Kahn, Morgane Rivière, Weiyi Zheng, Evgeny Kharitonov, Qiantong Xu, Pierre-Emmanuel Mazaré, Julien Karadayi, Vitaliy Liptchinsky, Ronan Collobert, Christian Fuegen, Tatiana Likhomanenko, Gabriel Synnaeve, Armand Joulin, Abdel-rahman Mohamed, Emmanuel Dupoux

Additionally, we provide baseline systems and evaluation metrics working under three settings: (1) the zero resource/unsupervised setting (ABX), (2) the semi-supervised setting (PER, CER) and (3) the distant supervision setting (WER).

Ranked #1 on Speech Recognition on Libri-Light test-other (ABX-within metric)

speech-recognition Speech Recognition

446

Paper
Code

Speaker detection in the wild: Lessons learned from JSALT 2019

1 code implementation • 2 Dec 2019 • Paola Garcia, Jesus Villalba, Herve Bredin, Jun Du, Diego Castan, Alejandrina Cristia, Latane Bullock, Ling Guo, Koji Okabe, Phani Sankar Nidadavolu, Saurabh Kataria, Sizhu Chen, Leo Galmant, Marvin Lavechin, Lei Sun, Marie-Philippe Gill, Bar Ben-Yair, Sajjad Abdoli, Xin Wang, Wassim Bouaziz, Hadrien Titeux, Emmanuel Dupoux, Kong Aik Lee, Najim Dehak

This paper presents the problems and solutions addressed at the JSALT workshop when using a single microphone for speaker detection in adverse scenarios.

Audio and Speech Processing Sound

Paper
Code

Neural language modeling of free word order argument structure

no code implementations • 30 Nov 2019 • Charlotte Rochereau, Benoît Sagot, Emmanuel Dupoux

Neural language models trained with a predictive or masked objective have proven successful at capturing short and long distance syntactic dependencies.

Language Modelling

Paper
Add Code

SyntaxFest 2019 Invited talk - Inductive biases and language emergence in communicative agents

no code implementations • WS 2019 • Emmanuel Dupoux

Paper
Add Code

Word-order biases in deep-agent emergent communication

1 code implementation • ACL 2019 • Rahma Chaabouni, Eugene Kharitonov, Alessandro Lazaric, Emmanuel Dupoux, Marco Baroni

We train models to communicate about paths in a simple gridworld, using miniature languages that reflect or violate various natural language trends, such as the tendency to avoid redundancy or to minimize long-distance dependencies.

Paper
Code

Anti-efficient encoding in emergent communication

1 code implementation • NeurIPS 2019 • Rahma Chaabouni, Eugene Kharitonov, Emmanuel Dupoux, Marco Baroni

Despite renewed interest in emergent language simulations with neural networks, little is known about the basic properties of the induced code, and how they compare to human language.

280

Paper
Code

The Zero Resource Speech Challenge 2019: TTS without T

no code implementations • 25 Apr 2019 • Ewan Dunbar, Robin Algayres, Julien Karadayi, Mathieu Bernard, Juan Benjumea, Xuan-Nga Cao, Lucie Miskic, Charlotte Dugrain, Lucas Ondel, Alan W. black, Laurent Besacier, Sakriani Sakti, Emmanuel Dupoux

We present the Zero Resource Speech Challenge 2019, which proposes to build a speech synthesizer without any text or phonetic labels: hence, TTS without T (text-to-speech without text).

Paper
Add Code

End-to-End Speech Recognition From the Raw Waveform

1 code implementation • 19 Jun 2018 • Neil Zeghidour, Nicolas Usunier, Gabriel Synnaeve, Ronan Collobert, Emmanuel Dupoux

In this paper, we study end-to-end systems trained directly from the raw waveform, building on two alternatives for trainable replacements of mel-filterbanks that use a convolutional architecture.

speech-recognition Speech Recognition

Paper
Code

BabyCloud, a Technological Platform for Parents and Researchers

no code implementations • LREC 2018 • Xu{\^a}n-Nga Cao, Cyrille Dakhlia, Patricia Del Carmen, Mohamed-Amine Jaouani, Malik Ould-Arbi, Emmanuel Dupoux

Language Acquisition

Paper
Add Code

Sampling strategies in Siamese Networks for unsupervised speech representation learning

2 code implementations • 30 Apr 2018 • Rachid Riad, Corentin Dancette, Julien Karadayi, Neil Zeghidour, Thomas Schatz, Emmanuel Dupoux

We apply these results to pairs of words discovered using an unsupervised algorithm and show an improvement on state-of-the-art in unsupervised representation learning using siamese networks.

Representation Learning

Paper
Code

IntPhys: A Framework and Benchmark for Visual Intuitive Physics Reasoning

1 code implementation • 20 Mar 2018 • Ronan Riochet, Mario Ynocente Castro, Mathieu Bernard, Adam Lerer, Rob Fergus, Véronique Izard, Emmanuel Dupoux

In order to reach human performance on complexvisual tasks, artificial systems need to incorporate a sig-nificant amount of understanding of the world in termsof macroscopic objects, movements, forces, etc.

Paper
Code

Bayesian Models for Unit Discovery on a Very Low Resource Language

no code implementations • 16 Feb 2018 • Lucas Ondel, Pierre Godard, Laurent Besacier, Elin Larsen, Mark Hasegawa-Johnson, Odette Scharenborg, Emmanuel Dupoux, Lukas Burget, François Yvon, Sanjeev Khudanpur

Developing speech technologies for low-resource languages has become a very active research field over the last decade.

Acoustic Unit Discovery Segmentation

Paper
Add Code

Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop

no code implementations • 14 Feb 2018 • Odette Scharenborg, Laurent Besacier, Alan Black, Mark Hasegawa-Johnson, Florian Metze, Graham Neubig, Sebastian Stueker, Pierre Godard, Markus Mueller, Lucas Ondel, Shruti Palaskar, Philip Arthur, Francesco Ciannella, Mingxing Du, Elin Larsen, Danny Merkx, Rachid Riad, Liming Wang, Emmanuel Dupoux

We summarize the accomplishments of a multi-disciplinary workshop exploring the computational and scientific issues surrounding the discovery of linguistic units (subwords and words) in a language without orthography.

Paper
Add Code

Are words easier to learn from infant- than adult-directed speech? A quantitative corpus-based investigation

no code implementations • 23 Dec 2017 • Adriana Guevara-Rukoz, Alejandrina Cristia, Bogdan Ludusan, Roland Thiollière, Andrew Martin, Reiko Mazuka, Emmanuel Dupoux

At the acoustic level we show that, as has been documented before for phonemes, the realizations of words are more variable and less discriminable in IDS than in ADS.

Paper
Add Code

The Zero Resource Speech Challenge 2017

no code implementations • 12 Dec 2017 • Ewan Dunbar, Xuan Nga Cao, Juan Benjumea, Julien Karadayi, Mathieu Bernard, Laurent Besacier, Xavier Anguera, Emmanuel Dupoux

We describe a new challenge aimed at discovering subword and word units from raw speech.

Paper
Add Code

Learning Filterbanks from Raw Speech for Phone Recognition

2 code implementations • 3 Nov 2017 • Neil Zeghidour, Nicolas Usunier, Iasonas Kokkinos, Thomas Schatz, Gabriel Synnaeve, Emmanuel Dupoux

We train a bank of complex filters that operates on the raw waveform and is fed into a convolutional neural network for end-to-end phone recognition.

473

Paper
Code

The Role of Prosody and Speech Register in Word Segmentation: A Computational Modelling Perspective

no code implementations • ACL 2017 • Bogdan Ludusan, Reiko Mazuka, Mathieu Bernard, Alej Cristia, rina, Emmanuel Dupoux

This study explores the role of speech register and prosody for the task of word segmentation.

Language Acquisition Segmentation

Paper
Add Code

Learning weakly supervised multimodal phoneme embeddings

no code implementations • 23 Apr 2017 • Rahma Chaabouni, Ewan Dunbar, Neil Zeghidour, Emmanuel Dupoux

Recent works have explored deep architectures for learning multimodal speech representation (e. g. audio and images, articulation and audio) in a supervised way.

Multi-Task Learning

Paper
Add Code

Comparing Character-level Neural Language Models Using a Lexical Decision Task

no code implementations • EACL 2017 • Ga{\"e}l Le Godais, Tal Linzen, Emmanuel Dupoux

What is the information captured by neural network models of language?

Paper
Add Code

Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies

5 code implementations • TACL 2016 • Tal Linzen, Emmanuel Dupoux, Yoav Goldberg

The success of long short-term memory (LSTM) neural networks in language processing is typically attributed to their ability to capture long-distance statistical regularities.

Language Modelling

150

Paper
Code

Quantificational features in distributional word representations

no code implementations • SEMEVAL 2016 • Tal Linzen, Emmanuel Dupoux, Benjamin Spector

Paper
Add Code

Blind phoneme segmentation with temporal prediction errors

no code implementations • ACL 2017 • Paul Michel, Okko Räsänen, Roland Thiollière, Emmanuel Dupoux

Phonemic segmentation of speech is a critical step of speech recognition systems.

speech-recognition Speech Recognition

Paper
Add Code

Cognitive Science in the era of Artificial Intelligence: A roadmap for reverse-engineering the infant language-learner

no code implementations • 29 Jul 2016 • Emmanuel Dupoux

The project of 'reverse engineering' language development, i. e., of building an effective system that mimics infant's achievements appears therefore to be within reach.

BIG-bench Machine Learning Privacy Preserving

Paper
Add Code

Motif discovery in infant- and adult-directed speech

no code implementations • WS 2015 • Bogdan Ludusan, Am Seidl, a, Emmanuel Dupoux, Alex Cristia

Language Acquisition

Paper
Add Code

Prosodic boundary information helps unsupervised word segmentation

no code implementations • HLT 2015 • Gabriel Synnaeve, Emmanuel Dupoux, Bogdan Ludusan

Boundary Detection Language Acquisition +2

Paper
Add Code

Sign constraints on feature weights improve a joint model of word segmentation and phonology

no code implementations • HLT 2015 • Mark Johnson, Emmanuel Dupoux, Joe Pater, Robert Staubs

Segmentation

Paper
Add Code

Weakly Supervised Multi-Embeddings Learning of Acoustic Models

no code implementations • 20 Dec 2014 • Gabriel Synnaeve, Emmanuel Dupoux

We trained a Siamese network with multi-task same/different information on a speech dataset, and found that it was possible to share a network for both tasks without a loss in performance.

Paper
Add Code

Unsupervised Word Segmentation in Context

no code implementations • COLING 2014 • Gabriel Synnaeve, Isabelle Dautriche, Benjamin B{\"o}rschinger, Mark Johnson, Emmanuel Dupoux

Language Acquisition Segmentation

Paper
Add Code

A Rudimentary Lexicon and Semantics Help Bootstrap Phoneme Acquisition

no code implementations • WS 2014 • Abdellah Fourtassi, Emmanuel Dupoux

Language Acquisition

Paper
Add Code

Exploring the Relative Role of Bottom-up and Top-down Information in Phoneme Learning

no code implementations • ACL 2014 • Abdellah Fourtassi, Thomas Schatz, Balakrishnan Varadarajan, Emmanuel Dupoux

Language Acquisition

Paper
Add Code

Modelling function words improves unsupervised word segmentation

no code implementations • ACL 2014 • Mark Johnson, Anne Christophe, Emmanuel Dupoux, Katherine Demuth

Language Acquisition

Paper
Add Code

Bridging the gap between speech technology and natural language processing: an evaluation toolbox for term discovery systems

no code implementations • LREC 2014 • Bogdan Ludusan, Maarten Versteegh, Aren Jansen, Guillaume Gravier, Xuan-Nga Cao, Mark Johnson, Emmanuel Dupoux

The unsupervised discovery of linguistic terms from either continuous phoneme transcriptions or from raw speech has seen an increasing interest in the past years both from a theoretical and a practical standpoint.

Language Acquisition

Paper
Add Code

A corpus-based evaluation method for Distributional Semantic Models

no code implementations • ACL 2013 • Abdellah Fourtassi, Emmanuel Dupoux

Semantic Textual Similarity

Paper
Add Code

Why is English so easy to segment?

no code implementations • WS 2013 • Abdellah Fourtassi, Benjamin B{\"o}rschinger, Mark Johnson, Emmanuel Dupoux

Boundary Detection Language Acquisition

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.