no code implementations • 23 Apr 2017 • Rahma Chaabouni, Ewan Dunbar, Neil Zeghidour, Emmanuel Dupoux
Recent works have explored deep architectures for learning multimodal speech representations (e.g. audio and images, articulation and audio) in a supervised way.
no code implementations • 12 Dec 2017 • Ewan Dunbar, Xuan Nga Cao, Juan Benjumea, Julien Karadayi, Mathieu Bernard, Laurent Besacier, Xavier Anguera, Emmanuel Dupoux
We describe a new challenge aimed at discovering subword and word units from raw speech.
no code implementations • 20 Dec 2018 • R. Thomas McCoy, Tal Linzen, Ewan Dunbar, Paul Smolensky
Recurrent neural networks (RNNs) can learn continuous vector representations of symbolic structures such as sequences and sentences; these representations often exhibit linear regularities (analogies).
no code implementations • 25 Apr 2019 • Ewan Dunbar, Robin Algayres, Julien Karadayi, Mathieu Bernard, Juan Benjumea, Xuan-Nga Cao, Lucie Miskic, Charlotte Dugrain, Lucas Ondel, Alan W. Black, Laurent Besacier, Sakriani Sakti, Emmanuel Dupoux
We present the Zero Resource Speech Challenge 2019, which proposes to build a speech synthesizer without any text or phonetic labels: hence, TTS without T (text-to-speech without text).
1 code implementation • ICLR 2019 • R. Thomas McCoy, Tal Linzen, Ewan Dunbar, Paul Smolensky
Recurrent neural networks (RNNs) can learn continuous vector representations of symbolic structures such as sequences and sentences; these representations often exhibit linear regularities (analogies).
no code implementations • 7 May 2020 • Juliette Millet, Ewan Dunbar
We show that DeepSpeech, a standard English speech recognizer, is more specialized in English phoneme discrimination than English listeners are, and correlates poorly with their behaviour, even though it yields a low error rate on the decision task given to humans.
Automatic Speech Recognition (ASR) +1
1 code implementation • CONLL 2020 • Louis Fournier, Emmanuel Dupoux, Ewan Dunbar
Vector space models of words have long been claimed to capture linguistic regularities as simple vector translations, but problems have been raised with this claim.
no code implementations • 12 Oct 2020 • Ewan Dunbar, Julien Karadayi, Mathieu Bernard, Xuan-Nga Cao, Robin Algayres, Lucas Ondel, Laurent Besacier, Sakriani Sakti, Emmanuel Dupoux
We present the Zero Resource Speech Challenge 2020, which aims at learning speech representations from raw audio signals without any labels.
1 code implementation • 12 Oct 2020 • Juliette Millet, Ewan Dunbar
In this paper, we present a data set and methods to compare speech processing models and human behaviour on a phone discrimination task.
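A phone discrimination task of this kind is often scored with an ABX-style decision: given representations of three stimuli A, B, and X, where A and X belong to the same phone category, the system is correct if X is closer to A than to B. The following is a minimal, hypothetical sketch of that decision rule (the function names and toy vectors are illustrative assumptions, not the paper's actual code or data):

```python
import math

def euclidean(u, v):
    # Plain Euclidean distance between two equal-length vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def abx_correct(a, b, x, dist=euclidean):
    # True iff X (same phone category as A) is closer to A than to B.
    return dist(a, x) < dist(b, x)

# Toy 2-d "representations" of two phone categories (made up for illustration):
a = [1.0, 0.1]   # token of category 1
b = [0.1, 1.0]   # token of category 2
x = [0.9, 0.2]   # another token of category 1
print(abx_correct(a, b, x))  # True: this toy space separates the categories
```

Averaging this binary decision over many (A, B, X) triplets gives a discriminability score that can be compared between a model's representation space and human listeners' responses.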
2 code implementations • 23 Nov 2020 • Tu Anh Nguyen, Maureen de Seyssel, Patricia Rozé, Morgane Rivière, Evgeny Kharitonov, Alexei Baevski, Ewan Dunbar, Emmanuel Dupoux
We introduce a new unsupervised task, spoken language modeling: the learning of linguistic representations from raw audio signals without any labels, along with the Zero Resource Speech Benchmark 2021: a suite of 4 black-box, zero-shot metrics probing for the quality of the learned models at 4 linguistic levels: phonetics, lexicon, syntax and semantics.
1 code implementation • EACL 2021 • Louis Fournier, Ewan Dunbar
Many types of distributional word embeddings (weakly) encode linguistic regularities as directions (the difference between "jump" and "jumped" will be in a similar direction to that of "walk" and "walked," and so on).
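The claim that a regularity is encoded "as a direction" can be made concrete by comparing offset vectors: if past tense is a direction, then vec("jumped") − vec("jump") should have high cosine similarity with vec("walked") − vec("walk"). A minimal sketch with made-up 3-d vectors (the embeddings here are hypothetical, chosen only to illustrate the geometry):

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)

# Toy embeddings (not from any real model): past tense is simulated
# as adding the same offset [0, 0, 0.8] to each base form.
emb = {
    "jump":   [0.9, 0.1, 0.0],
    "jumped": [0.9, 0.1, 0.8],
    "walk":   [0.2, 0.7, 0.1],
    "walked": [0.2, 0.7, 0.9],
}

offset1 = [a - b for a, b in zip(emb["jumped"], emb["jump"])]
offset2 = [a - b for a, b in zip(emb["walked"], emb["walk"])]
print(round(cosine(offset1, offset2), 2))  # 1.0: the two offsets are parallel
```

In real embeddings the offsets are only approximately parallel, which is why the encoding is described as weak and why simple vector-translation tests can be misleading.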
no code implementations • 29 Apr 2021 • Ewan Dunbar, Mathieu Bernard, Nicolas Hamilakis, Tu Anh Nguyen, Maureen de Seyssel, Patricia Rozé, Morgane Rivière, Eugene Kharitonov, Emmanuel Dupoux
We present the Zero Resource Speech Challenge 2021, which asks participants to learn a language model directly from audio, without any text or labels.
no code implementations • ACL 2022 • Juliette Millet, Ewan Dunbar
We show that the CPC model shows a small native language effect, but that wav2vec 2.0 and HuBERT seem to develop a universal speech perception space which is not language specific.
no code implementations • CoNLL (EMNLP) 2021 • Juliette Millet, Ioana Chitoran, Ewan Dunbar
Our native language influences the way we perceive speech sounds, affecting our ability to discriminate non-native sounds.
no code implementations • 3 Jun 2022 • Juliette Millet, Charlotte Caucheteux, Pierre Orhan, Yves Boubenec, Alexandre Gramfort, Ewan Dunbar, Christophe Pallier, Jean-Remi King
These elements, resulting from the largest neuroimaging benchmark to date, show how self-supervised learning can account for a rich organization of speech processing in the brain, and thus delineate a path to identify the laws of language acquisition which shape the human brain.
no code implementations • 6 Oct 2022 • Tu Anh Nguyen, Maureen de Seyssel, Robin Algayres, Patricia Rozé, Ewan Dunbar, Emmanuel Dupoux
However, word boundary information may be absent or unreliable in the case of speech input (word boundaries are not marked explicitly in the speech stream).
1 code implementation • 27 Oct 2022 • Mark Hallap, Emmanuel Dupoux, Ewan Dunbar
Unsupervised speech representations have taken off, with benchmarks (SUPERB, ZeroSpeech) demonstrating major progress on semi-supervised speech recognition, speech synthesis, and speech-only language modelling.
no code implementations • 27 Oct 2022 • Ewan Dunbar, Nicolas Hamilakis, Emmanuel Dupoux
Recent progress in self-supervised or unsupervised machine learning has opened the possibility of building a full speech processing system from raw audio without using any textual representations or expert labels such as phonemes, dictionaries or parse trees.
1 code implementation • 4 Oct 2023 • Kuan-Po Huang, Chih-Kai Yang, Yu-Kuan Fu, Ewan Dunbar, Hung-Yi Lee
We introduce a new zero resource code-switched speech benchmark designed to directly assess the code-switching capabilities of self-supervised speech encoders.
1 code implementation • 3 Dec 2023 • Sean Robertson, Ewan Dunbar
It has been generally assumed in the automatic speech recognition (ASR) literature that it is better for models to have access to wider context windows.
Automatic Speech Recognition (ASR) +2