Search Results for author: Ewan Dunbar

Found 21 papers, 8 papers with code

The Zero Resource Speech Benchmark 2021: Metrics and baselines for unsupervised spoken language modeling

2 code implementations · 23 Nov 2020 · Tu Anh Nguyen, Maureen de Seyssel, Patricia Rozé, Morgane Rivière, Evgeny Kharitonov, Alexei Baevski, Ewan Dunbar, Emmanuel Dupoux

We introduce a new unsupervised task, spoken language modeling: the learning of linguistic representations from raw audio signals without any labels, along with the Zero Resource Speech Benchmark 2021: a suite of 4 black-box, zero-shot metrics probing for the quality of the learned models at 4 linguistic levels: phonetics, lexicon, syntax and semantics.

Clustering · Language Modelling · +1
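
For a sense of what a zero-shot metric at one of these levels looks like, the lexical level is typically probed with a "spot-the-word" comparison: the model is scored on how often it assigns a higher pseudo-probability to a real word than to a matched nonword. Below is a minimal sketch of that scoring, assuming some model exposes a pseudo-log-probability function `score(item)`; the function name and the toy items are illustrative assumptions, not the benchmark's actual API.

```python
# Sketch of a zero-shot "spot-the-word" lexical metric: the model gets credit
# whenever it scores a real word above its matched nonword.
def lexical_accuracy(pairs, score):
    """pairs: list of (word_item, nonword_item) tuples;
    score: callable returning a pseudo-log-probability for one item."""
    correct = sum(1 for word, nonword in pairs if score(word) > score(nonword))
    return correct / len(pairs)

# Usage with a placeholder scorer standing in for an actual spoken LM:
toy_pairs = [("brick", "blick"), ("dog", "dag")]            # illustrative only
toy_score = lambda item: float(item in {"brick", "dog"})    # toy "oracle" model
print(lexical_accuracy(toy_pairs, toy_score))               # -> 1.0
```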

RNNs implicitly implement tensor-product representations

1 code implementation · ICLR 2019 · R. Thomas McCoy, Tal Linzen, Ewan Dunbar, Paul Smolensky

Recurrent neural networks (RNNs) can learn continuous vector representations of symbolic structures such as sequences and sentences; these representations often exhibit linear regularities (analogies).

Representation Learning · Sentence
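
For context, the tensor-product representation scheme the paper probes for encodes a symbol sequence as a sum of outer products of filler (symbol) vectors with role (position) vectors, and recovers fillers by unbinding with the corresponding role. The sketch below is a toy illustration of that general scheme, not the paper's trained decomposition; dimensions and vectors are arbitrary.

```python
import numpy as np

# Toy tensor-product representation (TPR):
#   T = sum_i  filler(symbol_i) (outer) role(position_i)
rng = np.random.default_rng(0)
symbols = ["a", "b", "c"]
fillers = {s: rng.normal(size=8) for s in symbols}   # one vector per symbol
roles = rng.normal(size=(4, 8))                      # one vector per position

def encode(seq):
    return sum(np.outer(fillers[s], roles[i]) for i, s in enumerate(seq))

def decode_position(T, i):
    # Unbind with the (pseudo-)inverse role, then pick the nearest filler.
    unbind = np.linalg.pinv(roles)[:, i]
    filler_hat = T @ unbind
    return max(fillers, key=lambda s: fillers[s] @ filler_hat)

T = encode(["b", "a", "c"])
print([decode_position(T, i) for i in range(3)])     # -> ['b', 'a', 'c']
```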

Analogies minus analogy test: measuring regularities in word embeddings

1 code implementation · CoNLL 2020 · Louis Fournier, Emmanuel Dupoux, Ewan Dunbar

Vector space models of words have long been claimed to capture linguistic regularities as simple vector translations, but problems have been raised with this claim.

Word Embeddings

Paraphrases do not explain word analogies

1 code implementation · EACL 2021 · Louis Fournier, Ewan Dunbar

Many types of distributional word embeddings (weakly) encode linguistic regularities as directions (the difference between "jump" and "jumped" will be in a similar direction to that of "walk" and "walked," and so on).

Word Embeddings
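
The "direction" claim in the two preceding entries can be made concrete with a vector-offset check: the offset jumped − jump should point roughly the same way as walked − walk. A minimal sketch follows, assuming `emb` is a lookup from word to numpy vector obtained from some pretrained embedding; the embedding source is left open.

```python
import numpy as np

def offset_similarity(emb, a, a_infl, b, b_infl):
    """Cosine similarity between the offsets (a_infl - a) and (b_infl - b).
    Values near 1 mean the inflection is encoded as a consistent direction."""
    u = emb[a_infl] - emb[a]
    v = emb[b_infl] - emb[b]
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# e.g. offset_similarity(emb, "jump", "jumped", "walk", "walked")
# The classic 3CosAdd analogy test instead asks whether
# emb["jumped"] - emb["jump"] + emb["walk"] has emb["walked"] as its
# nearest neighbour in the vocabulary.
```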

Perceptimatic: A human speech perception benchmark for unsupervised subword modelling

1 code implementation · 12 Oct 2020 · Juliette Millet, Ewan Dunbar

In this paper, we present a data set and methods to compare speech processing models and human behaviour on a phone discrimination task.
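
Phone discrimination tasks of this kind are typically scored ABX-style: given representations of stimuli A and B from contrasting phone categories and a probe X from A's category, the model counts as correct when X is closer to A than to B. The sketch below assumes fixed-length vector representations and cosine distance; real evaluations usually compare frame sequences (e.g. with dynamic time warping), so this is illustrative only.

```python
import numpy as np

def cosine_distance(u, v):
    return 1.0 - u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def abx_accuracy(triples):
    """triples: list of (A, X, B) vectors, where A and X share a phone
    category and B comes from a contrasting one. Returns the fraction of
    triples for which X is closer to A than to B."""
    correct = sum(
        1 for a, x, b in triples
        if cosine_distance(x, a) < cosine_distance(x, b)
    )
    return correct / len(triples)
```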

Zero Resource Code-switched Speech Benchmark Using Speech Utterance Pairs For Multiple Spoken Languages

1 code implementation · 4 Oct 2023 · Kuan-Po Huang, Chih-Kai Yang, Yu-Kuan Fu, Ewan Dunbar, Hung-Yi Lee

We introduce a new zero resource code-switched speech benchmark designed to directly assess the code-switching capabilities of self-supervised speech encoders.

Language Modelling

Learning weakly supervised multimodal phoneme embeddings

no code implementations · 23 Apr 2017 · Rahma Chaabouni, Ewan Dunbar, Neil Zeghidour, Emmanuel Dupoux

Recent works have explored deep architectures for learning multimodal speech representation (e.g. audio and images, articulation and audio) in a supervised way.

Multi-Task Learning

RNNs Implicitly Implement Tensor Product Representations

no code implementations · 20 Dec 2018 · R. Thomas McCoy, Tal Linzen, Ewan Dunbar, Paul Smolensky

Recurrent neural networks (RNNs) can learn continuous vector representations of symbolic structures such as sequences and sentences; these representations often exhibit linear regularities (analogies).

Representation Learning · Sentence

The Zero Resource Speech Challenge 2019: TTS without T

no code implementations · 25 Apr 2019 · Ewan Dunbar, Robin Algayres, Julien Karadayi, Mathieu Bernard, Juan Benjumea, Xuan-Nga Cao, Lucie Miskic, Charlotte Dugrain, Lucas Ondel, Alan W. Black, Laurent Besacier, Sakriani Sakti, Emmanuel Dupoux

We present the Zero Resource Speech Challenge 2019, which proposes to build a speech synthesizer without any text or phonetic labels: hence, TTS without T (text-to-speech without text).

The Perceptimatic English Benchmark for Speech Perception Models

no code implementations · 7 May 2020 · Juliette Millet, Ewan Dunbar

We show that DeepSpeech, a standard English speech recognizer, is more specialized on English phoneme discrimination than English listeners, and is poorly correlated with their behaviour, even though it yields a low error on the decision task given to humans.

Automatic Speech Recognition · Automatic Speech Recognition (ASR) · +1

The Zero Resource Speech Challenge 2020: Discovering discrete subword and word units

no code implementations · 12 Oct 2020 · Ewan Dunbar, Julien Karadayi, Mathieu Bernard, Xuan-Nga Cao, Robin Algayres, Lucas Ondel, Laurent Besacier, Sakriani Sakti, Emmanuel Dupoux

We present the Zero Resource Speech Challenge 2020, which aims at learning speech representations from raw audio signals without any labels.

Speech Synthesis

The Zero Resource Speech Challenge 2021: Spoken language modelling

no code implementations · 29 Apr 2021 · Ewan Dunbar, Mathieu Bernard, Nicolas Hamilakis, Tu Anh Nguyen, Maureen de Seyssel, Patricia Rozé, Morgane Rivière, Eugene Kharitonov, Emmanuel Dupoux

We present the Zero Resource Speech Challenge 2021, which asks participants to learn a language model directly from audio, without any text or labels.

Language Modelling

Do self-supervised speech models develop human-like perception biases?

no code implementations · ACL 2022 · Juliette Millet, Ewan Dunbar

We find that the CPC model shows a small native language effect, but that wav2vec 2.0 and HuBERT seem to develop a universal speech perception space which is not language-specific.

Toward a realistic model of speech processing in the brain with self-supervised learning

no code implementations · 3 Jun 2022 · Juliette Millet, Charlotte Caucheteux, Pierre Orhan, Yves Boubenec, Alexandre Gramfort, Ewan Dunbar, Christophe Pallier, Jean-Remi King

These elements, resulting from the largest neuroimaging benchmark to date, show how self-supervised learning can account for a rich organization of speech processing in the brain, and thus delineate a path to identify the laws of language acquisition which shape the human brain.

Language Acquisition · Self-Supervised Learning

Are word boundaries useful for unsupervised language learning?

no code implementations · 6 Oct 2022 · Tu Anh Nguyen, Maureen de Seyssel, Robin Algayres, Patricia Rozé, Ewan Dunbar, Emmanuel Dupoux

However, word boundary information may be absent or unreliable in the case of speech input (word boundaries are not marked explicitly in the speech stream).

Evaluating context-invariance in unsupervised speech representations

1 code implementation · 27 Oct 2022 · Mark Hallap, Emmanuel Dupoux, Ewan Dunbar

Unsupervised speech representations have taken off, with benchmarks (SUPERB, ZeroSpeech) demonstrating major progress on semi-supervised speech recognition, speech synthesis, and speech-only language modelling.

Language Modelling · Speech Recognition · +2

Self-supervised language learning from raw audio: Lessons from the Zero Resource Speech Challenge

no code implementations · 27 Oct 2022 · Ewan Dunbar, Nicolas Hamilakis, Emmanuel Dupoux

Recent progress in self-supervised or unsupervised machine learning has opened the possibility of building a full speech processing system from raw audio without using any textual representations or expert labels such as phonemes, dictionaries or parse trees.

Acoustic Unit Discovery · Language Modelling · +1

Bigger is not Always Better: The Effect of Context Size on Speech Pre-Training

1 code implementation · 3 Dec 2023 · Sean Robertson, Ewan Dunbar

It has been generally assumed in the automatic speech recognition (ASR) literature that it is better for models to have access to wider context windows.

Automatic Speech Recognition · Automatic Speech Recognition (ASR) · +2
