Search Results for author: Robin Algayres

Found 13 papers, 3 papers with code

SpiRit-LM: Interleaved Spoken and Written Language Model

no code implementations • 8 Feb 2024 • Tu Anh Nguyen, Benjamin Muller, Bokai Yu, Marta R. Costa-Jussa, Maha Elbayad, Sravya Popuri, Paul-Ambroise Duquenne, Robin Algayres, Ruslan Mavlyutov, Itai Gat, Gabriel Synnaeve, Juan Pino, Benoit Sagot, Emmanuel Dupoux

We introduce SPIRIT-LM, a foundation multimodal language model that freely mixes text and speech.

Language Modelling

XLS-R fine-tuning on noisy word boundaries for unsupervised speech segmentation into words

no code implementations • 8 Oct 2023 • Robin Algayres, Pablo Diego-Simon, Benoit Sagot, Emmanuel Dupoux

Due to the absence of explicit word boundaries in the speech stream, the task of segmenting spoken sentences into word units without text supervision is particularly challenging.

Generative Spoken Language Model based on continuous word-sized audio tokens

no code implementations • 8 Oct 2023 • Robin Algayres, Yossi Adi, Tu Anh Nguyen, Jade Copet, Gabriel Synnaeve, Benoit Sagot, Emmanuel Dupoux

In NLP, text language models based on words or subwords are known to outperform their character-based counterparts.

Language Modelling

Big model only for hard audios: Sample dependent Whisper model selection for efficient inferences

1 code implementation • 22 Sep 2023 • Hugo Malard, Salah Zaiem, Robin Algayres

Recent progress in Automatic Speech Recognition (ASR) has been coupled with a substantial increase in the model sizes, which may now contain billions of parameters, leading to slow inferences even with adapted hardware.

Automatic Speech Recognition (ASR) +2

Fine-tuning Strategies for Faster Inference using Speech Self-Supervised Models: A Comparative Study

1 code implementation • 12 Mar 2023 • Salah Zaiem, Robin Algayres, Titouan Parcollet, Slim Essid, Mirco Ravanelli

Self-supervised learning (SSL) has allowed substantial progress in Automatic Speech Recognition (ASR) performance in low-resource settings.

Automatic Speech Recognition (ASR) +2

Are word boundaries useful for unsupervised language learning?

no code implementations • 6 Oct 2022 • Tu Anh Nguyen, Maureen de Seyssel, Robin Algayres, Patricia Roze, Ewan Dunbar, Emmanuel Dupoux

However, word boundary information may be absent or unreliable in the case of speech input (word boundaries are not marked explicitly in the speech stream).

STOP: A dataset for Spoken Task Oriented Semantic Parsing

1 code implementation • 29 Jun 2022 • Paden Tomasello, Akshat Shrivastava, Daniel Lazar, Po-chun Hsu, Duc Le, Adithya Sagar, Ali Elkahky, Jade Copet, Wei-Ning Hsu, Yossi Adi, Robin Algayres, Tu Anh Nguyen, Emmanuel Dupoux, Luke Zettlemoyer, Abdelrahman Mohamed

Furthermore, in addition to the human-recorded audio, we are releasing a TTS-generated version to benchmark the performance for low-resource domain adaptation of end-to-end SLU systems.

Automatic Speech Recognition (ASR) +4

DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon

no code implementations • 22 Jun 2022 • Robin Algayres, Tristan Ricoul, Julien Karadayi, Hugo Laurençon, Salah Zaiem, Abdelrahman Mohamed, Benoît Sagot, Emmanuel Dupoux

Finding word boundaries in continuous speech is challenging as there is little or no equivalent of a 'space' delimiter between words.

Language Modelling, Segmentation +1

Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning

no code implementations • 11 Apr 2022 • Robin Algayres, Adel Nabli, Benoit Sagot, Emmanuel Dupoux

We introduce a simple neural encoder architecture that can be trained using an unsupervised contrastive learning objective which gets its positive samples from data-augmented k-Nearest Neighbors search.

Contrastive Learning

Generative Spoken Dialogue Language Modeling

no code implementations • 30 Mar 2022 • Tu Anh Nguyen, Eugene Kharitonov, Jade Copet, Yossi Adi, Wei-Ning Hsu, Ali Elkahky, Paden Tomasello, Robin Algayres, Benoit Sagot, Abdelrahman Mohamed, Emmanuel Dupoux

We introduce dGSLM, the first "textless" model able to generate audio samples of naturalistic spoken dialogues.

Language Modelling

The Zero Resource Speech Challenge 2020: Discovering discrete subword and word units

no code implementations • 12 Oct 2020 • Ewan Dunbar, Julien Karadayi, Mathieu Bernard, Xuan-Nga Cao, Robin Algayres, Lucas Ondel, Laurent Besacier, Sakriani Sakti, Emmanuel Dupoux

We present the Zero Resource Speech Challenge 2020, which aims at learning speech representations from raw audio signals without any labels.

Speech Synthesis

Evaluating the reliability of acoustic speech embeddings

no code implementations • 27 Jul 2020 • Robin Algayres, Mohamed Salah Zaiem, Benoit Sagot, Emmanuel Dupoux

However, there is currently no clear methodology to compare or optimise the quality of these embeddings in a task-neutral way.

Information Retrieval, Retrieval

The Zero Resource Speech Challenge 2019: TTS without T

no code implementations • 25 Apr 2019 • Ewan Dunbar, Robin Algayres, Julien Karadayi, Mathieu Bernard, Juan Benjumea, Xuan-Nga Cao, Lucie Miskic, Charlotte Dugrain, Lucas Ondel, Alan W. Black, Laurent Besacier, Sakriani Sakti, Emmanuel Dupoux

We present the Zero Resource Speech Challenge 2019, which proposes to build a speech synthesizer without any text or phonetic labels: hence, TTS without T (text-to-speech without text).
