no code implementations • 12 Dec 2017 • Ewan Dunbar, Xuan Nga Cao, Juan Benjumea, Julien Karadayi, Mathieu Bernard, Laurent Besacier, Xavier Anguera, Emmanuel Dupoux
We describe a new challenge aimed at discovering subword and word units from raw speech.
2 code implementations • 30 Apr 2018 • Rachid Riad, Corentin Dancette, Julien Karadayi, Neil Zeghidour, Thomas Schatz, Emmanuel Dupoux
We apply these results to pairs of words discovered using an unsupervised algorithm and show an improvement over the state of the art in unsupervised representation learning using siamese networks.
no code implementations • 25 Apr 2019 • Ewan Dunbar, Robin Algayres, Julien Karadayi, Mathieu Bernard, Juan Benjumea, Xuan-Nga Cao, Lucie Miskic, Charlotte Dugrain, Lucas Ondel, Alan W. Black, Laurent Besacier, Sakriani Sakti, Emmanuel Dupoux
We present the Zero Resource Speech Challenge 2019, which proposes to build a speech synthesizer without any text or phonetic labels: hence, TTS without T (text-to-speech without text).
2 code implementations • 17 Dec 2019 • Jacob Kahn, Morgane Rivière, Weiyi Zheng, Evgeny Kharitonov, Qiantong Xu, Pierre-Emmanuel Mazaré, Julien Karadayi, Vitaliy Liptchinsky, Ronan Collobert, Christian Fuegen, Tatiana Likhomanenko, Gabriel Synnaeve, Armand Joulin, Abdel-rahman Mohamed, Emmanuel Dupoux
Additionally, we provide baseline systems and evaluation metrics working under three settings: (1) the zero resource/unsupervised setting (ABX), (2) the semi-supervised setting (PER, CER) and (3) the distant supervision setting (WER).
Ranked #1 on Speech Recognition on Libri-Light test-other (ABX-within metric)
no code implementations • 12 Oct 2020 • Ewan Dunbar, Julien Karadayi, Mathieu Bernard, Xuan-Nga Cao, Robin Algayres, Lucas Ondel, Laurent Besacier, Sakriani Sakti, Emmanuel Dupoux
We present the Zero Resource Speech Challenge 2020, which aims at learning speech representations from raw audio signals without any labels.
1 code implementation • 12 Mar 2021 • Rachid Riad, Julien Karadayi, Anne-Catherine Bachoud-Lévi, Emmanuel Dupoux
We found that models based on Learnable STRFs are on par with the different toplines on all tasks, and obtain the best performance for Speech Activity Detection.
1 code implementation • 10 Dec 2021 • Mathieu Bernard, Maxime Poli, Julien Karadayi, Emmanuel Dupoux
After describing the Shennong software architecture, its core components, and its implemented algorithms, this paper illustrates its use in three applications: a comparison of speech feature performance on a phone discrimination task, an analysis of a Vocal Tract Length Normalization model as a function of the speech duration used for training, and a comparison of pitch estimation algorithms under various noise conditions.
no code implementations • 22 Jun 2022 • Robin Algayres, Tristan Ricoul, Julien Karadayi, Hugo Laurençon, Salah Zaiem, Abdelrahman Mohamed, Benoît Sagot, Emmanuel Dupoux
Finding word boundaries in continuous speech is challenging as there is little or no equivalent of a 'space' delimiter between words.