1 code implementation • 24 Apr 2024 • Ankit Vani, Bac Nguyen, Samuel Lavoie, Ranjay Krishna, Aaron Courville
Using SPARO, we demonstrate improvements on downstream recognition, robustness, retrieval, and compositionality benchmarks with CLIP (up to +14% for ImageNet, +4% for SugarCrepe), and on nearest neighbors and linear probe for ImageNet with DINO (+3% each).
no code implementations • 26 Feb 2024 • Luca Zampierin, Ghouthi Boukli Hacene, Bac Nguyen, Mirco Ravanelli
Self-supervised learning (SSL) has achieved remarkable success across various speech-processing tasks.
1 code implementation • 2 Jun 2023 • Fabian Kögel, Bac Nguyen, Fabien Cardinaux
State-of-the-art non-autoregressive text-to-speech (TTS) models based on FastSpeech 2 can efficiently synthesise high-fidelity and natural speech.
1 code implementation • 23 Apr 2023 • Bac Nguyen, Lukas Mauch
Deep equilibrium models (DEQs) have proven to be very powerful for learning data representations.
no code implementations • 21 Mar 2022 • Bac Nguyen, Fabien Cardinaux, Stefan Uhlich
Using this differentiable duration method, we introduce AutoTTS, a direct text-to-waveform speech synthesis model.
1 code implementation • 18 Mar 2022 • Marie Biolková, Bac Nguyen
Recent works have revealed the vulnerability of automatic speech recognition (ASR) models to adversarial examples (AEs), i.e., small perturbations that cause an error in the transcription of the audio signal.
1 code implementation • 2 Jun 2021 • Bac Nguyen, Fabien Cardinaux
By disentangling the speaker identity from the speech content, NVC-Net is able to perform non-parallel traditional many-to-many voice conversion as well as zero-shot voice conversion from a short utterance of an unseen target speaker.
no code implementations • 29 Mar 2021 • Vivek Chalumuri, Bac Nguyen
Given the semantic descriptions of classes, Zero-Shot Learning (ZSL) aims to recognize unseen classes without labeled training data by exploiting semantic information, which contains knowledge between seen and unseen classes.