Search Results for author: Bac Nguyen

Found 8 papers, 5 papers with code

SPARO: Selective Attention for Robust and Compositional Transformer Encodings for Vision

1 code implementation • 24 Apr 2024 • Ankit Vani, Bac Nguyen, Samuel Lavoie, Ranjay Krishna, Aaron Courville

Using SPARO, we demonstrate improvements on downstream recognition, robustness, retrieval, and compositionality benchmarks with CLIP (up to +14% for ImageNet, +4% for SugarCrepe), and on nearest neighbors and linear probe for ImageNet with DINO (+3% each).

SKILL: Similarity-aware Knowledge distILLation for Speech Self-Supervised Learning

no code implementations • 26 Feb 2024 • Luca Zampierin, Ghouthi Boukli Hacene, Bac Nguyen, Mirco Ravanelli

Self-supervised learning (SSL) has achieved remarkable success across various speech-processing tasks.

Knowledge Distillation • Self-Supervised Learning

Towards Robust FastSpeech 2 by Modelling Residual Multimodality

1 code implementation • 2 Jun 2023 • Fabian Kögel, Bac Nguyen, Fabien Cardinaux

State-of-the-art non-autoregressive text-to-speech (TTS) models based on FastSpeech 2 can efficiently synthesise high-fidelity and natural speech.

Efficient Training of Deep Equilibrium Models

1 code implementation • 23 Apr 2023 • Bac Nguyen, Lukas Mauch

Deep equilibrium models (DEQs) have proven to be very powerful for learning data representations.

AutoTTS: End-to-End Text-to-Speech Synthesis through Differentiable Duration Modeling

no code implementations • 21 Mar 2022 • Bac Nguyen, Fabien Cardinaux, Stefan Uhlich

Using this differentiable duration method, we introduce AutoTTS, a direct text-to-waveform speech synthesis model.

Speech Synthesis • Text-To-Speech Synthesis

Neural Predictor for Black-Box Adversarial Attacks on Speech Recognition

1 code implementation • 18 Mar 2022 • Marie Biolková, Bac Nguyen

Recent works have revealed the vulnerability of automatic speech recognition (ASR) models to adversarial examples (AEs), i.e., small perturbations that cause an error in the transcription of the audio signal.

Automatic Speech Recognition (ASR)

NVC-Net: End-to-End Adversarial Voice Conversion

1 code implementation • 2 Jun 2021 • Bac Nguyen, Fabien Cardinaux

By disentangling the speaker identity from the speech content, NVC-Net is able to perform non-parallel traditional many-to-many voice conversion as well as zero-shot voice conversion from a short utterance of an unseen target speaker.

Speech Synthesis • Voice Conversion

A Simple Approach for Zero-Shot Learning based on Triplet Distribution Embeddings

no code implementations • 29 Mar 2021 • Vivek Chalumuri, Bac Nguyen

Given the semantic descriptions of classes, Zero-Shot Learning (ZSL) aims to recognize unseen classes without labeled training data by exploiting semantic information, which contains knowledge between seen and unseen classes.

Generalized Zero-Shot Learning
