no code implementations • 25 Oct 2018 • Yusuke Kida, Dung Tran, Motoi Omachi, Toru Taniguchi, Yuya Fujita
The proposed method first uses a DNN-based mask estimator to separate the mixture signal into the keyword signal uttered by the target speaker and the remaining background speech.
Automatic Speech Recognition (ASR) +1
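The mask-based separation the entry describes can be sketched minimally: a DNN produces a time-frequency mask, and element-wise multiplication with the mixture spectrogram splits it into target and background. Everything below (shapes, the random stand-in for the DNN output) is illustrative, not the paper's actual model.

```python
import numpy as np

def apply_mask(mixture_spec, mask):
    # Element-wise masking of a magnitude spectrogram: mask values
    # near 1 keep the target (keyword) energy, values near 0
    # suppress the background speech.
    return mask * mixture_spec

rng = np.random.default_rng(0)
mixture = rng.random((257, 100))   # freq bins x frames (illustrative sizes)
mask = rng.random((257, 100))      # would come from the DNN estimator
keyword = apply_mask(mixture, mask)
background = apply_mask(mixture, 1.0 - mask)
# With a soft mask, the two streams sum back to the mixture
assert np.allclose(keyword + background, mixture)
```

The complementary mask `1 - mask` yields the "remaining background speech" stream mentioned in the abstract.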
no code implementations • 19 Apr 2019 • Aswin Shanmugam Subramanian, Xiaofei Wang, Shinji Watanabe, Toru Taniguchi, Dung Tran, Yuya Fujita
This report investigates the ability of E2E ASR to move from standard close-talk to far-field applications by encompassing the entire multichannel speech enhancement and ASR pipeline within the S2S model.
Automatic Speech Recognition (ASR) +4
no code implementations • 8 Dec 2021 • Trung Dang, Dung Tran, Peter Chin, Kazuhito Koishida
Unsupervised Zero-Shot Voice Conversion (VC) aims to modify the speaker characteristic of an utterance to match an unseen target speaker without relying on parallel training data.
no code implementations • 21 Dec 2021 • Melikasadat Emami, Dung Tran, Kazuhito Koishida
Improving generalization is a major challenge in audio classification due to labeled data scarcity.
no code implementations • 5 Jan 2022 • Hieu Le, Hans Walker, Dung Tran, Peter Chin
Although deep neural networks achieve strong performance on classification tasks, recent studies have shown that well-trained networks can be fooled by adding subtle noise.
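One classic way to craft such subtle noise is the Fast Gradient Sign Method (FGSM): perturb the input by the sign of the loss gradient. The sketch below applies it to a toy softmax-linear classifier in NumPy; the model, sizes, and `eps` are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def fgsm_noise(W, b, x, y, eps=0.1):
    # Cross-entropy gradient for a softmax-linear classifier:
    # dL/dlogits = p - onehot(y), so dL/dx = W^T (p - onehot).
    # FGSM takes the sign of this gradient, scaled by eps.
    logits = W @ x + b
    p = np.exp(logits - logits.max())
    p /= p.sum()
    onehot = np.zeros_like(p)
    onehot[y] = 1.0
    grad_x = W.T @ (p - onehot)
    return eps * np.sign(grad_x)

rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 4)), np.zeros(3)  # 3 classes, 4 features
x, y = rng.normal(size=4), 0
x_adv = x + fgsm_noise(W, b, x, y)           # adversarially perturbed input
```

Each feature moves by at most `eps`, so the perturbation stays visually or audibly subtle while still shifting the loss in the worst-case direction.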
1 code implementation • 19 Sep 2023 • Yatong Bai, Trung Dang, Dung Tran, Kazuhito Koishida, Somayeh Sojoudi
Diffusion models power the vast majority of text-to-audio (TTA) generation methods.
Ranked #10 on Audio Generation on AudioCaps
no code implementations • 13 Feb 2024 • Chih-Yu Lai, Dung Tran, Kazuhito Koishida
Learned image compression has gained widespread popularity for its efficiency in achieving ultra-low bit rates.
1 code implementation • 14 Mar 2024 • Afrina Tabassum, Dung Tran, Trung Dang, Ismini Lourentzou, Kazuhito Koishida
Masked Autoencoders (MAEs) learn rich low-level representations from unlabeled data but require substantial labeled data to effectively adapt to downstream tasks.
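The MAE pretext task the entry refers to can be sketched as random patch masking: hide most patches, feed only the visible ones to the encoder, and train a decoder to reconstruct the rest from unlabeled data. The patch count, dimension, and 75% ratio below are common defaults used purely for illustration.

```python
import numpy as np

def random_mask_patches(patches, mask_ratio=0.75, seed=0):
    # Keep a random (1 - mask_ratio) fraction of patches; the encoder
    # sees only these, and reconstruction of the hidden patches is
    # the self-supervised training signal.
    n = patches.shape[0]
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_keep = int(n * (1 - mask_ratio))
    keep_idx = np.sort(perm[:n_keep])
    return patches[keep_idx], keep_idx

patches = np.arange(16 * 8, dtype=float).reshape(16, 8)  # 16 patches, dim 8
visible, keep_idx = random_mask_patches(patches)
assert visible.shape == (4, 8)  # 25% of 16 patches remain visible
```

Because the signal comes from reconstruction alone, no labels are needed at this stage — the labeled-data requirement the entry mentions only arises when adapting to downstream tasks.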