Search Results for author: Dung Tran

Found 8 papers, 2 papers with code

Speaker Selective Beamformer with Keyword Mask Estimation

no code implementations · 25 Oct 2018 · Yusuke Kida, Dung Tran, Motoi Omachi, Toru Taniguchi, Yuya Fujita

The proposed method first uses a DNN-based mask estimator to separate the mixture signal into the keyword signal uttered by the target speaker and the remaining background speech.

Automatic Speech Recognition (ASR) +1
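The entry above describes DNN-based time-frequency mask estimation with separate estimates for the target keyword and the background. A minimal PyTorch sketch of that general pattern follows; the architecture, layer sizes, and two-head design are illustrative assumptions, not the paper's exact model.

```python
# Sketch of DNN-based mask estimation over a log-magnitude spectrogram.
import torch
import torch.nn as nn

class MaskEstimator(nn.Module):
    def __init__(self, n_freq=257, hidden=256):
        super().__init__()
        self.blstm = nn.LSTM(n_freq, hidden, batch_first=True, bidirectional=True)
        # Two heads: one mask for the target keyword speech,
        # one for the remaining background speech.
        self.keyword_head = nn.Sequential(nn.Linear(2 * hidden, n_freq), nn.Sigmoid())
        self.background_head = nn.Sequential(nn.Linear(2 * hidden, n_freq), nn.Sigmoid())

    def forward(self, log_mag):                  # (batch, frames, n_freq)
        h, _ = self.blstm(log_mag)
        return self.keyword_head(h), self.background_head(h)

mix = torch.rand(1, 100, 257)                    # toy mixture features
m_kw, m_bg = MaskEstimator()(mix)
keyword_mag = m_kw * mix                         # masked estimate of the keyword signal
background_mag = m_bg * mix
```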

An Investigation of End-to-End Multichannel Speech Recognition for Reverberant and Mismatch Conditions

no code implementations · 19 Apr 2019 · Aswin Shanmugam Subramanian, Xiaofei Wang, Shinji Watanabe, Toru Taniguchi, Dung Tran, Yuya Fujita

This report investigates the extension of E2E ASR from standard close-talk to far-field applications by encompassing the entire multichannel speech enhancement and ASR pipeline within a single S2S model.

Automatic Speech Recognition (ASR) +4
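The key idea in the entry above is that the enhancement front-end and the ASR encoder live in one model and are trained jointly. The sketch below shows that structure only; the learned channel-weighting front-end is a toy stand-in (the paper's actual front-end is a neural beamformer), and all sizes are assumptions.

```python
# Sketch: multichannel front-end + ASR encoder inside one trainable model.
import torch
import torch.nn as nn

class MultichannelFrontEnd(nn.Module):
    """Collapses C input channels into one enhanced feature stream."""
    def __init__(self, n_channels=4):
        super().__init__()
        # Learned per-channel weights stand in for a real beamformer here.
        self.channel_weights = nn.Parameter(torch.ones(n_channels) / n_channels)

    def forward(self, feats):                    # (batch, channels, frames, feat)
        w = torch.softmax(self.channel_weights, dim=0)
        return torch.einsum("c,bctf->btf", w, feats)

class E2EModel(nn.Module):
    def __init__(self, n_feat=80, hidden=256, vocab=500):
        super().__init__()
        self.frontend = MultichannelFrontEnd()
        self.encoder = nn.LSTM(n_feat, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)      # e.g. per-frame CTC logits

    def forward(self, feats):
        enhanced = self.frontend(feats)          # enhancement and ASR share gradients
        h, _ = self.encoder(enhanced)
        return self.out(h)

logits = E2EModel()(torch.rand(2, 4, 120, 80))   # (batch, channels, frames, feat)
```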

Training Robust Zero-Shot Voice Conversion Models with Self-supervised Features

no code implementations · 8 Dec 2021 · Trung Dang, Dung Tran, Peter Chin, Kazuhito Koishida

Unsupervised Zero-Shot Voice Conversion (VC) aims to modify the speaker characteristic of an utterance to match an unseen target speaker without relying on parallel training data.

Self-Supervised Learning Voice Conversion
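Zero-shot VC, as defined in the entry above, combines "what is said" from the source utterance with "who says it" from an unseen target speaker. A minimal sketch of that decomposition follows; in the paper the content representation comes from self-supervised features, whereas here a plain GRU encoder stands in, and every module and size is an illustrative assumption.

```python
# Sketch: content encoder + target-speaker embedding -> converted features.
import torch
import torch.nn as nn

class Converter(nn.Module):
    def __init__(self, feat_dim=80, content_dim=256, spk_dim=128):
        super().__init__()
        self.content_enc = nn.GRU(feat_dim, content_dim, batch_first=True)
        self.spk_enc = nn.GRU(feat_dim, spk_dim, batch_first=True)
        self.decoder = nn.Linear(content_dim + spk_dim, feat_dim)

    def forward(self, source, target_ref):
        content, _ = self.content_enc(source)    # what is said
        _, spk = self.spk_enc(target_ref)        # who should say it
        spk = spk[-1].unsqueeze(1).expand(-1, content.size(1), -1)
        return self.decoder(torch.cat([content, spk], dim=-1))

vc = Converter()
converted = vc(torch.rand(1, 200, 80),           # source utterance features
               torch.rand(1, 150, 80))           # reference from unseen speaker
```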

Corrupting Data to Remove Deceptive Perturbation: Using Preprocessing Method to Improve System Robustness

no code implementations · 5 Jan 2022 · Hieu Le, Hans Walker, Dung Tran, Peter Chin

Although deep neural networks have achieved great performance on classification tasks, recent studies have shown that well-trained networks can be fooled by adding subtle noise.

Denoising
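The title and the Denoising tag above point to a corrupt-then-restore preprocessing defense: inject random noise to wash out a crafted perturbation, then denoise before classifying. The sketch below shows that pattern with toy stand-in models; the noise scale, denoiser, and classifier are all illustrative assumptions.

```python
# Sketch: corrupt the input, denoise it, then classify.
import torch
import torch.nn as nn

def defend(x, denoiser, classifier, sigma=0.1):
    corrupted = x + sigma * torch.randn_like(x)  # drown out the crafted perturbation
    restored = denoiser(corrupted)               # remove both noise sources
    return classifier(restored)

# Toy stand-ins so the sketch runs end to end.
denoiser = nn.Conv2d(3, 3, kernel_size=3, padding=1)
classifier = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
logits = defend(torch.rand(4, 3, 32, 32), denoiser, classifier)
```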

Learned Image Compression with Text Quality Enhancement

no code implementations · 13 Feb 2024 · Chih-Yu Lai, Dung Tran, Kazuhito Koishida

Learned image compression has gained widespread popularity for its efficiency in achieving ultra-low bit-rates.

Image Compression

uaMix-MAE: Efficient Tuning of Pretrained Audio Transformers with Unsupervised Audio Mixtures

1 code implementation · 14 Mar 2024 · Afrina Tabassum, Dung Tran, Trung Dang, Ismini Lourentzou, Kazuhito Koishida

Masked Autoencoders (MAEs) learn rich low-level representations from unlabeled data but require substantial labeled data to effectively adapt to downstream tasks.
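The title above names unsupervised audio mixtures as the tool for label-efficient tuning. A generic mixup-style mixture of unlabeled clips is sketched below as one plausible reading; the paper's actual mixing recipe and training objective are not shown here, and the Beta coefficient is an assumption.

```python
# Sketch: mixup-style mixtures of unlabeled audio clips.
import torch

def unsupervised_mixture(batch, alpha=0.5):
    """Convex combination of each clip with a randomly shuffled partner."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    perm = torch.randperm(batch.size(0))
    return lam * batch + (1 - lam) * batch[perm]

clips = torch.rand(8, 16000)                     # 8 unlabeled 1-second clips @ 16 kHz
mixed = unsupervised_mixture(clips)              # label-free inputs for tuning
```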
