Search Results for author: Viet Anh Trinh

Found 10 papers, 1 paper with code

Adaptive Endpointing with Deep Contextual Multi-armed Bandits

no code implementations • 23 Mar 2023 • Do June Min, Andreas Stolcke, Anirudh Raju, Colin Vaz, Di He, Venkatesh Ravichandran, Viet Anh Trinh

In this paper, we provide a solution for adaptive endpointing by proposing an efficient method for choosing an optimal endpointing configuration, given utterance-level audio features, in an online setting, while avoiding hyperparameter grid search.
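The abstract describes choosing an endpointing configuration from utterance-level features with a contextual multi-armed bandit. A minimal sketch of that idea, assuming an epsilon-greedy bandit with per-arm linear reward models (the class name, features, and reward are illustrative, not the paper's actual method):

```python
import numpy as np

class LinearContextualBandit:
    """Hypothetical sketch: each arm is a candidate endpointing config;
    the context is a vector of utterance-level audio features."""

    def __init__(self, n_arms, n_features, epsilon=0.1, seed=0):
        self.rng = np.random.default_rng(seed)
        self.epsilon = epsilon
        # Ridge-regression statistics per arm: A = X^T X + I, b = X^T r
        self.A = [np.eye(n_features) for _ in range(n_arms)]
        self.b = [np.zeros(n_features) for _ in range(n_arms)]

    def select(self, context):
        # Explore with probability epsilon, otherwise pick the arm whose
        # linear model predicts the highest reward for this context.
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(len(self.A)))
        scores = [context @ np.linalg.solve(A, b)
                  for A, b in zip(self.A, self.b)]
        return int(np.argmax(scores))

    def update(self, arm, context, reward):
        # Accumulate the chosen arm's ridge-regression statistics online.
        self.A[arm] += np.outer(context, context)
        self.b[arm] += reward * context
```

The online update avoids an offline grid search: each utterance's observed reward (e.g., a latency/accuracy trade-off) refines only the chosen configuration's model.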

Multi-Armed Bandits

ImportantAug: a data augmentation agent for speech

1 code implementation • ICASSP 2022 • Viet Anh Trinh, Hassan Salami Kavaki, Michael I Mandel

We introduce ImportantAug, a technique to augment training data for speech classification and recognition models by adding noise to unimportant regions of the speech and not to important regions.
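The core operation described here — adding noise only outside important regions — can be sketched with an importance mask. This is a simplified illustration, assuming a precomputed binary mask (the function name and scaling are hypothetical, not the paper's implementation):

```python
import numpy as np

def important_aug(speech, importance_mask, noise, noise_scale=0.5):
    """Add scaled noise only where the importance mask is 0.

    speech          : waveform or spectrogram array
    importance_mask : 1.0 at important regions (kept clean), 0.0 elsewhere
    noise           : noise array of the same shape as `speech`
    """
    return speech + noise_scale * noise * (1.0 - importance_mask)
```

Regions marked important pass through untouched, so the model still sees clean evidence where it matters while learning robustness to noise elsewhere.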

Ranked #1 on Keyword Spotting on Google Speech Commands (Google Speech Command-Musan metric)

Data Augmentation · Keyword Spotting +2

Unsupervised Speech Enhancement with speech recognition embedding and disentanglement losses

no code implementations • 16 Nov 2021 • Viet Anh Trinh, Sebastian Braun

Our results show that the proposed function effectively improves the speech enhancement performance compared to a baseline trained in a supervised way on the noisy VoxCeleb dataset.

Disentanglement · Speech Enhancement +2

Improved MVDR Beamforming Using LSTM Speech Models to Clean Spatial Clustering Masks

no code implementations • 2 Dec 2020 • Zhaoheng Ni, Felix Grezes, Viet Anh Trinh, Michael I. Mandel

Spatial clustering techniques can achieve significant multi-channel noise reduction across relatively arbitrary microphone configurations, but have difficulty incorporating a detailed speech/noise model.
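The title's MVDR beamformer has a standard closed form, w = R_n⁻¹ d / (dᴴ R_n⁻¹ d), where R_n is the noise spatial covariance and d the steering vector. A minimal sketch of that textbook solution (not the paper's LSTM mask-cleaning contribution):

```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """Minimum-variance distortionless-response beamformer weights.

    noise_cov : (M, M) noise spatial covariance matrix R_n
    steering  : (M,) steering vector d toward the target source
    Returns w such that w^H d = 1 (distortionless constraint) while
    minimizing output noise power w^H R_n w.
    """
    num = np.linalg.solve(noise_cov, steering)       # R_n^{-1} d
    return num / (steering.conj() @ num)             # normalize by d^H R_n^{-1} d
```

In mask-based pipelines, R_n is estimated from time-frequency frames the mask labels as noise, which is why cleaner masks (the paper's focus) yield better beamforming.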

Clustering

Combining Spatial Clustering with LSTM Speech Models for Multichannel Speech Enhancement

no code implementations • 2 Dec 2020 • Felix Grezes, Zhaoheng Ni, Viet Anh Trinh, Michael Mandel

The system is compared to several baselines on the CHiME3 dataset in terms of speech quality predicted by the PESQ algorithm and the word error rate of a recognizer trained on mismatched conditions, in order to focus on generalization.

Clustering · Speech Enhancement

Enhancement of Spatial Clustering-Based Time-Frequency Masks using LSTM Neural Networks

no code implementations • 2 Dec 2020 • Felix Grezes, Zhaoheng Ni, Viet Anh Trinh, Michael Mandel

By using LSTMs to enhance spatial clustering based time-frequency masks, we achieve both the signal modeling performance of multiple single-channel LSTM-DNN speech enhancers and the signal separation performance and generality of multi-channel spatial clustering.

Clustering · Speech Enhancement

Large-scale evaluation of importance maps in automatic speech recognition

no code implementations • 21 May 2020 • Viet Anh Trinh, Michael I Mandel

In this paper, we propose a metric that we call the structured saliency benchmark (SSBM) to evaluate importance maps computed for automatic speech recognizers on individual utterances.

Automatic Speech Recognition · Automatic Speech Recognition (ASR) +2
