Search Results for author: Ke Tan

Found 10 papers, 1 paper with code

A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement

no code implementations · 3 Mar 2024 · Ravi Shankar, Ke Tan, Buye Xu, Anurag Kumar

Self-supervised learned models have been found to be very effective for certain speech tasks such as automatic speech recognition, speaker identification, keyword spotting and others.

Automatic Speech Recognition · Keyword Spotting · +5

Holmes: Towards Distributed Training Across Clusters with Heterogeneous NIC Environment

no code implementations · 6 Dec 2023 · Fei Yang, Shuang Peng, Ning Sun, Fangyu Wang, Ke Tan, Fu Wu, Jiezhong Qiu, Aimin Pan

Large language models (LLMs) such as GPT-3, OPT, and LLaMA have demonstrated remarkable accuracy in a wide range of tasks.

Scheduling

TorchAudio-Squim: Reference-less Speech Quality and Intelligibility measures in TorchAudio

no code implementations · 4 Apr 2023 · Anurag Kumar, Ke Tan, Zhaoheng Ni, Pranay Manocha, Xiaohui Zhang, Ethan Henderson, Buye Xu

To enable this, a variety of metrics to measure quality and intelligibility under different assumptions have been developed.

Rethinking complex-valued deep neural networks for monaural speech enhancement

no code implementations · 11 Jan 2023 · Haibin Wu, Ke Tan, Buye Xu, Anurag Kumar, Daniel Wong

By comparing complex- and real-valued versions of fundamental building blocks in the recently developed gated convolutional recurrent network (GCRN), we show how different mechanisms for basic blocks affect the performance.

Open-Ended Question Answering · Speech Enhancement
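The comparison rests on the fact that complex-valued arithmetic can be emulated with real-valued operations. A minimal sketch of this correspondence (illustrative only, not the GCRN code; function names are our own):

```python
# Sketch: a complex-valued "weight * input" built from real arithmetic.
# (a + bi)(c + di) = (ac - bd) + (ad + bc)i, so one complex multiply costs
# four real multiplies -- the basis for fair complex-vs-real block comparisons.

def complex_mul(w_re, w_im, x_re, x_im):
    """Multiply complex weight (w_re + i*w_im) by input (x_re + i*x_im)."""
    y_re = w_re * x_re - w_im * x_im
    y_im = w_re * x_im + w_im * x_re
    return y_re, y_im

def real_param_count(n_complex_weights):
    """A complex layer with n weights stores 2n real parameters."""
    return 2 * n_complex_weights
```

Matching real and complex blocks on real-parameter count (via `real_param_count`) is one common way to make such comparisons fair.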

Leveraging Heteroscedastic Uncertainty in Learning Complex Spectral Mapping for Single-channel Speech Enhancement

no code implementations · 16 Nov 2022 · Kuan-Lin Chen, Daniel D. E. Wong, Ke Tan, Buye Xu, Anurag Kumar, Vamsi Krishna Ithapu

During training, our approach augments a model learning complex spectral mapping with a temporary submodel to predict the covariance of the enhancement error at each time-frequency bin.

Speech Enhancement
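Predicting a per-bin error variance typically pairs with a heteroscedastic Gaussian negative log-likelihood: larger predicted variance down-weights that bin's squared error but is penalised by the log-variance term. A hedged per-bin sketch (our assumption of the general loss form, not the paper's code):

```python
import math

def heteroscedastic_nll(err_re, err_im, log_var):
    """Per time-frequency-bin NLL of a complex enhancement error under a
    circular Gaussian whose variance exp(log_var) is predicted by a submodel.
    Returns log_var + |err|^2 / exp(log_var) (constants dropped)."""
    sq_err = err_re ** 2 + err_im ** 2
    return log_var + sq_err * math.exp(-log_var)
```

Predicting `log_var` rather than the variance itself keeps the variance positive without an explicit constraint.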

Location-based training for multi-channel talker-independent speaker separation

no code implementations · 8 Oct 2021 · Hassan Taherian, Ke Tan, DeLiang Wang

We further demonstrate the effectiveness of LBT for the separation of four and five concurrent speakers.

Speaker Separation
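The core idea of location-based training (LBT), as we read it, is to replace the permutation search of permutation invariant training with a fixed, spatially determined output ordering. A toy sketch of that assignment rule (hypothetical, not the authors' code):

```python
def location_based_assignment(targets):
    """targets: list of (azimuth_degrees, signal) pairs for concurrent speakers.
    Sorting by azimuth ties each output stream to a fixed spatial position,
    so no permutation search over output-target pairings is needed."""
    return [sig for _, sig in sorted(targets, key=lambda t: t[0])]
```

With this ordering fixed, the scheme extends naturally to four or five concurrent speakers, since the assignment cost no longer grows factorially.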

SAGRNN: Self-Attentive Gated RNN for Binaural Speaker Separation with Interaural Cue Preservation

1 code implementation · 2 Sep 2020 · Ke Tan, Buye Xu, Anurag Kumar, Eliya Nachmani, Yossi Adi

In addition, our approach effectively preserves the interaural cues, which improves the accuracy of sound localization.

Audio and Speech Processing · Sound
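The interaural cues in question are the standard per-bin level and phase differences between the two ears, which the brain uses for sound localization. A stdlib sketch of how they are computed from a binaural STFT pair (illustrative; the function name and epsilon are ours):

```python
import cmath
import math

def interaural_cues(left_bin, right_bin, eps=1e-12):
    """Interaural level difference (dB) and interaural phase difference (rad)
    for one time-frequency bin of a binaural STFT pair."""
    ild = 20.0 * math.log10((abs(left_bin) + eps) / (abs(right_bin) + eps))
    ipd = cmath.phase(left_bin * right_bin.conjugate())
    return ild, ipd
```

Preserving these quantities between the input mixture and the separated outputs is what keeps the separated sources localizable.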

Audio-Visual Speech Separation and Dereverberation with a Two-Stage Multimodal Network

no code implementations · 16 Sep 2019 · Ke Tan, Yong Xu, Shi-Xiong Zhang, Meng Yu, Dong Yu

Background noise, interfering speech and room reverberation frequently distort target speech in real listening environments.

Audio and Speech Processing · Sound · Signal Processing

Bridging the Gap Between Monaural Speech Enhancement and Recognition with Distortion-Independent Acoustic Modeling

no code implementations · 11 Mar 2019 · Peidong Wang, Ke Tan, DeLiang Wang

In this study, we analyze the distortion problem, compare different acoustic models, and investigate a distortion-independent training scheme for monaural speech recognition.

Automatic Speech Recognition · Automatic Speech Recognition (ASR) · +2

Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective

no code implementations · 22 Nov 2018 · Zhong-Qiu Wang, Ke Tan, DeLiang Wang

This study investigates phase reconstruction for deep learning based monaural talker-independent speaker separation in the short-time Fourier transform (STFT) domain.

Speaker Separation
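The "trigonometric perspective" in the title likely refers to the geometry of an STFT bin: for a two-speaker mixture y = s1 + s2, the law of cosines ties the three magnitudes to the inter-source phase difference, so accurate magnitude estimates constrain the phases. A hedged sketch of that relation (our reading, not the paper's code):

```python
def cos_phase_difference(mag_mix, mag_s1, mag_s2):
    """For y = s1 + s2 in one STFT bin, the law of cosines gives
    |y|^2 = |s1|^2 + |s2|^2 + 2|s1||s2|cos(phi1 - phi2), so the three
    magnitudes alone determine cos(phi1 - phi2)."""
    num = mag_mix ** 2 - mag_s1 ** 2 - mag_s2 ** 2
    den = 2.0 * mag_s1 * mag_s2
    # Clamp to [-1, 1] to absorb magnitude-estimation error.
    return max(-1.0, min(1.0, num / den))
```

The cosine leaves a sign ambiguity in the phase difference itself, which is one reason phase reconstruction remains harder than magnitude estimation.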
