Search Results for author: Buye Xu

Found 14 papers, 4 papers with code

SAGRNN: Self-Attentive Gated RNN for Binaural Speaker Separation with Interaural Cue Preservation

1 code implementation • 2 Sep 2020 • Ke Tan, Buye Xu, Anurag Kumar, Eliya Nachmani, Yossi Adi

In addition, our approach effectively preserves the interaural cues, which improves the accuracy of sound localization.

Audio and Speech Processing Sound

Paper
Code

DPLM: A Deep Perceptual Spatial-Audio Localization Metric

no code implementations • 29 May 2021 • Pranay Manocha, Anurag Kumar, Buye Xu, Anjali Menon, Israel D. Gebru, Vamsi K. Ithapu, Paul Calamia

Subjective evaluations are critical for assessing the perceptual realism of sounds in audio-synthesis driven technologies like augmented and virtual reality.

Audio Synthesis

Paper
Add Code

Online Self-Attentive Gated RNNs for Real-Time Speaker Separation

no code implementations • 25 Jun 2021 • Ori Kabeli, Yossi Adi, Zhenyu Tang, Buye Xu, Anurag Kumar

Our stateful implementation for online separation leads to a minor drop in performance compared to the offline model; 0. 8dB for monaural inputs and 0. 3dB for binaural inputs while reaching a real-time factor of 0. 65.

blind source separation Speaker Separation

Paper
Add Code

Incorporating Real-world Noisy Speech in Neural-network-based Speech Enhancement Systems

no code implementations • 11 Sep 2021 • Yangyang Xia, Buye Xu, Anurag Kumar

Supervised speech enhancement relies on parallel databases of degraded speech signals and their clean reference signals during training.

Speech Enhancement

Paper
Add Code

NORESQA: A Framework for Speech Quality Assessment using Non-Matching References

1 code implementation • NeurIPS 2021 • Pranay Manocha, Buye Xu, Anurag Kumar

We show that neural networks trained using our framework produce scores that correlate well with subjective mean opinion scores (MOS) and are also competitive to methods such as DNSMOS, which explicitly relies on MOS from humans for training networks.

Speech Enhancement

Paper
Code

Continual self-training with bootstrapped remixing for speech enhancement

1 code implementation • 19 Oct 2021 • Efthymios Tzinis, Yossi Adi, Vamsi K. Ithapu, Buye Xu, Anurag Kumar

Specifically, a separation teacher model is pre-trained on an out-of-domain dataset and is used to infer estimated target signals for a batch of in-domain mixtures.

Ranked #13 on Speech Enhancement on Deep Noise Suppression (DNS) Challenge

Speech Enhancement Unsupervised Domain Adaptation

Paper
Code

RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing

2 code implementations • 17 Feb 2022 • Efthymios Tzinis, Yossi Adi, Vamsi Krishna Ithapu, Buye Xu, Paris Smaragdis, Anurag Kumar

RemixIT is based on a continuous self-training scheme in which a pre-trained teacher model on out-of-domain data infers estimated pseudo-target signals for in-domain mixtures.

Ranked #4 on Speech Enhancement on Deep Noise Suppression (DNS) Challenge

Speech Enhancement Unsupervised Domain Adaptation

Paper
Code

SAQAM: Spatial Audio Quality Assessment Metric

no code implementations • 24 Jun 2022 • Pranay Manocha, Anurag Kumar, Buye Xu, Anjali Menon, Israel D. Gebru, Vamsi K. Ithapu, Paul Calamia

Audio quality assessment is critical for assessing the perceptual realism of sounds.

Multi-Task Learning Speech Enhancement

Paper
Add Code

Spatially Selective Active Noise Control Systems

no code implementations • 22 Aug 2022 • Tong Xiao, Buye Xu, Chuming Zhao

In this work, we propose a multi-channel ANC system that only reduces sound from undesired directions, and the system truly preserves the desired sound instead of reproducing it.

Paper
Add Code

Leveraging Heteroscedastic Uncertainty in Learning Complex Spectral Mapping for Single-channel Speech Enhancement

no code implementations • 16 Nov 2022 • Kuan-Lin Chen, Daniel D. E. Wong, Ke Tan, Buye Xu, Anurag Kumar, Vamsi Krishna Ithapu

During training, our approach augments a model learning complex spectral mapping with a temporary submodel to predict the covariance of the enhancement error at each time-frequency bin.

Speech Enhancement

Paper
Add Code

LA-VocE: Low-SNR Audio-visual Speech Enhancement using Neural Vocoders

no code implementations • 20 Nov 2022 • Rodrigo Mira, Buye Xu, Jacob Donley, Anurag Kumar, Stavros Petridis, Vamsi Krishna Ithapu, Maja Pantic

Audio-visual speech enhancement aims to extract clean speech from a noisy environment by leveraging not only the audio itself but also the target speaker's lip movements.

Speech Enhancement Speech Synthesis

Paper
Add Code

Rethinking complex-valued deep neural networks for monaural speech enhancement

no code implementations • 11 Jan 2023 • Haibin Wu, Ke Tan, Buye Xu, Anurag Kumar, Daniel Wong

By comparing complex- and real-valued versions of fundamental building blocks in the recently developed gated convolutional recurrent network (GCRN), we show how different mechanisms for basic blocks affect the performance.

Open-Ended Question Answering Speech Enhancement

Paper
Add Code

TorchAudio-Squim: Reference-less Speech Quality and Intelligibility measures in TorchAudio

no code implementations • 4 Apr 2023 • Anurag Kumar, Ke Tan, Zhaoheng Ni, Pranay Manocha, Xiaohui Zhang, Ethan Henderson, Buye Xu

To enable this, a variety of metrics to measure quality and intelligibility under different assumptions have been developed.

Paper
Add Code

A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement

no code implementations • 3 Mar 2024 • Ravi Shankar, Ke Tan, Buye Xu, Anurag Kumar

Self-supervised learned models have been found to be very effective for certain speech tasks such as automatic speech recognition, speaker identification, keyword spotting and others.

Automatic Speech Recognition Keyword Spotting +5

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.