Search Results for author: Buye Xu

Found 12 papers, 4 papers with code

Rethinking complex-valued deep neural networks for monaural speech enhancement

no code implementations · 11 Jan 2023 · Haibin Wu, Ke Tan, Buye Xu, Anurag Kumar, Daniel Wong

By comparing complex- and real-valued versions of fundamental building blocks in the recently developed gated convolutional recurrent network (GCRN), we show how different mechanisms for basic blocks affect the performance.

Speech Enhancement

LA-VocE: Low-SNR Audio-visual Speech Enhancement using Neural Vocoders

no code implementations · 20 Nov 2022 · Rodrigo Mira, Buye Xu, Jacob Donley, Anurag Kumar, Stavros Petridis, Vamsi Krishna Ithapu, Maja Pantic

Audio-visual speech enhancement aims to extract clean speech from a noisy environment by leveraging not only the audio itself but also the target speaker's lip movements.

Speech Enhancement · Speech Synthesis

Leveraging Heteroscedastic Uncertainty in Learning Complex Spectral Mapping for Single-channel Speech Enhancement

no code implementations · 16 Nov 2022 · Kuan-Lin Chen, Daniel D. E. Wong, Ke Tan, Buye Xu, Anurag Kumar, Vamsi Krishna Ithapu

During training, our approach augments a model learning complex spectral mapping with a temporary submodel to predict the covariance of the enhancement error at each time-frequency bin.

Speech Enhancement
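The covariance-weighted training idea above can be illustrated with a simplified per-bin variance version of the loss (the paper models the full covariance of the complex enhancement error; the function name, shapes, and the scalar-variance simplification here are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def heteroscedastic_nll(pred, target, log_var):
    """Per-bin Gaussian negative log-likelihood (simplified sketch).

    pred, target: enhanced and clean spectrograms, shape (T, F)
    log_var: predicted log-variance of the enhancement error, shape (T, F)
    """
    err_sq = (pred - target) ** 2
    # The log-variance term penalizes overconfidence, while dividing the
    # squared error by the predicted variance down-weights bins the
    # submodel flags as uncertain.
    return np.mean(log_var + err_sq / np.exp(log_var))

# Sanity check: with log_var = 0 the loss reduces to plain MSE.
rng = np.random.default_rng(0)
pred = rng.normal(size=(4, 5))
target = rng.normal(size=(4, 5))
zeros = np.zeros((4, 5))
assert np.isclose(heteroscedastic_nll(pred, target, zeros),
                  np.mean((pred - target) ** 2))
```

In this formulation the temporary submodel only shapes the loss weighting during training and can be discarded at inference time.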

Spatially Selective Active Noise Control Systems

no code implementations · 22 Aug 2022 · Tong Xiao, Buye Xu, Chuming Zhao

We simulated the proposed algorithm with a microphone array mounted on a pair of augmented eyeglasses and compared it with existing methods from the literature.

RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing

1 code implementation · 17 Feb 2022 · Efthymios Tzinis, Yossi Adi, Vamsi Krishna Ithapu, Buye Xu, Paris Smaragdis, Anurag Kumar

RemixIT is based on a continuous self-training scheme in which a pre-trained teacher model on out-of-domain data infers estimated pseudo-target signals for in-domain mixtures.

Speech Enhancement · Unsupervised Domain Adaptation
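The bootstrapped-remixing scheme described above can be sketched as follows (the toy teacher, the waveform shapes, and the single-step framing are illustrative placeholders, not the paper's model or training loop):

```python
import numpy as np

def remix_step(teacher, mixtures, rng):
    """One bootstrapped-remixing step (simplified sketch).

    teacher: callable mapping a batch of mixtures (B, T) to a pair
             (estimated speech, estimated noise), each of shape (B, T)
    Returns new synthetic mixtures and their pseudo-target signals.
    """
    est_speech, est_noise = teacher(mixtures)
    # Permute the estimated noises across the batch and remix them with
    # the estimated speech, creating fresh in-domain training pairs for
    # the student without any ground-truth clean references.
    perm = rng.permutation(len(mixtures))
    new_mixtures = est_speech + est_noise[perm]
    return new_mixtures, est_speech

# Hypothetical stand-in for a pre-trained out-of-domain teacher model.
def toy_teacher(m):
    s = 0.5 * m
    return s, m - s

rng = np.random.default_rng(0)
mix = rng.normal(size=(8, 100))
new_mix, pseudo = remix_step(toy_teacher, mix, rng)
assert new_mix.shape == mix.shape and pseudo.shape == mix.shape
```

The student would then be trained to map `new_mixtures` back to the pseudo-targets, and in the continual scheme the teacher's weights are periodically refreshed from the student.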

Continual self-training with bootstrapped remixing for speech enhancement

1 code implementation · 19 Oct 2021 · Efthymios Tzinis, Yossi Adi, Vamsi K. Ithapu, Buye Xu, Anurag Kumar

Specifically, a separation teacher model is pre-trained on an out-of-domain dataset and is used to infer estimated target signals for a batch of in-domain mixtures.

Speech Enhancement · Unsupervised Domain Adaptation

NORESQA: A Framework for Speech Quality Assessment using Non-Matching References

1 code implementation · NeurIPS 2021 · Pranay Manocha, Buye Xu, Anurag Kumar

We show that neural networks trained using our framework produce scores that correlate well with subjective mean opinion scores (MOS) and are also competitive with methods such as DNSMOS, which explicitly relies on human MOS ratings for training networks.

Speech Enhancement

Incorporating Real-world Noisy Speech in Neural-network-based Speech Enhancement Systems

no code implementations · 11 Sep 2021 · Yangyang Xia, Buye Xu, Anurag Kumar

Supervised speech enhancement relies on parallel databases of degraded speech signals and their clean reference signals during training.

Speech Enhancement

Online Self-Attentive Gated RNNs for Real-Time Speaker Separation

no code implementations · 25 Jun 2021 · Ori Kabeli, Yossi Adi, Zhenyu Tang, Buye Xu, Anurag Kumar

Our stateful implementation for online separation leads to a minor drop in performance compared to the offline model: 0.8 dB for monaural inputs and 0.3 dB for binaural inputs, while reaching a real-time factor of 0.65.

Speaker Separation
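The real-time factor cited above is the standard ratio of processing time to audio duration; a value below 1 means the separator keeps up with the incoming stream. A minimal illustration (the example durations are made up to match the reported 0.65):

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF = time spent processing / duration of audio processed.

    RTF < 1 means the system runs faster than real time.
    """
    return processing_seconds / audio_seconds

# e.g. processing 10 s of audio in 6.5 s gives the reported RTF of 0.65
rtf = real_time_factor(6.5, 10.0)
assert abs(rtf - 0.65) < 1e-12
```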

DPLM: A Deep Perceptual Spatial-Audio Localization Metric

no code implementations · 29 May 2021 · Pranay Manocha, Anurag Kumar, Buye Xu, Anjali Menon, Israel D. Gebru, Vamsi K. Ithapu, Paul Calamia

Subjective evaluations are critical for assessing the perceptual realism of sounds in audio-synthesis driven technologies like augmented and virtual reality.

SAGRNN: Self-Attentive Gated RNN for Binaural Speaker Separation with Interaural Cue Preservation

1 code implementation · 2 Sep 2020 · Ke Tan, Buye Xu, Anurag Kumar, Eliya Nachmani, Yossi Adi

In addition, our approach effectively preserves the interaural cues, which improves the accuracy of sound localization.

Audio and Speech Processing · Sound
