Search Results for author: Mingsian R. Bai

Found 9 papers, 0 papers with code

Spatial-Temporal Activity-Informed Diarization and Separation

no code implementations • 30 Jan 2024 • Yicheng Hsu, Ssuhan Chen, Mingsian R. Bai

The global spatial activity functions are computed from the global spatial coherence functions based on frequency-averaged local spatial activity functions.

Speaker Diarization +1

Learning-based Array Configuration-Independent Binaural Audio Telepresence with Scalable Signal Enhancement and Ambience Preservation

no code implementations • 21 Nov 2023 • Yicheng Hsu, Mingsian R. Bai

The results have shown that the proposed BAT system can achieve superior telepresence performance with the desired balance between signal enhancement and ambience preservation, even when the array configurations are unseen in the training phase.

Deep Beamforming for Speech Enhancement and Speaker Localization with an Array Response-Aware Loss Function

no code implementations • 19 Oct 2023 • HsinYu Chang, Yicheng Hsu, Mingsian R. Bai

Experimental results have shown that the proposed deep beamformer, trained with the linearly weighted scale-invariant source-to-noise ratio (SI-SNR) and ARROW loss functions, achieves superior performance in speech enhancement and speaker localization compared to two baselines.

Speech Enhancement

Array Configuration-Agnostic Personal Voice Activity Detection Based on Spatial Coherence

no code implementations • 18 Apr 2023 • Yicheng Hsu, Mingsian R. Bai

Personal voice activity detection has received increased attention due to the growing popularity of personal mobile devices and smart speakers.

Action Detection, Activity Detection +1

Array Configuration-Agnostic Personalized Speech Enhancement using Long-Short-Term Spatial Coherence

no code implementations • 16 Nov 2022 • Yicheng Hsu, Yonghan Lee, Mingsian R. Bai

Personalized speech enhancement has been a field of active research for suppression of speechlike interferers such as competing speakers or TV dialogues.

Speech Enhancement

Model-matching Principle Applied to the Design of an Array-based All-neural Binaural Rendering System for Audio Telepresence

no code implementations • 20 Oct 2022 • Yicheng Hsu, Chenghumg Ma, Mingsian R. Bai

Telepresence aims to create an immersive but virtual experience of the audio and visual scene at the far end for users at the near end.

Multi-channel target speech enhancement based on ERB-scaled spatial coherence features

no code implementations • 17 Jul 2022 • Yicheng Hsu, Yonghan Lee, Mingsian R. Bai

Recently, speech enhancement technologies that are based on deep learning have received considerable research attention.

Speech Enhancement

Multi-channel end-to-end neural network for speech enhancement, source localization, and voice activity detection

no code implementations • 20 Jun 2022 • Yuan Chen, Yicheng Hsu, Mingsian R. Bai

In this study, a neural beamformer consisting of a beamformer and a novel multi-channel DCCRN is proposed for speech enhancement and source localization.

Action Detection, Activity Detection +1

Learning-based personal speech enhancement for teleconferencing by exploiting spatial-spectral features

no code implementations • 10 Dec 2021 • Yicheng Hsu, Yonghan Lee, Mingsian R. Bai

Furthermore, the proposed enhancement system was compared with a baseline system with speaker embeddings and interchannel phase difference.

Speech Enhancement, Speech Extraction
