Search Results for author: Mingsian R. Bai

Found 9 papers, 0 papers with code

Spatial-Temporal Activity-Informed Diarization and Separation

no code implementations • 30 Jan 2024 • Yicheng Hsu, Ssuhan Chen, Mingsian R. Bai

The global spatial activity functions are computed from the global spatial coherence functions based on frequency-averaged local spatial activity functions.

Speaker Diarization +1

Learning-based Array Configuration-Independent Binaural Audio Telepresence with Scalable Signal Enhancement and Ambience Preservation

no code implementations • 21 Nov 2023 • Yicheng Hsu, Mingsian R. Bai

The results have shown that the proposed BAT system can achieve superior telepresence performance with the desired balance between signal enhancement and ambience preservation, even when the array configurations are unseen in the training phase.

Deep Beamforming for Speech Enhancement and Speaker Localization with an Array Response-Aware Loss Function

no code implementations • 19 Oct 2023 • HsinYu Chang, Yicheng Hsu, Mingsian R. Bai

Experimental results have shown that the proposed deep beamformer, trained with the linearly weighted scale-invariant source-to-noise ratio (SI-SNR) and ARROW loss functions, achieves superior performance in speech enhancement and speaker localization compared to two baselines.

Speech Enhancement

Array Configuration-Agnostic Personal Voice Activity Detection Based on Spatial Coherence

no code implementations • 18 Apr 2023 • Yicheng Hsu, Mingsian R. Bai

Personal voice activity detection has received increased attention due to the growing popularity of personal mobile devices and smart speakers.

Action Detection, Activity Detection +1

Array Configuration-Agnostic Personalized Speech Enhancement using Long-Short-Term Spatial Coherence

no code implementations • 16 Nov 2022 • Yicheng Hsu, Yonghan Lee, Mingsian R. Bai

Personalized speech enhancement has been a field of active research for suppression of speechlike interferers such as competing speakers or TV dialogues.

Speech Enhancement

Model-matching Principle Applied to the Design of an Array-based All-neural Binaural Rendering System for Audio Telepresence

no code implementations • 20 Oct 2022 • Yicheng Hsu, Chenghumg Ma, Mingsian R. Bai

Telepresence aims to create an immersive but virtual experience of the audio and visual scene at the far end for users at the near end.

Multi-channel target speech enhancement based on ERB-scaled spatial coherence features

no code implementations • 17 Jul 2022 • Yicheng Hsu, Yonghan Lee, Mingsian R. Bai

Recently, speech enhancement technologies that are based on deep learning have received considerable research attention.

Speech Enhancement

Multi-channel end-to-end neural network for speech enhancement, source localization, and voice activity detection

no code implementations • 20 Jun 2022 • Yuan Chen, Yicheng Hsu, Mingsian R. Bai

In this study, a neural beamformer consisting of a beamformer and a novel multi-channel DCCRN is proposed for speech enhancement and source localization.

Action Detection, Activity Detection +1

Learning-based personal speech enhancement for teleconferencing by exploiting spatial-spectral features

no code implementations • 10 Dec 2021 • Yicheng Hsu, Yonghan Lee, Mingsian R. Bai

Furthermore, the proposed enhancement system was compared with a baseline system with speaker embeddings and interchannel phase difference.

Speech Enhancement, Speech Extraction
