Target Speaker Extraction

Extract the speech of a specified target speaker from a multi-talker conversation.

Target Speaker Verification with Selective Auditory Attention for Single and Multi-talker Speech

xuchenglin28/speaker_extraction 30 Mar 2021

Inspired by studies on target speaker extraction, e.g., SpEx, we propose a unified speaker verification framework for both single- and multi-talker speech that is able to pay selective auditory attention to the target speaker.

Selective Listening by Synchronizing Speech with Lips

zexupan/reentry 14 Jun 2021

A speaker extraction algorithm seeks to extract the speech of a target speaker from a multi-talker speech mixture when given a cue that represents the target speaker, such as a pre-enrolled speech utterance or an accompanying video track.

L-SpEx: Localized Target Speaker Extraction

gemengtju/l-spex 21 Feb 2022

Speaker extraction aims to extract the target speaker's voice from a multi-talker speech mixture given an auxiliary reference utterance.

A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction

zexupan/avse_hybrid_loss 31 Mar 2022

We propose a hybrid continuity loss function for time-domain speaker extraction algorithms to address the over-suppression problem.

ImagineNET: Target Speaker Extraction with Intermittent Visual Cue through Embedding Inpainting

zexupan/imaginenet 31 Oct 2022

In this paper, we study audio-visual speaker extraction algorithms with intermittent visual cues.

GPU-accelerated Guided Source Separation for Meeting Transcription

desh2608/gss 10 Dec 2022

In this paper, we describe our improved implementation of GSS that leverages the power of modern GPU-based pipelines, including batched processing of frequencies and segments, to provide 300x speed-up over CPU-based inference.