Target Speaker Extraction
6 papers with code • 0 benchmarks • 0 datasets
Extract the speech of a specified target speaker from a multi-speaker conversation.
Most implemented papers
Target Speaker Verification with Selective Auditory Attention for Single and Multi-talker Speech
Inspired by studies on target speaker extraction, e.g., SpEx, we propose a unified speaker verification framework for both single- and multi-talker speech that can pay selective auditory attention to the target speaker.
Selective Listening by Synchronizing Speech with Lips
A speaker extraction algorithm seeks to extract the speech of a target speaker from a multi-talker speech mixture when given a cue that represents the target speaker, such as a pre-enrolled speech utterance, or an accompanying video track.
L-SpEx: Localized Target Speaker Extraction
Speaker extraction aims to extract the target speaker's voice from a multi-talker speech mixture given an auxiliary reference utterance.
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction
We propose a hybrid continuity loss function for time-domain speaker extraction algorithms to address the over-suppression problem.
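For context, time-domain separation and extraction models are often trained with scale-invariant signal-to-distortion ratio (SI-SDR) as the objective. The sketch below computes SI-SDR with NumPy only to illustrate what a time-domain loss measures; it is not the hybrid continuity loss from the paper above, whose exact form is not given here. The function name `si_sdr` and the test signals are illustrative assumptions.

```python
import numpy as np

def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB between two 1-D signals of equal length.

    Illustrative sketch only; NOT the hybrid continuity loss.
    """
    # Remove the mean so a constant offset does not affect the score.
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to obtain the scaled reference.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    projection = scale * target
    # Everything not explained by the target counts as distortion.
    noise = estimate - projection
    return 10.0 * np.log10((np.sum(projection ** 2) + eps) / (np.sum(noise ** 2) + eps))

# Hypothetical signals: one second of noise-like "speech" at 16 kHz.
rng = np.random.default_rng(0)
target = rng.standard_normal(16000)
slightly_noisy = target + 0.1 * rng.standard_normal(16000)
very_noisy = target + 1.0 * rng.standard_normal(16000)

score_clean = si_sdr(slightly_noisy, target)
score_noisy = si_sdr(very_noisy, target)
score_rescaled = si_sdr(3.0 * slightly_noisy, target)
```

A training loss would typically be the negative of this value. Because the score is invariant to rescaling the estimate, it does not by itself penalize an output that is too quiet overall, which is one reason such objectives are often combined with additional terms.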
ImagineNET: Target Speaker Extraction with Intermittent Visual Cue through Embedding Inpainting
In this paper, we study audio-visual speaker extraction algorithms with intermittent visual cues.
GPU-accelerated Guided Source Separation for Meeting Transcription
In this paper, we describe our improved implementation of GSS that leverages the power of modern GPU-based pipelines, including batched processing of frequencies and segments, to provide 300x speed-up over CPU-based inference.
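As a rough illustration of what "batched processing of frequencies" can mean, the sketch below computes a mask-weighted spatial covariance matrix for every frequency bin, first with a per-frequency loop and then in a single batched call. This is an assumption-laden toy example in NumPy on the CPU, not the paper's GPU implementation; the function names, array shapes, and the choice of covariance estimation as the batched step are all hypothetical.

```python
import numpy as np

def covariance_loop(stft, mask):
    """Per-frequency loop. stft: (F, C, T) complex, mask: (F, T) real."""
    n_freq, n_chan, _ = stft.shape
    out = np.zeros((n_freq, n_chan, n_chan), dtype=complex)
    for f in range(n_freq):
        weighted = stft[f] * mask[f]  # (C, T), mask broadcast over channels
        out[f] = weighted @ stft[f].conj().T / mask[f].sum()
    return out

def covariance_batched(stft, mask):
    """Same computation for all frequencies at once via einsum."""
    weighted = stft * mask[:, None, :]  # (F, C, T)
    cov = np.einsum("fct,fdt->fcd", weighted, stft.conj())
    return cov / mask.sum(axis=-1)[:, None, None]

# Hypothetical sizes: 257 frequency bins, 4 microphones, 100 frames.
rng = np.random.default_rng(0)
stft = rng.standard_normal((257, 4, 100)) + 1j * rng.standard_normal((257, 4, 100))
mask = rng.uniform(0.1, 1.0, size=(257, 100))

cov_loop = covariance_loop(stft, mask)
cov_batched = covariance_batched(stft, mask)
```

Both functions return the same result; the batched form removes the Python-level loop, which is the kind of restructuring that maps well onto array libraries with GPU backends.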