Speaker Recognition

90 papers with code • 1 benchmark • 6 datasets

Speaker recognition is the process of identifying or confirming a person's identity from segments of their speech.

Source: Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition
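The two tasks named in the definition, identifying a speaker and confirming a claimed identity, can be illustrated with a minimal sketch. It assumes each speech segment has already been mapped to a fixed-length embedding by some model; the toy vectors, the speaker names, and the 0.7 threshold are hypothetical, and cosine scoring is used only as one common choice, not as the method of any paper listed here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(enrolled_embedding, test_embedding, threshold=0.7):
    """Verification: accept the claimed identity if the score reaches the threshold.

    The default threshold is an arbitrary illustrative value; real systems
    tune it on held-out trials.
    """
    return cosine_similarity(enrolled_embedding, test_embedding) >= threshold


def identify(enrolled_speakers, test_embedding):
    """Identification: return the enrolled speaker with the highest score."""
    return max(
        enrolled_speakers,
        key=lambda name: cosine_similarity(enrolled_speakers[name], test_embedding),
    )


# Hypothetical 3-dimensional embeddings; real embeddings have hundreds of dimensions.
enrolled = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.1, 0.9, 0.2],
}
test_embedding = [0.8, 0.2, 0.1]

print(identify(enrolled, test_embedding))
print(verify(enrolled["alice"], test_embedding))
print(verify(enrolled["bob"], test_embedding))
```

Identification picks the best match among enrolled speakers, while verification makes a yes/no decision against a single claimed speaker.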

3D-Speaker-Toolkit: An Open Source Toolkit for Multi-modal Speaker Verification and Diarization

alibaba-damo-academy/3D-Speaker 29 Mar 2024

This paper introduces 3D-Speaker-Toolkit, an open source toolkit for multi-modal speaker verification and diarization.

709

ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models

espnet/espnet 30 Jan 2024

First, we provide an open-source platform for researchers in the speaker recognition community to effortlessly build models.

7,871

Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews

bandas-center/atrain 18 Oct 2023

If an entry-level graphics card is available, the transcription speed increases to 20% of the audio duration.

95

Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition

wenet-e2e/wespeaker 21 Sep 2023

Current speaker recognition systems primarily rely on supervised approaches, constrained by the scale of labeled datasets.

534

SLMIA-SR: Speaker-Level Membership Inference Attacks against Speaker Recognition Systems

s3l-official/slmia-sr 14 Sep 2023

Our attack is versatile and can work in both white-box and black-box scenarios.

4

SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?

ashi-ta/speechglue 14 Jun 2023

Self-supervised learning (SSL) for speech representation has been successfully applied in various downstream tasks, such as speech and speaker recognition.

13

Can Self-Supervised Neural Representations Pre-Trained on Human Speech distinguish Animal Callers?

idiap/ssl-caller-detection 23 May 2023

Self-supervised learning (SSL) models use only the intrinsic structure of a given signal, independent of its acoustic domain, to extract essential information from the input to an embedding space.

6

Vocal Style Factorization for Effective Speaker Recognition in Affective Scenarios

morganlee123/evector 13 May 2023

The accuracy of automated speaker recognition is negatively impacted by change in emotions in a person's speech.

2

VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge

jaesunghuh/voxsrc2022 20 Feb 2023

This paper summarises the findings from the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC-22), which was held in conjunction with INTERSPEECH 2022.

17

Probabilistic Back-ends for Online Speaker Recognition and Clustering

sholokhovalexey/online-speaker-clustering 19 Feb 2023

This paper focuses on multi-enrollment speaker recognition which naturally occurs in the task of online speaker clustering, and studies the properties of different scoring back-ends in this scenario.

8