Speaker Verification

170 papers with code • 5 benchmarks • 6 datasets

Speaker verification is the task of verifying a person's claimed identity from characteristics of their voice.
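
As a rough illustration of the task, many systems map each utterance to a fixed-length embedding and accept an identity claim when the enrollment and test embeddings are similar enough. The sketch below assumes cosine similarity as the score; the embedding values, the `verify` helper, and the 0.7 threshold are hypothetical placeholders, not taken from any model or benchmark listed here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(enrolled, test, threshold=0.7):
    """Accept the identity claim if the two embeddings are similar enough.

    The default threshold is an arbitrary placeholder; in practice it is
    tuned on held-out verification trials.
    """
    return cosine_similarity(enrolled, test) >= threshold

# Toy 3-dimensional "embeddings" (hypothetical values, not from a real model).
enrolled = [0.9, 0.1, 0.4]
same_speaker = [0.8, 0.2, 0.5]
other_speaker = [-0.3, 0.9, 0.1]

print(verify(enrolled, same_speaker))   # prints True
print(verify(enrolled, other_speaker))  # prints False
```

Real embeddings typically have hundreds of dimensions, and the decision threshold trades false acceptances against false rejections.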

( Image credit: Contrastive-Predictive-Coding-PyTorch )

Benchmarks

These leaderboards track progress in speaker verification.

Dataset	Best Model
VoxCeleb	WavLM+ECAPA-TDNN
CN-CELEB	X-Vectors with Attention Backend
CALLHOME	GE2E
VoxCeleb1	SpeechNAS
VoxCeleb2	ResNet-50

Libraries

Use these libraries to find speaker verification models and implementations.

Library	Papers	Stars
PaddlePaddle/PaddleSpeech	5	10,151
alibaba-damo-academy/3D-Speaker	4	711
Jungjee/RawNet	4	332
CorentinJ/Real-Time-Voice-Cloning	2	50,752

See all 9 libraries.

Most implemented papers

Generalized End-to-End Loss for Speaker Verification

CorentinJ/Real-Time-Voice-Cloning • 28 Oct 2017

In this paper, we propose a new loss function called generalized end-to-end (GE2E) loss, which makes the training of speaker verification models more efficient than our previous tuple-based end-to-end (TE2E) loss function.
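
To make the idea concrete, here is a minimal sketch of the softmax variant of the GE2E loss under stated assumptions: each utterance embedding is scored against every speaker centroid with a scaled cosine similarity, the utterance is left out of its own speaker's centroid, and a softmax cross-entropy pulls it toward its true speaker. The function names and toy data are hypothetical. In the paper the scale `w` and bias `b` are learned; here they are fixed to illustrative values, and every speaker is assumed to contribute at least two utterances.

```python
import math

def _normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def _mean(vectors):
    count = len(vectors)
    return [sum(col) / count for col in zip(*vectors)]

def _cos(a, b):
    a, b = _normalize(a), _normalize(b)
    return sum(x * y for x, y in zip(a, b))

def ge2e_softmax_loss(batch, w=10.0, b=-5.0):
    """batch[j][i] is the embedding of utterance i from speaker j.

    w and b are fixed here for illustration; they are trainable in practice.
    """
    batch = [[_normalize(e) for e in utts] for utts in batch]
    centroids = [_mean(utts) for utts in batch]
    total = 0.0
    for j, utts in enumerate(batch):
        for i, e in enumerate(utts):
            scores = []
            for k, centroid in enumerate(centroids):
                if k == j:
                    # Leave the utterance out of its own speaker's centroid.
                    centroid = _mean(utts[:i] + utts[i + 1:])
                scores.append(w * _cos(e, centroid) + b)
            # Softmax cross-entropy with the true speaker j as the target.
            total += math.log(sum(math.exp(s) for s in scores)) - scores[j]
    return total

# Two speakers, two utterances each (hypothetical 2-D embeddings).
separated = [[[1.0, 0.0], [0.9, 0.1]], [[0.0, 1.0], [0.1, 0.9]]]
mixed = [[[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]]
print(ge2e_softmax_loss(separated) < ge2e_softmax_loss(mixed))  # prints True
```

The loss is small when each speaker's utterances cluster around their own centroid and large when speakers overlap, which is the behavior the training objective rewards.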

Speaker Recognition from Raw Waveform with SincNet

mravanelli/SincNet • 29 Jul 2018

Rather than relying on standard hand-crafted features, CNNs fed with raw speech samples learn low-level speech representations directly from the waveform, potentially allowing the network to better capture important narrow-band speaker characteristics such as pitch and formants.
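
SincNet constrains its first convolutional layer to band-pass filters defined by a low and a high cutoff frequency, so only those two values need to be learned per filter. The sketch below builds one such filter as the difference of two windowed sinc low-pass responses. It is a simplified illustration, not the repository's implementation: the cutoffs are fixed rather than learned, frequencies are in cycles per sample, and the helper names are hypothetical.

```python
import math

def _lowpass_tap(cutoff, n):
    """Ideal low-pass impulse response 2*fc*sinc(2*pi*fc*n), with the n = 0 limit."""
    if n == 0:
        return 2.0 * cutoff
    return math.sin(2.0 * math.pi * cutoff * n) / (math.pi * n)

def sinc_bandpass(f_low, f_high, num_taps=101):
    """Hamming-windowed band-pass FIR filter; num_taps must be odd."""
    half = num_taps // 2
    taps = []
    for n in range(-half, half + 1):
        ideal = _lowpass_tap(f_high, n) - _lowpass_tap(f_low, n)
        window = 0.54 + 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def gain(taps, freq):
    """Magnitude response of a symmetric filter at one normalized frequency."""
    half = len(taps) // 2
    return abs(sum(t * math.cos(2.0 * math.pi * freq * (i - half))
                   for i, t in enumerate(taps)))

taps = sinc_bandpass(0.10, 0.20)
print(gain(taps, 0.15) > 0.9)  # inside the pass band: prints True
print(gain(taps, 0.35) < 0.1)  # outside the pass band: prints True
```

Because each filter is described by just two numbers, a narrow band around a pitch or formant region can be selected with far fewer parameters than a free-form convolution kernel.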

Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis

CorentinJ/Real-Time-Voice-Cloning • NeurIPS 2018

Clone a voice in 5 seconds to generate arbitrary speech in real-time

ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification

PaddlePaddle/PaddleSpeech • 10 Aug 2020

The successful x-vector architecture is a Time Delay Neural Network (TDNN) that applies statistics pooling to project variable-length utterances into fixed-length speaker characterizing embeddings.
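
Statistics pooling, as described above, can be sketched in a few lines: the frame-level features of an utterance are summarized by their per-dimension mean and standard deviation, so the output length depends only on the feature dimension and not on the number of frames. The function name and toy inputs are hypothetical, and the population standard deviation is an assumption made for simplicity.

```python
import math

def statistics_pooling(frames):
    """Pool T frame vectors of dimension D into one vector of length 2 * D.

    The result is the per-dimension mean followed by the per-dimension
    (population) standard deviation.
    """
    count = len(frames)
    columns = list(zip(*frames))
    means = [sum(col) / count for col in columns]
    stds = [math.sqrt(sum((x - m) ** 2 for x in col) / count)
            for col, m in zip(columns, means)]
    return means + stds

short_utt = [[1.0, 2.0], [3.0, 4.0]]
long_utt = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

print(statistics_pooling(short_utt))  # prints [2.0, 3.0, 1.0, 1.0]
print(len(statistics_pooling(short_utt)), len(statistics_pooling(long_utt)))  # prints 4 4
```

Both utterances map to a vector of the same length even though they contain different numbers of frames, which is what lets later layers operate on a fixed-size input.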

Text-Independent Speaker Verification Using 3D Convolutional Neural Networks

astorfi/3D-convolutional-speaker-recognition • 26 May 2017

In our paper, we propose adaptive feature learning that uses 3D-CNNs to create speaker models directly: in both the development and enrollment phases, an identical number of spoken utterances per speaker is fed to the network to represent the speaker's utterances and create the speaker model.

An Unsupervised Autoregressive Model for Speech Representation Learning

iamyuanchung/Autoregressive-Predictive-Coding • 5 Apr 2019

This paper proposes a novel unsupervised autoregressive neural model for learning generic speech representations.

Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning

s3prl/s3prl • 5 Jun 2020

To explore this issue, we proposed to employ Mockingjay, a self-supervised learning based model, to protect anti-spoofing models against adversarial attacks in the black-box scenario.

Speaker Diarization with LSTM

wq2012/SpectralCluster • 28 Oct 2017

For many years, i-vector based audio embedding techniques were the dominant approach for speaker verification and speaker diarization applications.

ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection

lixucuhk/ASV-anti-spoofing-with-Res2Net • 14 Apr 2019

ASVspoof, now in its third edition, is a series of community-led challenges which promote the development of countermeasures to protect automatic speaker verification (ASV) from the threat of spoofing.

RawNet: Advanced end-to-end deep neural network using raw waveforms for text-independent speaker verification

Jungjee/RawNet • 17 Apr 2019

In this study, we explore end-to-end deep neural networks that input raw waveforms to improve various aspects: front-end speaker embedding extraction including model architecture, pre-training scheme, additional objective functions, and back-end classification.
