Speaker Identification

61 papers with code • 4 benchmarks • 4 datasets

Benchmarks

These leaderboards are used to track progress in Speaker Identification.

Dataset	Best Model
VoxCeleb1	MSM-MAE
EVI en-GB	Fuzzy Retrieval
EVI pl-PL	Fuzzy Retrieval
EVI fr-FR	Fuzzy Retrieval

Most implemented papers

Word-level Embeddings for Cross-Task Transfer Learning in Speech Processing

bepierre/SpeechVGG • 22 Oct 2019

Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer.

Generative Pre-Training for Speech with Autoregressive Predictive Coding

iamyuanchung/Autoregressive-Predictive-Coding • 23 Oct 2019

Learning meaningful and general representations from unannotated speech that are applicable to a wide range of tasks remains challenging.

Speech2Phone: A Novel and Efficient Method for Training Speaker Recognition Models

Edresson/Speech2Phone • 25 Feb 2020

We compare the three best architectures trained using our method and select the best one, which is the shallow architecture.

Contrastive Learning of General-Purpose Audio Representations

google-research/google-research • 21 Oct 2020

We introduce COLA, a self-supervised pre-training approach for learning a general-purpose representation of audio.

FoolHD: Fooling speaker identification by Highly imperceptible adversarial Disturbances

fsept11/FoolHD • 17 Nov 2020

Speaker identification models are vulnerable to carefully designed adversarial perturbations of their input signals that induce misclassification.

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations

facebookresearch/speech-resynthesis • 1 Apr 2021

We propose using self-supervised discrete representations for the task of speech resynthesis.

SSAST: Self-Supervised Audio Spectrogram Transformer

YuanGongND/ssast • 19 Oct 2021

Pure Transformer models tend to require more training data than CNNs, and the success of the Audio Spectrogram Transformer (AST) relies on supervised pretraining that requires a large amount of labeled data and a complex training pipeline, which limits the practical usage of AST.

PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit

PaddlePaddle/PaddleSpeech • NAACL (ACL) 2022

PaddleSpeech is an open-source all-in-one speech toolkit.

A Generative Product-of-Filters Model of Audio

dawenl/pof • 20 Dec 2013

We propose the product-of-filters (PoF) model, a generative model that decomposes audio spectra as sparse linear combinations of "filters" in the log-spectral domain.

A domain-agnostic approach for opinion prediction on speech

UKPLab/coling-peoples2016-opinion-prediction • WS 2016

We explore a domain-agnostic approach for analyzing speech with the goal of opinion prediction.

