Speaker Identification

47 papers with code • 4 benchmarks • 5 datasets

Most implemented papers

Speaker Recognition from Raw Waveform with SincNet

mravanelli/SincNet 29 Jul 2018

Rather than employing standard hand-crafted features, CNNs fed with raw speech samples learn low-level speech representations directly from waveforms, potentially allowing the network to better capture important narrow-band speaker characteristics such as pitch and formants.
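As background not stated in the excerpt above: SincNet is known for building its first layer from band-pass filters defined by sinc functions, each described by only a low and a high cutoff frequency. The sketch below is not the repository's code. It builds one such filter with fixed cutoffs in plain Python to show the shape of the idea. The function names, the 16 kHz sample rate, the 101-tap length, and the 300 to 3400 Hz band are illustrative assumptions.

```python
import cmath
import math


def sinc_bandpass(low_hz, high_hz, sample_rate=16000, num_taps=101):
    """Windowed-sinc band-pass filter: the difference of two ideal low-pass filters.

    Only the two cutoffs define the filter. In a SincNet-style layer they would
    be learned; here they are fixed (illustrative assumption). num_taps should
    be odd so the filter has a centre tap.
    """
    if not 0 < low_hz < high_hz < sample_rate / 2:
        raise ValueError("need 0 < low_hz < high_hz < sample_rate / 2")
    f1 = low_hz / sample_rate   # normalized cutoffs, in cycles per sample
    f2 = high_hz / sample_rate
    half = (num_taps - 1) // 2
    taps = []
    for k in range(num_taps):
        n = k - half
        if n == 0:
            ideal = 2.0 * (f2 - f1)  # limit of the sinc difference at n = 0
        else:
            ideal = (math.sin(2 * math.pi * f2 * n)
                     - math.sin(2 * math.pi * f1 * n)) / (math.pi * n)
        # Hamming window to smooth the truncation of the ideal response
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def gain(taps, freq_hz, sample_rate=16000):
    """Magnitude of the filter's frequency response at freq_hz."""
    return abs(sum(h * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate)
                   for k, h in enumerate(taps)))


taps = sinc_bandpass(300.0, 3400.0)
passband_gain = gain(taps, 1500.0)   # inside the band
stopband_gain = gain(taps, 6000.0)   # well outside the band
```

A frequency inside the band passes almost unchanged while one far outside is strongly attenuated, which is how a narrow band can isolate cues such as pitch or a single formant.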

Deep Speaker: an End-to-End Neural Speaker Embedding System

philipperemy/deep-speaker 5 May 2017

We present Deep Speaker, a neural speaker embedding system that maps utterances to a hypersphere where speaker similarity is measured by cosine similarity.
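The idea of comparing speakers by cosine similarity on a hypersphere can be illustrated with a minimal sketch. This is not the Deep Speaker implementation: the function names and the three-dimensional embeddings are hypothetical, and real systems use embeddings with hundreds of dimensions produced by a trained network.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so it lies on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine of the angle between two embeddings (1.0 means same direction)."""
    a_unit, b_unit = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_unit, b_unit))


def identify(query, enrolled):
    """Return (speaker, score) for the enrolled embedding closest to the query."""
    scored = ((name, cosine_similarity(query, emb)) for name, emb in enrolled.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical enrolled speakers and query embedding, for illustration only.
enrolled = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.0, 0.2, 0.95],
}
best_speaker, best_score = identify([0.8, 0.2, 0.1], enrolled)
```

Because every embedding is normalized first, only its direction matters, so identification reduces to picking the enrolled speaker with the smallest angle to the query.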

AM-MobileNet1D: A Portable Model for Speaker Recognition

joaoantoniocn/AM-MobileNet1D 31 Mar 2020

To address this demand, we propose a portable model called Additive Margin MobileNet1D (AM-MobileNet1D) for speaker identification on mobile devices.
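The "additive margin" in the model name presumably refers to the additive margin softmax loss, which subtracts a fixed margin from the target-class cosine before scaling and applying softmax. The sketch below shows that loss for a single sample, assuming the cosines between the embedding and each class weight vector are already computed. The function name and the values scale=30 and margin=0.35 are illustrative assumptions, not settings taken from the paper or repository.

```python
import math


def am_softmax_loss(cosines, target, scale=30.0, margin=0.35):
    """Additive margin softmax loss for one sample.

    cosines: cosine similarity between the embedding and each class weight.
    target:  index of the true speaker.
    scale, margin: illustrative hyperparameters (assumed values).
    """
    logits = [
        scale * (c - margin) if j == target else scale * c
        for j, c in enumerate(cosines)
    ]
    # Numerically stable log-sum-exp: subtract the largest logit first.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]  # negative log-probability of the target


# Hypothetical cosines for a three-speaker problem; speaker 0 is the target.
cosines = [0.8, 0.3, -0.1]
loss_with_margin = am_softmax_loss(cosines, target=0)
loss_without_margin = am_softmax_loss(cosines, target=0, margin=0.0)
```

With the margin, the same cosines give a larger loss, so training has to push the target cosine further above the others before the loss becomes small.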

AutoSpeech: Neural Architecture Search for Speaker Recognition

TAMU-VITA/AutoSpeech 7 May 2020

Speaker recognition systems based on Convolutional Neural Networks (CNNs) are often built with off-the-shelf backbones such as VGG-Net or ResNet.

Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation

s3prl/s3prl 18 May 2020

We use the representations with two downstream tasks: speaker identification and phoneme classification.

Learning Speaker Representations with Mutual Information

Js-Mim/rl_singing_voice 1 Dec 2018

Mutual Information (MI) or similar measures of statistical dependence are promising tools for learning these representations in an unsupervised way.

Word-level Embeddings for Cross-Task Transfer Learning in Speech Processing

bepierre/SpeechVGG 22 Oct 2019

Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer.

Generative Pre-Training for Speech with Autoregressive Predictive Coding

iamyuanchung/Autoregressive-Predictive-Coding 23 Oct 2019

Learning meaningful and general representations from unannotated speech that are applicable to a wide range of tasks remains challenging.

Contrastive Learning of General-Purpose Audio Representations

google-research/google-research 21 Oct 2020

We introduce COLA, a self-supervised pre-training approach for learning a general-purpose representation of audio.

FoolHD: Fooling speaker identification by Highly imperceptible adversarial Disturbances

fsept11/FoolHD 17 Nov 2020

Speaker identification models are vulnerable to carefully designed adversarial perturbations of their input signals that induce misclassification.