Speaker Identification

61 papers with code • 4 benchmarks • 4 datasets


Benchmarks


These leaderboards track progress in Speaker Identification.

Dataset	Best Model
VoxCeleb1	MSM-MAE
EVI en-GB	Fuzzy Retrieval
EVI pl-PL	Fuzzy Retrieval
EVI fr-FR	Fuzzy Retrieval


Most implemented papers


On Learning Associations of Faces and Voices

changil/facevoice • 15 May 2018

We computationally model the overlapping information between faces and voices and show that the learned cross-modal representation contains enough information to identify matching faces and voices with performance similar to that of humans.

Latent space representation for multi-target speaker detection and identification with a sparse dataset using Triplet neural networks

KinWaiCheuk/MCE2018 • 1 Oct 2019

When the training data is reduced to the train set only, our method yields 309 confusions on the multi-target speaker identification task, which is 46% better than the baseline model.

Delving into VoxCeleb: environment invariant speaker recognition

theolepage/sslsv • 24 Oct 2019

Research in speaker recognition has recently seen significant progress due to the application of neural network models and the availability of new large-scale datasets.

Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam

jyhan03/channel-decorrelation • 23 Jan 2020

First, we propose a time-domain implementation of SpeakerBeam similar to that proposed for a time-domain audio separation network (TasNet), which has achieved state-of-the-art performance for speech separation.

Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs

seongmin-kye/meta-SR • 6 Apr 2020

By combining these two learning schemes, our model outperforms existing state-of-the-art speaker verification models trained with a standard supervised learning framework on short utterances (1-2 seconds) from the VoxCeleb datasets.

Identify Speakers in Cocktail Parties with End-to-End Attention

JunzheJosephZhu/Identify-Speakers-in-Cocktail-Parties-with-E2E-Attention • 22 May 2020

In scenarios where multiple speakers talk at the same time, it is important to be able to identify the talkers accurately.

audino: A Modern Annotation Tool for Audio and Speech

midas-research/audino • 9 Jun 2020

The tool allows audio data and their corresponding annotations to be uploaded and assigned to a user through a key-based API.

Investigation of End-To-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings

NaoyukiKanda/LibriSpeechMix • 11 Aug 2020

However, the model required prior knowledge of speaker profiles to perform speaker identification, which significantly limited the application of the model.

Sum-Product Networks for Robust Automatic Speaker Identification

anicolson/SPN-ASI • 13 Aug 2020

Though current SPN toolkits and learning algorithms are in their infancy, we aim to show that SPNs have the potential to become a useful tool for robust speech processing in the future.

Compositional embedding models for speaker identification and diarization with simultaneous speech from 2+ speakers

lizeqian/Compositional-embedding-for-speaker-diarization • 22 Oct 2020

We propose a new method for speaker diarization that can handle overlapping speech with 2+ people.
