Text-Independent Speaker Recognition

6 papers with code • 0 benchmarks • 0 datasets

Most implemented papers

Unified Hypersphere Embedding for Speaker Recognition

MahdiHajibabaei/unified-embedding 22 Jul 2018

Incremental improvements in accuracy of Convolutional Neural Networks are usually achieved through use of deeper and more complex models trained on larger datasets.

Frame-level speaker embeddings for text-independent speaker recognition and analysis of end-to-end model

Splinter0/CoughCNN 12 Sep 2018

In this paper, we propose a Convolutional Neural Network (CNN) based speaker recognition model for extracting robust speaker embeddings.

Three-Dimensional Lip Motion Network for Text-Independent Speaker Recognition

wutong18/Three-Dimensional-Lip-Motion-Network-for-Text-Independent-Speaker-Recognition 13 Oct 2020

We present a novel end-to-end 3D lip motion network (3LMNet) that uses sentence-level 3D lip motion (S3DLM) to recognize speakers in both text-independent and text-dependent contexts.

Masked Proxy Loss For Text-Independent Speaker Verification

jlian2/Masked-Proxy-Loss-for-Text-Indepedent-Speaker-Verification 9 Nov 2020

We further propose Multinomial Masked Proxy (MMP) loss to leverage the hardness of speaker pairs.

SpeechNAS: Towards Better Trade-off between Latency and Accuracy for Large-Scale Speaker Verification

wentaozhu/speechnas 18 Sep 2021

The x-vector has recently become a successful and popular approach to speaker verification; it employs a time delay neural network (TDNN) and statistics pooling to extract speaker-characterizing embeddings from variable-length utterances.
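
The statistics pooling step mentioned in this excerpt can be illustrated with a minimal sketch. It is not the SpeechNAS implementation. It assumes the common formulation in which the per-dimension mean and standard deviation over all frames are concatenated into one fixed-size vector. The function name `statistics_pooling`, the use of the population standard deviation, and the toy frame values (standing in for TDNN frame-level outputs) are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def statistics_pooling(frames: Sequence[Sequence[float]]) -> List[float]:
    """Collapse a (num_frames x feat_dim) sequence into a vector of size 2 * feat_dim.

    Hypothetical helper: concatenates the per-dimension mean and the
    per-dimension population standard deviation computed over time.
    """
    if not frames:
        raise ValueError("need at least one frame")
    num_frames = len(frames)
    feat_dim = len(frames[0])
    if any(len(frame) != feat_dim for frame in frames):
        raise ValueError("all frames must have the same dimensionality")

    # Mean of each feature dimension across all frames.
    means = [sum(frame[d] for frame in frames) / num_frames for d in range(feat_dim)]
    # Population standard deviation of each dimension (an assumption;
    # some implementations use the sample estimate or add a small epsilon).
    stds = [
        math.sqrt(sum((frame[d] - means[d]) ** 2 for frame in frames) / num_frames)
        for d in range(feat_dim)
    ]
    return means + stds


# Toy frame-level features: two utterances of different lengths, 2 dims each.
short_utt = [[1.0, 2.0], [3.0, 6.0]]
long_utt = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0], [2.0, 1.0]]

short_vec = statistics_pooling(short_utt)
long_vec = statistics_pooling(long_utt)
print(len(short_vec), len(long_vec))  # both vectors have 2 * feat_dim entries
```

Both utterances map to a vector of the same length regardless of how many frames they contain, which is what lets later layers operate on variable-length input.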

Temporal Dynamic Convolutional Neural Network for Text-Independent Speaker Verification and Phonemetic Analysis

shkim816/temporal_dynamic_cnn 7 Oct 2021

The temporal dynamic model adapts itself to phonemes without being given explicit phoneme information during training, and the results show that phoneme variation within utterances must be considered for more accurate and robust text-independent speaker verification.