TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Speaker Recognition	VoxCeleb1	w2v2-aam	EER (%)	1.88	#2


Fine-tuning wav2vec2 for speaker recognition

30 Sep 2021 · Nik Vaessen, David A. van Leeuwen

This paper explores applying the wav2vec2 framework to speaker recognition instead of speech recognition. We study the effectiveness of the pre-trained weights on the speaker recognition task, and how to pool the wav2vec2 output sequence into a fixed-length speaker embedding. To adapt the framework to speaker recognition, we propose a single-utterance classification variant with cross-entropy (CE) or additive angular margin (AAM) softmax loss, and an utterance-pair classification variant with binary cross-entropy (BCE) loss. Our best-performing variant, w2v2-aam, achieves a 1.88% equal error rate (EER) on the extended VoxCeleb1 test set, compared to 1.69% EER with an ECAPA-TDNN baseline. Code is available at https://github.com/nikvaessen/w2v2-speaker.
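To make the two central notions in the abstract concrete, the following is a minimal sketch of pooling a frame sequence into a fixed-length embedding and of computing an EER from trial scores. It is not the authors' implementation: mean pooling is only one possible pooling choice, cosine scoring of trial pairs is assumed here for illustration, and the helper names `mean_pool`, `cosine_score`, and `equal_error_rate` are hypothetical. The EER is taken at the threshold where the false-accept and false-reject rates are closest.

```python
import math
from typing import List, Sequence


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average a variable-length sequence of frame vectors into one vector.

    Assumption: mean pooling stands in for whichever pooling a model uses.
    """
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]


def cosine_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embeddings (assumed scoring function)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def equal_error_rate(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Approximate the EER of a set of verification trials.

    labels: 1 for a same-speaker trial, 0 for a different-speaker trial.
    A trial is accepted when its score is >= the threshold.
    """
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    if n_target == 0 or n_nontarget == 0:
        raise ValueError("need both same-speaker and different-speaker trials")

    best_gap, eer = math.inf, 1.0
    # Every observed score is a candidate threshold; inf rejects all trials.
    for threshold in sorted(set(scores)) + [math.inf]:
        false_accepts = sum(
            1 for s, l in zip(scores, labels) if l == 0 and s >= threshold
        )
        false_rejects = sum(
            1 for s, l in zip(scores, labels) if l == 1 and s < threshold
        )
        far = false_accepts / n_nontarget
        frr = false_rejects / n_target
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer
```

In this sketch, each utterance's frame outputs would be passed through `mean_pool`, each trial pair scored with `cosine_score`, and the resulting scores and labels passed to `equal_error_rate`; multiplying the result by 100 gives a percentage comparable in form to the 1.88% reported above.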
