TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Speaker Diarization	DIHARD II	UIS-RNN-SML	DER(%)	27.3	# 1
Speaker Diarization	DIHARD II	UIS-RNN-SML	DER - no overlap	19.4	# 1

Supervised online diarization with sample mean loss for multi-domain data

4 Nov 2019 · Enrico Fini, Alessio Brutti

Recently, a fully supervised speaker diarization approach (UIS-RNN) was proposed, which models speakers using multiple instances of a parameter-sharing recurrent neural network. In this paper we propose qualitative modifications to the model that significantly improve learning efficiency and overall diarization performance. In particular, we introduce a novel loss function, which we call Sample Mean Loss, and we present a better model of speaker-turn behaviour by deriving an analytical expression for the probability of a new speaker joining the conversation. In addition, we demonstrate that our model can be trained on fixed-length speech segments, removing the need for speaker-change information at inference. Using x-vectors as input features, we evaluate our approach on the multi-domain dataset of the DIHARD II challenge: our online method improves on the original UIS-RNN and achieves performance similar to an offline agglomerative clustering baseline using PLDA scoring.
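The results above are reported as diarization error rate (DER), conventionally defined as the sum of missed speech, false-alarm speech, and speaker confusion, divided by the total reference speech time. The following is a minimal frame-level sketch of that definition, not the official DIHARD scoring tool: it assumes at most one speaker per frame (so overlapped speech is ignored), uses `None` for non-speech, and finds the best mapping between hypothesis and reference speakers by brute force, which is only practical for a handful of speakers. The function name `frame_der` and the label format are illustrative assumptions.

```python
from itertools import permutations


def frame_der(reference, hypothesis):
    """Simplified frame-level DER (assumption: one label per frame, None = non-speech)."""
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same number of frames")

    pairs = list(zip(reference, hypothesis))
    ref_speech = sum(r is not None for r, _ in pairs)
    if ref_speech == 0:
        raise ValueError("reference contains no speech frames")

    # Speech in the reference that the system labelled as non-speech.
    missed = sum(r is not None and h is None for r, h in pairs)
    # Non-speech in the reference that the system labelled as speech.
    false_alarm = sum(r is None and h is not None for r, h in pairs)

    # Frames where both sides contain speech; confusion is counted here.
    both = [(r, h) for r, h in pairs if r is not None and h is not None]
    ref_ids = list(dict.fromkeys(r for r, _ in both))
    hyp_ids = list(dict.fromkeys(h for _, h in both))

    # Pad the reference speakers so every hypothesis speaker gets a slot,
    # then try every one-to-one assignment and keep the best one.
    n = max(len(ref_ids), len(hyp_ids))
    ref_slots = ref_ids + [None] * (n - len(ref_ids))
    best_correct = 0
    for perm in permutations(range(n)):
        mapping = {h: ref_slots[perm[i]] for i, h in enumerate(hyp_ids)}
        correct = sum(mapping[h] == r for r, h in both)
        best_correct = max(best_correct, correct)

    confusion = len(both) - best_correct
    return (missed + false_alarm + confusion) / ref_speech


reference = ["A", "A", "A", "B", "B", None, None, "B"]
hypothesis = ["x", "x", "y", "y", "y", "y", None, None]
print(f"DER = {100 * frame_der(reference, hypothesis):.1f}%")
```

In this toy example there is one missed frame, one false-alarm frame, and one confused frame out of six reference speech frames. Real evaluations score time segments rather than frames and must also handle overlapping speakers.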
