TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Speaker Diarization	AMI Lapel	TitaNet-L (NME-SC)	DER(%)	2.03	# 3
Speaker Diarization	AMI Lapel	ECAPA (SC)	DER(%)	2.36	# 4
Speaker Diarization	AMI Lapel	TitaNet-S (NME-SC)	DER(%)	2.00	# 2
Speaker Diarization	AMI Lapel	TitaNet-M (NME-SC)	DER(%)	1.99	# 1
Speaker Diarization	AMI MixHeadset	TitaNet-M (NME-SC)	DER(%)	1.79	# 3
Speaker Diarization	AMI MixHeadset	TitaNet-S (NME-SC)	DER(%)	2.22	# 4
Speaker Diarization	AMI MixHeadset	ECAPA (SC)	DER(%)	1.78	# 2
Speaker Diarization	AMI MixHeadset	TitaNet-L (NME-SC)	DER(%)	1.73	# 1
Speaker Diarization	CALLHOME-109	TitaNet-S	DER(%)	1.11	# 1
Speaker Diarization	CH109	TitaNet-M (NME-SC)	DER(%)	1.13	# 2
Speaker Diarization	CH109	TitaNet-L (NME-SC)	DER(%)	1.19	# 3
Speaker Diarization	CH109	x-vector (PLDA + AHC)	DER(%)	9.72	# 4
Speaker Diarization	CH109	TitaNet-S (NME-SC)	DER(%)	1.11	# 1
Speaker Diarization	NIST-SRE 2000	x-vector (PLDA + AHC)	DER(%)	8.39	# 5
Speaker Diarization	NIST-SRE 2000	TitaNet-S (NME-SC)	DER(%)	6.37	# 2
Speaker Diarization	NIST-SRE 2000	TitaNet-L (NME-SC)	DER(%)	6.73	# 4
Speaker Diarization	NIST-SRE 2000	x-vector (MCGAN)	DER(%)	5.73	# 1
Speaker Diarization	NIST-SRE 2000	TitaNet-M (NME-SC)	DER(%)	6.47	# 3
Speaker Verification	VoxCeleb	TitaNet-S	EER	1.15	# 5
Speaker Verification	VoxCeleb	TitaNet-L	EER	0.68	# 2
Speaker Verification	VoxCeleb	TitaNet-M	EER	0.81	# 3
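
The two metrics in the table can be illustrated with a minimal sketch. It assumes the common definitions: DER is the sum of false-alarm, missed-speech and speaker-confusion time divided by total reference speech time, and EER is the operating point where the false-acceptance and false-rejection rates are equal. The function names and the toy scores are hypothetical, and the benchmark's actual scoring setup (collar, overlap handling, trial lists) is not reproduced here.

```python
def diarization_error_rate(false_alarm, missed, confusion, total_speech):
    """Return DER in percent; all arguments are durations in seconds."""
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return 100.0 * (false_alarm + missed + confusion) / total_speech


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER in percent by sweeping thresholds over the observed scores.

    A trial is accepted when its score is >= the threshold. The threshold where
    the false-acceptance and false-rejection rates are closest is selected, and
    the average of the two rates at that threshold is returned.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return 100.0 * best_eer


# Toy numbers for illustration only; they are not taken from the paper.
der = diarization_error_rate(false_alarm=1.0, missed=2.0, confusion=1.0, total_speech=200.0)
eer = equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.1, 0.2, 0.4, 0.6])
print(f"DER = {der:.2f}%, EER = {eer:.2f}%")
```

Lower is better for both metrics, which is why the smallest value in each dataset group holds rank # 1.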

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/titanet-neural-model-for-speaker/speaker-diarization-on-ami-lapel)](https://paperswithcode.com/sota/speaker-diarization-on-ami-lapel?p=titanet-neural-model-for-speaker)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/titanet-neural-model-for-speaker/speaker-diarization-on-ami-mixheadset)](https://paperswithcode.com/sota/speaker-diarization-on-ami-mixheadset?p=titanet-neural-model-for-speaker)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/titanet-neural-model-for-speaker/speaker-diarization-on-callhome-109)](https://paperswithcode.com/sota/speaker-diarization-on-callhome-109?p=titanet-neural-model-for-speaker)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/titanet-neural-model-for-speaker/speaker-diarization-on-ch109)](https://paperswithcode.com/sota/speaker-diarization-on-ch109?p=titanet-neural-model-for-speaker)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/titanet-neural-model-for-speaker/speaker-diarization-on-nist-sre-2000)](https://paperswithcode.com/sota/speaker-diarization-on-nist-sre-2000?p=titanet-neural-model-for-speaker)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/titanet-neural-model-for-speaker/speaker-verification-on-voxceleb)](https://paperswithcode.com/sota/speaker-verification-on-voxceleb?p=titanet-neural-model-for-speaker)`

TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context

8 Oct 2021 · Nithin Rao Koluguri, Taejin Park, Boris Ginsburg

In this paper, we propose TitaNet, a novel neural network architecture for extracting speaker representations. We employ 1D depth-wise separable convolutions with Squeeze-and-Excitation (SE) layers with global context, followed by a channel-attention-based statistics pooling layer, to map variable-length utterances to a fixed-length embedding (t-vector). TitaNet is a scalable architecture that achieves state-of-the-art performance on the speaker verification task, with an equal error rate (EER) of 0.68% on the VoxCeleb1 trial file, and on speaker diarization tasks, with a diarization error rate (DER) of 1.73% on AMI-MixHeadset, 1.99% on AMI-Lapel, and 1.11% on CH109. Furthermore, we investigate various sizes of TitaNet and present a light TitaNet-S model with only 6M parameters that achieves near state-of-the-art results on diarization tasks.
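
To make the "variable-length utterances to a fixed-length embedding" step concrete, here is a minimal pure-Python sketch of attention-weighted statistics pooling. It is an illustration under stated assumptions, not the TitaNet implementation: the function names are hypothetical, the attention scores are passed in directly (in a real model a small learned network would produce them), and the final linear layers that produce the t-vector are omitted.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_stats_pooling(frames, scores):
    """Pool T frames of C channels into one vector of length 2 * C.

    frames: T x C nested list of frame-level features.
    scores: T x C unnormalized attention scores (hypothetical input).
    Returns the attention-weighted mean of every channel followed by the
    attention-weighted standard deviation of every channel.
    """
    num_channels = len(frames[0])
    means, stds = [], []
    for c in range(num_channels):
        # Normalize the scores over time, separately for each channel.
        weights = softmax([row[c] for row in scores])
        values = [row[c] for row in frames]
        mean = sum(w * v for w, v in zip(weights, values))
        var = sum(w * v * v for w, v in zip(weights, values)) - mean * mean
        means.append(mean)
        stds.append(math.sqrt(max(var, 1e-12)))  # clamp for numerical safety
    return means + stds


def uniform_scores(frames):
    """All-zero scores, so every frame receives the same attention weight."""
    return [[0.0] * len(frames[0]) for _ in frames]


# Two toy "utterances" with different numbers of frames and 2 channels each.
short = [[1.0, 2.0], [3.0, 4.0]]
long = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 0.0]]

emb_short = attentive_stats_pooling(short, uniform_scores(short))
emb_long = attentive_stats_pooling(long, uniform_scores(long))
print(len(emb_short), len(emb_long))  # both embeddings have length 2 * C
```

With uniform scores this reduces to plain mean and standard deviation pooling; non-uniform scores let some frames contribute more to the embedding than others.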

Code

NVIDIA/NeMo official

10,045

Wadaboa/titanet

Tasks

Speaker Diarization

Speaker Verification

Datasets

LibriSpeech

VoxCeleb1

VoxCeleb2

CALLHOME American English Speech

Results from the Paper

Ranked #1 on Speaker Diarization on CALLHOME-109

