Improving Unsupervised Sparsespeech Acoustic Models with Categorical Reparameterization

29 May 2020 · Benjamin Milde, Chris Biemann

The Sparsespeech model is an unsupervised acoustic model that can generate discrete pseudo-labels for untranscribed speech. We extend the Sparsespeech model to allow sampling from a discrete random variable, yielding pseudo-posteriorgrams. The degree of sparsity in these posteriorgrams can be fully controlled after the model has been trained. We use the Gumbel-Softmax trick to approximately sample from a discrete distribution within the neural network, which allows us to train the network efficiently with standard backpropagation. The improved model is trained and evaluated on the Libri-Light corpus, a benchmark for ASR with limited or no supervision. The model is trained on 600h and 6000h of English read speech. We evaluate the improved model using the ABX error measure and a semi-supervised setting with 10h of transcribed speech. We observe a relative improvement of up to 31.4% in ABX error rates across speakers on the test set with the improved Sparsespeech model on 600h of speech data, and further improvements when we scale the model to 6000h.
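The Gumbel-Softmax reparameterization mentioned in the abstract can be sketched in a few lines of PyTorch. The snippet below is a minimal, generic illustration of the trick, not the authors' implementation: the number of discrete units (42) and the temperature schedule (2 annealed towards 0.1) are assumptions read off the model name S6000h-n42-τ2 → 0.1 in the results table, and the function name is hypothetical.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits, tau=1.0, hard=False):
    # Draw Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1)
    u = torch.rand_like(logits)
    gumbel_noise = -torch.log(-torch.log(u + 1e-10) + 1e-10)
    # Relaxed one-hot sample; lower tau pushes the output closer to a discrete one-hot vector
    y_soft = F.softmax((logits + gumbel_noise) / tau, dim=-1)
    if hard:
        # Straight-through estimator: discrete one-hot in the forward pass,
        # gradients of the soft sample in the backward pass
        index = y_soft.argmax(dim=-1, keepdim=True)
        y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)
        return y_hard - y_soft.detach() + y_soft
    return y_soft

# Example: a pseudo-posteriorgram over 42 discrete units (assumed from "n42")
# for a 100-frame utterance, at a high and a low temperature.
logits = torch.randn(100, 42)
units_smooth = gumbel_softmax_sample(logits, tau=2.0)   # early training: smooth distribution
units_sparse = gumbel_softmax_sample(logits, tau=0.1)   # annealed: near one-hot, sparse
```

Because the temperature only enters at sampling time, it can be lowered (or raised) after training to trade off smoothness against sparsity, which is consistent with the abstract's claim that the degree of sparsity is controllable once the model is trained.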

Results from the Paper


Task                Dataset                 Model                  Metric      Value   Global Rank
Speech Recognition  Libri-Light test-clean  S6000h-n42-τ2 → 0.1    ABX-within  9.33    #2
Speech Recognition  Libri-Light test-clean  S6000h-n42-τ2 → 0.1    ABX-across  13.53   #2
Speech Recognition  Libri-Light test-other  S6000h-n42-τ2 → 0.1    ABX-within  12.05   #2
Speech Recognition  Libri-Light test-other  S6000h-n42-τ2 → 0.1    ABX-across  20.6    #2

Methods