TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK	EXTRA DATA
Spoken Language Understanding	Fluent Speech Commands	Pooling classifier pre-trained using force-aligned phoneme and word labels on LibriSpeech	Accuracy (%)	98.8	# 15


Speech Model Pre-training for End-to-End Spoken Language Understanding

7 Apr 2019 · Loren Lugosch, Mirco Ravanelli, Patrick Ignoto, Vikrant Singh Tomar, Yoshua Bengio

Whereas conventional spoken language understanding (SLU) systems map speech to text, and then text to intent, end-to-end SLU systems map speech directly to intent through a single trainable model. Achieving high accuracy with these end-to-end models without a large amount of training data is difficult. We propose a method to reduce the data requirements of end-to-end SLU in which the model is first pre-trained to predict words and phonemes, thus learning good features for SLU. We introduce a new SLU dataset, Fluent Speech Commands, and show that our method improves performance both when the full dataset is used for training and when only a small subset is used. We also describe preliminary experiments to gauge the model's ability to generalize to new phrases not heard during training.
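The structural contrast drawn in the abstract can be sketched in a few lines of Python. This is only an illustration of the two interfaces, not the authors' implementation: every name below (`conventional_slu`, `end_to_end_slu`, the `toy_*` functions, the intent labels) is hypothetical, and the toy components are trivial stand-ins for trained models. The `encoder` argument marks the part that the abstract proposes to pre-train on word and phoneme prediction before training for intent.

```python
from typing import Callable, List, Sequence

Audio = Sequence[float]   # stand-in for a waveform (hypothetical type alias)
Features = List[float]    # stand-in for learned speech features
Intent = str


def conventional_slu(
    asr: Callable[[Audio], str],
    nlu: Callable[[str], Intent],
) -> Callable[[Audio], Intent]:
    """Conventional SLU: speech -> text -> intent, as two separate modules."""
    def predict(audio: Audio) -> Intent:
        transcript = asr(audio)   # speech to text
        return nlu(transcript)    # text to intent
    return predict


def end_to_end_slu(
    encoder: Callable[[Audio], Features],
    intent_head: Callable[[Features], Intent],
) -> Callable[[Audio], Intent]:
    """End-to-end SLU: speech -> intent, with no intermediate transcript.

    In the abstract's proposal, the encoder would first be pre-trained to
    predict words and phonemes, then the whole model trained for intent.
    """
    def predict(audio: Audio) -> Intent:
        return intent_head(encoder(audio))
    return predict


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_asr(audio: Audio) -> str:
    return "turn on the lights" if sum(audio) > 0 else "turn off the lights"


def toy_nlu(text: str) -> Intent:
    return "activate_lights" if " on " in text else "deactivate_lights"


def toy_encoder(audio: Audio) -> Features:
    return [sum(audio) / len(audio)]


def toy_intent_head(features: Features) -> Intent:
    return "activate_lights" if features[0] > 0 else "deactivate_lights"


pipeline = conventional_slu(toy_asr, toy_nlu)
direct = end_to_end_slu(toy_encoder, toy_intent_head)

utterance = [0.2, 0.5, -0.1]
print(pipeline(utterance), direct(utterance))
```

Both callables expose the same speech-to-intent signature; the difference is that the conventional version commits to a text transcript in the middle, whereas the end-to-end version passes features straight to the intent classifier.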


Code


lorenlugosch/end-to-end-SLU


Tasks


Spoken Language Understanding

Datasets

Introduced in the Paper:

Fluent Speech Commands

Used in the Paper:

LibriSpeech

Results from the Paper


Ranked #15 on Spoken Language Understanding on Fluent Speech Commands (using extra training data)
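As a rough illustration of how an accuracy figure like the one reported here can be computed, the sketch below scores predicted intents against reference intents by exact match. The function name `accuracy_percent` and the example labels are hypothetical, and exact-match scoring of a single intent label per utterance is an assumption; the page itself does not spell out the scoring procedure.

```python
from typing import Sequence


def accuracy_percent(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Percentage of utterances whose predicted intent equals the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * correct / len(reference)


# Hypothetical labels: four of five predictions match the reference.
reference = ["lights_on", "lights_off", "volume_up", "volume_down", "lights_on"]
predicted = ["lights_on", "lights_off", "volume_up", "volume_up", "lights_on"]
print(accuracy_percent(predicted, reference))
```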



