Automatic Speech Recognition

501 papers with code • 13 benchmarks • 11 datasets

Benchmarks

These leaderboards are used to track progress in Automatic Speech Recognition.

Dataset	Best Model
FLEURS	SeamlessM4T Medium
LibriSpeech test-clean	MonoBERT
LibriSpeech test-other	MonoBERT
FLEURS-54	SeamlessM4T Medium

Libraries

Use these libraries to find Automatic Speech Recognition models and implementations.

espnet/espnet • 13 papers • 7,878 stars

NVIDIA/NeMo • 8 papers • 10,073 stars

k2-fsa/icefall • 5 papers • 775 stars

facebookresearch/fairseq • 3 papers • 29,257 stars

See all 17 libraries.

Most implemented papers

Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition

retrocirce/hts-audio-transformer • 9 Apr 2018

Describes an audio dataset of spoken words designed to help train and evaluate keyword spotting systems.

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

mozilla/DeepSpeech • 18 Apr 2019

On LibriSpeech, we achieve 6.8% WER on test-other without the use of a language model, and 5.8% WER with shallow fusion with a language model.
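
Word error rate (WER), the metric quoted here, is the word-level edit distance between a reference transcript and a system hypothesis, divided by the number of reference words. The following is a minimal sketch of that definition, not the scoring tool used in the paper; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Hypothetical helper for illustration; tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.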

Conformer: Convolution-augmented Transformer for Speech Recognition

PaddlePaddle/PaddleSpeech • 16 May 2020

Recently Transformer and Convolution neural network (CNN) based models have shown promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural networks (RNNs).

Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces

snipsco/snips-nlu • 25 May 2018

This paper presents the machine learning architecture of the Snips Voice Platform, a software solution to perform Spoken Language Understanding on microprocessors typical of IoT devices.

A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks

hendrycks/error-detection • 7 Oct 2016

We consider the two related problems of detecting if an example is misclassified or out-of-distribution.

Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM

Alexander-H-Liu/End-to-end-ASR-Pytorch • 8 Jun 2017

The CTC network sits on top of the encoder and is jointly trained with the attention-based decoder.
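
Joint training of this kind is commonly written as a weighted sum of the two objectives. The symbols below are assumptions for illustration, since the summary above does not define them: \lambda is an interpolation weight between 0 and 1, and the two terms are the CTC and attention-decoder losses computed on the same encoder output.

```latex
\mathcal{L}_{\text{joint}}
  = \lambda \, \mathcal{L}_{\text{CTC}}
  + (1 - \lambda) \, \mathcal{L}_{\text{att}},
\qquad 0 \le \lambda \le 1
```

Setting \lambda = 0 recovers a purely attention-based model and \lambda = 1 a purely CTC model.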

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context

TensorSpeech/TensorFlowASR • 7 May 2020

We demonstrate that on the widely used LibriSpeech benchmark, ContextNet achieves a word error rate (WER) of 2.1%/4.6% without external language model (LM), 1.9%/4.1% with LM and 2.9%/7.0% with only 10M parameters on the clean/noisy LibriSpeech test sets.

Neural NILM: Deep Neural Networks Applied to Energy Disaggregation

JackKelly/neuralnilm_prototype • 23 Jul 2015

Energy disaggregation estimates appliance-by-appliance electricity consumption from a single meter that measures the whole home's electricity demand.

EESEN: End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding

yajiemiao/eesen • 29 Jul 2015

The performance of automatic speech recognition (ASR) has improved tremendously due to the application of deep neural networks (DNNs).

State-of-the-art Speech Recognition With Sequence-to-Sequence Models

sooftware/End-to-end-Speech-Recognition • 5 Dec 2017

Attention-based encoder-decoder architectures such as Listen, Attend, and Spell (LAS), subsume the acoustic, pronunciation and language model components of a traditional automatic speech recognition (ASR) system into a single neural network.
