speech-recognition

1010 papers with code • 0 benchmarks • 0 datasets


Benchmarks


These leaderboards track progress in speech recognition.

No evaluation results yet.

Libraries

Use these libraries to find speech-recognition models and implementations.

espnet/espnet • 16 papers • 7,930 stars

msalhab96/SpeeQ • 11 papers

pytorch/fairseq • 10 papers • 29,377 stars

PaddlePaddle/PaddleSpeech • 10 papers • 10,235 stars

See all 16 libraries.

Most implemented papers


Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges

autoliuweijie/FastBERT • 8 Mar 2021

Mobile devices such as smartphones and autonomous vehicles increasingly rely on deep neural networks (DNNs) to execute complex inference tasks such as image classification and speech recognition, among others.


ISyNet: Convolutional Neural Networks design for AI accelerator

mindspore-ai/models • 4 Sep 2021

To address this problem, we propose a measure of the hardware efficiency of a neural architecture search space, the matrix efficiency measure (MEM); a search space comprising hardware-efficient operations; a latency-aware scaling method; and ISyNet, a set of architectures designed to be both fast on specialized neural processing unit (NPU) hardware and accurate.


Robust Speech Recognition via Large-Scale Weak Supervision

openai/whisper • Preprint 2022

We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet.


A Simple Way to Initialize Recurrent Networks of Rectified Linear Units

facebookresearch/salina • 3 Apr 2015

Learning long-term dependencies in recurrent networks is difficult due to vanishing and exploding gradients.
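
The initialization named in the title above can be sketched as follows: the recurrent weight matrix starts as the identity and the biases start at zero, so a ReLU unit initially copies its previous state forward. This is a minimal pure-Python sketch, not the paper's or the linked repository's code; the function names and the `input_scale` default are assumptions made for illustration.

```python
import random


def init_irnn(input_size, hidden_size, input_scale=0.001, seed=0):
    """Return (w_in, w_rec, bias) with an identity recurrent matrix.

    input_scale is an assumed standard deviation for the input weights.
    """
    rng = random.Random(seed)
    w_in = [[rng.gauss(0.0, input_scale) for _ in range(input_size)]
            for _ in range(hidden_size)]
    # Identity recurrent weights: each unit starts by copying its own state.
    w_rec = [[1.0 if i == j else 0.0 for j in range(hidden_size)]
             for i in range(hidden_size)]
    bias = [0.0] * hidden_size
    return w_in, w_rec, bias


def irnn_step(x, h, w_in, w_rec, bias):
    """One recurrent step with a ReLU nonlinearity."""
    new_h = []
    for i in range(len(h)):
        pre = (sum(w * v for w, v in zip(w_in[i], x))
               + sum(w * v for w, v in zip(w_rec[i], h))
               + bias[i])
        new_h.append(max(0.0, pre))
    return new_h
```

With a zero input and a non-negative hidden state, `irnn_step` returns the state unchanged at initialization, which is the property that lets information persist over many steps before training reshapes the weights.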


ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context

TensorSpeech/TensorFlowASR • 7 May 2020

We demonstrate that on the widely used LibriSpeech benchmark, ContextNet achieves a word error rate (WER) of 2.1%/4.6% without external language model (LM), 1.9%/4.1% with LM and 2.9%/7.0% with only 10M parameters on the clean/noisy LibriSpeech test sets.
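
For readers unfamiliar with the metric quoted above, WER is conventionally the word-level edit distance (substitutions, deletions, insertions) between a hypothesis and a reference transcript, divided by the number of reference words. The sketch below is a minimal illustration under that standard definition; the function name is hypothetical, and it omits the text normalization that benchmark scoring usually applies.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring the hypothesis "the cat sat on mat" against the reference "the cat sat on the mat" gives one deletion over six reference words, a WER of 1/6. Because insertions count as errors, WER can exceed 100%.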


Unsupervised Cross-lingual Representation Learning for Speech Recognition

huggingface/transformers • 24 Jun 2020

This paper presents XLSR which learns cross-lingual speech representations by pretraining a single model from the raw waveform of speech in multiple languages.


First-Pass Large Vocabulary Continuous Speech Recognition using Bi-Directional Recurrent DNNs

PaddlePaddle/PaddleSpeech • 12 Aug 2014

This approach to decoding enables first-pass speech recognition with a language model, completely unaided by the cumbersome infrastructure of HMM-based systems.


An Overview of Multi-Task Learning in Deep Neural Networks

shenweichen/DeepCTR • 15 Jun 2017

Multi-task learning (MTL) has led to successes in many applications of machine learning, from natural language processing and speech recognition to computer vision and drug discovery.


Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss

upskyy/Transformer-Transducer • 7 Feb 2020

We present results on the LibriSpeech dataset showing that limiting the left context for self-attention in the Transformer layers makes decoding computationally tractable for streaming, with only a slight degradation in accuracy.
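
One common way to limit the left context for self-attention is a banded attention mask, in which each frame may attend only to a fixed number of earlier frames. The sketch below illustrates that idea in pure Python; it is not taken from the paper or the linked repository, and the function name and parameters are hypothetical.

```python
def limited_context_mask(num_frames, left_context, right_context=0):
    """Build a boolean attention mask.

    mask[t][s] is True when query frame t may attend to key frame s,
    i.e. when t - left_context <= s <= t + right_context.
    """
    if num_frames < 0 or left_context < 0 or right_context < 0:
        raise ValueError("sizes and context widths must be non-negative")
    return [
        [-left_context <= s - t <= right_context for s in range(num_frames)]
        for t in range(num_frames)
    ]
```

With such a mask, each frame attends to at most `left_context + right_context + 1` positions, so the per-frame attention cost stays bounded instead of growing with the length of the utterance. Setting `right_context=0` keeps the layer causal, which is what streaming requires.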


Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition

PaddlePaddle/PaddleSpeech • 10 Dec 2020

In this paper, we present a novel two-pass approach to unify streaming and non-streaming end-to-end (E2E) speech recognition in a single model.

