Speech Recognition

204 papers with code · Speech

Speech recognition is the task of recognising speech within audio and converting it into text.

Latest papers with code

A Resource for Computational Experiments on Mapudungun

4 Dec 2019 · mingjund/mapudungun-corpus

We present a resource for computational experiments on Mapudungun, a polysynthetic indigenous language spoken in Chile with upwards of 200 thousand speakers.

MACHINE TRANSLATION · SPEECH RECOGNITION · SPEECH SYNTHESIS

0

Untangling in Invariant Speech Recognition

NeurIPS 2019 · schung039/neural_manifolds_replicaMFT

Higher level concepts such as parts-of-speech and context dependence also emerge in the later layers of the network.

SPEECH RECOGNITION

4
01 Dec 2019

Kurdish (Sorani) Speech to Text: Presenting an Experimental Dataset

29 Nov 2019 · KurdishBLARK/BD-4SK-ASR

We present an experimental dataset, Basic Dataset for Sorani Kurdish Automatic Speech Recognition (BD-4SK-ASR), which we used in the first attempt at developing an automatic speech recognition system for Sorani Kurdish.

SPEECH RECOGNITION

1

CAT: CRF-based ASR Toolkit

20 Nov 2019 · thu-spmi/cat

In this paper, we present a new open source toolkit for automatic speech recognition (ASR), named CAT (CRF-based ASR Toolkit).

END-TO-END SPEECH RECOGNITION · SPEECH RECOGNITION

46

Confidence Estimation for Black Box Automatic Speech Recognition Systems Using Lattice Recurrent Neural Networks

25 Oct 2019 · alecokas/BiLatticeRNN-Confidence

Experimental results using the IARPA OpenKWS 2016 evaluation system show that the use of additional information yields significant gains in confidence estimation accuracy.

SPEECH RECOGNITION

3

ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit

24 Oct 2019 · espnet/espnet

Furthermore, the unified design enables the integration of ASR functions with TTS, e.g., ASR-based objective evaluation and semi-supervised learning with both ASR and TTS models.

SPEECH RECOGNITION

1,715

Generative Pre-Training for Speech with Autoregressive Predictive Coding

23 Oct 2019 · iamyuanchung/Autoregressive-Predictive-Coding

Learning meaningful and general representations from unannotated speech that are applicable to a wide range of tasks remains challenging.

REPRESENTATION LEARNING · SPEAKER IDENTIFICATION · SPEECH RECOGNITION · TRANSFER LEARNING

62

FaSNet: Low-latency Adaptive Beamforming for Multi-microphone Audio Processing

29 Sep 2019 · yluo42/TAC

Beamforming has been extensively investigated for multi-channel audio processing tasks.

SPEECH ENHANCEMENT · SPEECH RECOGNITION

10

Language-Agnostic Syllabification with Neural Sequence Labeling

29 Sep 2019 · jacobkrantz/lstm-syllabify

The concept of the syllable is cross-linguistic, though formal definitions are rarely agreed upon, even within a language.

CHUNKING · NAMED ENTITY RECOGNITION · PART-OF-SPEECH TAGGING · SPEECH RECOGNITION

3

Espresso: A Fast End-to-end Neural Speech Recognition Toolkit

18 Sep 2019 · freewym/espresso

We present Espresso, an open-source, modular, extensible end-to-end neural automatic speech recognition (ASR) toolkit based on the deep learning library PyTorch and the popular neural machine translation toolkit fairseq.

DATA AUGMENTATION · LANGUAGE MODELLING · MACHINE TRANSLATION · SPEECH RECOGNITION

551