Speech Recognition

204 papers with code · Speech

Speech recognition is the task of recognising speech within audio and converting it into text.

Latest papers without code

Improved Training Techniques for Online Neural Machine Translation

ICLR 2020

We investigate the sensitivity of online neural machine translation models to the value of k that is used during training and when deploying the model, and the effect of updating the hidden states in transformer models as new source tokens are read.

MACHINE TRANSLATION · SPEECH RECOGNITION

Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps

ICLR 2020

Modern neural network architectures use structured linear transformations, such as low-rank matrices, sparse matrices, permutations, and the Fourier transform, to improve inference speed and reduce memory usage compared to general linear maps.

IMAGE CLASSIFICATION · SPEECH RECOGNITION

Top-down training for neural networks

ICLR 2020

Interpreting the top layers as a classifier and the lower layers as a feature extractor, one can hypothesize that unwanted network convergence may occur when the classifier has overfit with respect to the feature extractor.

SPEECH RECOGNITION

Unsupervised Learning of Efficient and Robust Speech Representations

ICLR 2020

We present an unsupervised method for learning speech representations based on bidirectional contrastive predictive coding that implicitly discovers phonetic structure from large-scale corpora of unlabelled raw audio signals.

SPEECH RECOGNITION

vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations

ICLR 2020

We propose vq-wav2vec to learn discrete representations of audio segments through a wav2vec-style self-supervised context prediction task.

SPEECH RECOGNITION

Learning Video Representations using Contrastive Bidirectional Transformer

ICLR 2020

This paper proposes a self-supervised learning approach for video features that results in significantly improved performance on downstream tasks (such as video classification, captioning and segmentation) compared to existing methods.

SPEECH RECOGNITION · VIDEO CLASSIFICATION

Generating Robust Audio Adversarial Examples using Iterative Proportional Clipping

ICLR 2020

We then impose a constraint on the perturbation at the positions with lower sound intensity across the time domain to eliminate the perceptible noise during the silent periods or pauses.

SPEECH RECOGNITION

AdaScale SGD: A Scale-Invariant Algorithm for Distributed Training

ICLR 2020

When using distributed training to speed up stochastic gradient descent, learning rates must adapt to new scales in order to maintain training effectiveness.

IMAGE CLASSIFICATION · MACHINE TRANSLATION · OBJECT DETECTION · SPEECH RECOGNITION

On Neural Phone Recognition of Mixed-Source ECoG Signals

12 Dec 2019

The emerging field of neural speech recognition (NSR) using electrocorticography has recently attracted remarkable research interest for studying how human brains recognize speech in quiet and noisy surroundings.

SPEECH RECOGNITION

SpecAugment on Large Scale Datasets

11 Dec 2019

Recently, SpecAugment, an augmentation scheme for automatic speech recognition that acts directly on the spectrogram of input utterances, has been shown to be highly effective in enhancing the performance of end-to-end networks on public datasets.

SPEECH RECOGNITION
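
To make "acts directly on the spectrogram" concrete, here is a minimal sketch of spectrogram masking in the spirit of SpecAugment. It is not the paper's implementation: the function name `mask_spectrogram`, the parameters `max_freq_width` and `max_time_width`, the choice of one mask per axis, and the fill value of 0.0 are all assumptions made for illustration, and the time-warping step of SpecAugment is left out.

```python
import random


def mask_spectrogram(spec, max_freq_width=2, max_time_width=3, rng=None):
    """Return a masked copy of `spec`.

    `spec` is a list of frequency rows, each a list of time frames.
    One band of consecutive frequency rows and one band of consecutive
    time frames are set to 0.0. All names and defaults are hypothetical.
    """
    rng = rng or random.Random()
    n_freq = len(spec)
    n_time = len(spec[0]) if n_freq else 0
    out = [row[:] for row in spec]  # copy so the input is left untouched

    # Frequency mask: pick a width f, then a start row f0, and zero rows f0..f0+f-1.
    f = rng.randint(0, min(max_freq_width, n_freq))
    f0 = rng.randint(0, n_freq - f)
    for i in range(f0, f0 + f):
        out[i] = [0.0] * n_time

    # Time mask: pick a width t, then a start frame t0, and zero columns t0..t0+t-1.
    t = rng.randint(0, min(max_time_width, n_time))
    t0 = rng.randint(0, n_time - t)
    for row in out:
        for j in range(t0, t0 + t):
            row[j] = 0.0

    return out


if __name__ == "__main__":
    # Toy 4-bin x 6-frame "spectrogram" filled with ones.
    toy = [[1.0] * 6 for _ in range(4)]
    for row in mask_spectrogram(toy, rng=random.Random(0)):
        print(row)
```

Because the masks are drawn fresh on every call, the same utterance yields a different masked spectrogram each time it is seen during training, which is what makes this kind of scheme usable as data augmentation.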