speech-recognition
1005 papers with code • 0 benchmarks • 0 datasets
Benchmarks
These leaderboards are used to track progress in speech-recognition
Libraries
Use these libraries to find speech-recognition models and implementations
Most implemented papers
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Self-supervised learning (SSL) achieves great success in speech recognition, while limited exploration has been attempted for other speech processing tasks.
Data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
While the general idea of self-supervised learning is identical across modalities, the actual algorithms and objectives differ widely because they were developed with a single modality in mind.
An exact mapping between the Variational Renormalization Group and Deep Learning
Here, we show that deep learning is intimately related to one of the most important and successful techniques in theoretical physics, the renormalization group (RG).
Neural NILM: Deep Neural Networks Applied to Energy Disaggregation
Energy disaggregation estimates appliance-by-appliance electricity consumption from a single meter that measures the whole home's electricity demand.
EESEN: End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding
The performance of automatic speech recognition (ASR) has improved tremendously due to the application of deep neural networks (DNNs).
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms.
Geometric deep learning on graphs and manifolds using mixture model CNNs
Recently, there has been an increasing interest in geometric deep learning, attempting to generalize deep learning methods to non-Euclidean structured data such as graphs and manifolds, with a variety of applications from the domains of network analysis, computational social science, or computer graphics.
Combining Residual Networks with LSTMs for Lipreading
We propose an end-to-end deep learning architecture for word-level visual speech recognition.
Honk: A PyTorch Reimplementation of Convolutional Neural Networks for Keyword Spotting
We describe Honk, an open-source PyTorch reimplementation of convolutional neural networks for keyword spotting that are included as examples in TensorFlow.
State-of-the-art Speech Recognition With Sequence-to-Sequence Models
Attention-based encoder-decoder architectures such as Listen, Attend, and Spell (LAS) subsume the acoustic, pronunciation and language model components of a traditional automatic speech recognition (ASR) system into a single neural network.