84 papers with code • 8 benchmarks • 8 datasets
In speech processing, keyword spotting deals with the identification of keywords in utterances.
(Image credit: Simon Grest)
Most implemented papers
Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition
Describes an audio dataset of spoken words designed to help train and evaluate keyword spotting systems.
Hello Edge: Keyword Spotting on Microcontrollers
We train various neural network architectures for keyword spotting published in the literature to compare their accuracy and memory/compute requirements.
Keyword Transformer: A Self-Attention Model for Keyword Spotting
The Transformer architecture has been successful across many domains, including natural language processing, computer vision and speech recognition.
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech
We present a large-scale comparison of various self-supervised models.
Low-Power Audio Keyword Spotting using Tsetlin Machines
In this paper, we explore a Tsetlin Machine (TM) based keyword spotting (KWS) pipeline to demonstrate low complexity and a faster rate of convergence compared to neural networks (NNs).
READ-BAD: A New Dataset and Evaluation Scheme for Baseline Detection in Archival Documents
Well-established text line segmentation evaluation schemes, such as the Detection Rate or Recognition Accuracy, require binarized data that is annotated at the pixel level.
Honk: A PyTorch Reimplementation of Convolutional Neural Networks for Keyword Spotting
We describe Honk, an open-source PyTorch reimplementation of convolutional neural networks for keyword spotting that are included as examples in TensorFlow.
Deep Residual Learning for Small-Footprint Keyword Spotting
We explore the application of deep residual learning and dilated convolutions to the keyword spotting task, using the recently released Google Speech Commands Dataset as our benchmark.
Efficient keyword spotting using dilated convolutions and gating
We explore the application of end-to-end stateless temporal modeling to small-footprint keyword spotting as opposed to recurrent networks that model long-term temporal dependencies using internal states.
An End-to-End Architecture for Keyword Spotting and Voice Activity Detection
We propose a single neural network architecture for two tasks: on-line keyword spotting and voice activity detection.
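Two of the papers listed above (Deep Residual Learning for Small-Footprint Keyword Spotting and Efficient keyword spotting using dilated convolutions and gating) mention dilated convolutions. As a rough illustration of what dilation buys, the sketch below computes the receptive field of a stack of stride-1 dilated 1-D convolutions. The function name `receptive_field`, the kernel size, and the dilation schedule are assumptions chosen for illustration and are not taken from either paper.

```python
from typing import Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field (in input frames) of stacked stride-1 dilated 1-D convolutions.

    Each layer with kernel size k and dilation d widens the receptive field
    by (k - 1) * d frames. Hypothetical helper, for illustration only.
    """
    if kernel_size < 1:
        raise ValueError("kernel_size must be at least 1")
    if any(d < 1 for d in dilations):
        raise ValueError("dilations must be positive integers")
    field = 1  # a single output frame sees at least one input frame
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


# Assumed example settings: four layers with kernel size 3.
plain = receptive_field(3, [1, 1, 1, 1])    # no dilation
dilated = receptive_field(3, [1, 2, 4, 8])  # dilation doubling per layer
print(plain, dilated)
```

With the same number of layers and the same kernel size, and therefore the same number of weights, the dilated stack covers a much longer span of audio frames than the undilated one.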