Keyword Spotting

97 papers with code • 10 benchmarks • 8 datasets

In speech processing, keyword spotting deals with the identification of keywords in utterances.


Most implemented papers

Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition

retrocirce/hts-audio-transformer 9 Apr 2018

Describes an audio dataset of spoken words designed to help train and evaluate keyword spotting systems.

Hello Edge: Keyword Spotting on Microcontrollers

ARM-software/ML-KWS-for-MCU 20 Nov 2017

We train various neural network architectures for keyword spotting published in the literature to compare their accuracy and memory/compute requirements.

Keyword Transformer: A Self-Attention Model for Keyword Spotting

ARM-software/keyword-transformer 1 Apr 2021

The Transformer architecture has been successful across many domains, including natural language processing, computer vision and speech recognition.

Low-Power Audio Keyword Spotting using Tsetlin Machines

cair/TsetlinMachine 27 Jan 2021

In this paper, we explore a Tsetlin Machine (TM)-based keyword spotting (KWS) pipeline to demonstrate low complexity with a faster rate of convergence compared to neural networks (NNs).

READ-BAD: A New Dataset and Evaluation Scheme for Baseline Detection in Archival Documents

Transkribus/TranskribusBaseLineEvaluationScheme 9 May 2017

Well-established text line segmentation evaluation schemes such as the Detection Rate or Recognition Accuracy demand binarized data that is annotated at the pixel level.

Honk: A PyTorch Reimplementation of Convolutional Neural Networks for Keyword Spotting

castorini/honk 18 Oct 2017

We describe Honk, an open-source PyTorch reimplementation of convolutional neural networks for keyword spotting that are included as examples in TensorFlow.

Deep Residual Learning for Small-Footprint Keyword Spotting

castorini/honk 28 Oct 2017

We explore the application of deep residual learning and dilated convolutions to the keyword spotting task, using the recently released Google Speech Commands Dataset as our benchmark.

Efficient keyword spotting using dilated convolutions and gating

snipsco/tract 19 Nov 2018

We explore the application of end-to-end stateless temporal modeling to small-footprint keyword spotting, as opposed to recurrent networks, which model long-term temporal dependencies using internal states.

AST: Audio Spectrogram Transformer

YuanGongND/ast 5 Apr 2021

In the past decade, convolutional neural networks (CNNs) have been widely adopted as the main building block for end-to-end audio classification models, which aim to learn a direct mapping from audio spectrograms to corresponding labels.