Keyword Spotting
92 papers with code • 10 benchmarks • 8 datasets
In speech processing, keyword spotting deals with the identification of keywords in utterances.
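As a rough illustration of what "identifying a keyword in an utterance" can look like in practice, here is a minimal sketch. It assumes some acoustic model has already produced one keyword score per audio frame; that model is not shown. The function names, window size, and threshold are hypothetical choices for illustration, not taken from any of the papers listed on this page.

```python
from typing import List, Optional


def smooth(scores: List[float], window: int) -> List[float]:
    """Moving average over the trailing `window` frames (fewer at the start)."""
    out = []
    for i in range(len(scores)):
        start = max(0, i - window + 1)
        chunk = scores[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def spot_keyword(scores: List[float], window: int = 3,
                 threshold: float = 0.7) -> Optional[int]:
    """Return the first frame index whose smoothed score reaches `threshold`.

    Returns None when the keyword is never detected.
    """
    for i, s in enumerate(smooth(scores, window)):
        if s >= threshold:
            return i
    return None


# Hypothetical per-frame keyword scores from an assumed upstream model.
frame_scores = [0.1, 0.2, 0.1, 0.8, 0.9, 0.95, 0.3, 0.1]
print(spot_keyword(frame_scores))
```

Smoothing before thresholding means a single high-scoring frame does not trigger a detection on its own; the keyword score has to stay high for several consecutive frames.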
Latest papers
AraSpot: Arabic Spoken Command Spotting
Spoken keyword spotting (KWS) is the task of identifying a keyword in an audio stream and is widely used in smart devices at the edge in order to activate voice assistants and perform hands-free tasks.
LipLearner: Customizable Silent Speech Interactions on Mobile Devices
A silent speech interface is a promising technology that enables private communication in natural language.
ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification
Transformers, which were originally developed for natural language processing, have recently generated significant interest in the computer vision and audio communities due to their flexibility in learning long-range relationships.
BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance
We highlight that, benefiting from the compact architecture and optimized hardware kernel, BiFSMNv2 can achieve an impressive 25.1x speedup and 20.2x storage-saving on edge hardware.
Integrated Parameter-Efficient Tuning for General-Purpose Audio Models
To overcome these limitations, the present study explores and applies efficient transfer learning methods in the audio domain.
MAST: Multiscale Audio Spectrogram Transformers
We present Multiscale Audio Spectrogram Transformer (MAST) for audio classification, which brings the concept of multiscale feature hierarchies to the Audio Spectrogram Transformer (AST).
WeKws: A production first small-footprint end-to-end Keyword Spotting Toolkit
Keyword spotting (KWS) enables speech-based user interaction and is gradually becoming an indispensable component of smart devices.
Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input
We propose a new method, Masked Modeling Duo (M2D), that learns representations directly while obtaining training signals using only masked patches.
Improving Label-Deficient Keyword Spotting Through Self-Supervised Pretraining
This paper explores the effectiveness of SSL on small models for KWS and establishes that SSL can enhance the performance of small KWS models when labelled data is scarce.
SiDi KWS: A Large-Scale Multilingual Dataset for Keyword Spotting
Keyword spotting (KWS) has become a hot topic in speech processing due to the rise of commercial applications based on voice command detection, such as voice assistants.