39 papers with code • 7 benchmarks • 5 datasets
In speech processing, keyword spotting deals with the identification of keywords in utterances.
(Image credit: Simon Grest)
We present a broadcasted residual learning method to achieve high accuracy with small model size and computational load.
Ranked #2 on Keyword Spotting on Google Speech Commands
Towards easily customizable KWS models, we present KeySEM (Keyword Speech EMbedding), a speech embedding model pre-trained on the task of recognizing a large number of keywords.
Keyword spotting aims to identify specific keyword audio utterances.
We propose a self-training approach with a noisy student-teacher scheme for streaming keyword spotting that can utilize large-scale unlabeled data and aggressive data augmentation.
To the best of our knowledge, this is the first attempt to examine the Lambda framework within the speech domain, and it therefore opens the way for further research and development of future speech interfaces based on this architecture.
This paper introduces neural architecture search (NAS) for the automatic discovery of end-to-end keyword spotting (KWS) models in limited resource environments.
Ranked #14 on Keyword Spotting on Google Speech Commands (Google Speech Commands V2 12 metric)
Under this perspective, a probabilistic framework for lexicon-based KWS in text images is presented.
In this work, we introduce SubSpectral Normalization (SSN), which splits the input frequency dimension into several groups (sub-bands) and performs a different normalization for each group.
Ranked #1 on Keyword Spotting on Google Speech Commands (% Test Accuracy metric)
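The sub-band idea in the SubSpectral Normalization entry above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `subspectral_norm`, the (batch, channels, freq, time) layout, the use of batch statistics per channel and sub-band, and the omission of learned scale and shift parameters are all assumptions made for illustration.

```python
import numpy as np


def subspectral_norm(x, num_groups, eps=1e-5):
    """Normalize each frequency sub-band of x separately (illustrative sketch).

    x is assumed to have shape (batch, channels, freq, time).
    Statistics are computed per (channel, sub-band) pair over the batch,
    the frequencies inside that sub-band, and time. No learned affine
    parameters are applied (an assumption of this sketch).
    """
    n, c, f, t = x.shape
    if f % num_groups != 0:
        raise ValueError("freq dimension must be divisible by num_groups")

    # Fold contiguous frequency sub-bands into the channel axis:
    # (n, c, f, t) -> (n, c * num_groups, f // num_groups, t)
    xg = x.reshape(n, c * num_groups, f // num_groups, t)

    # One mean and variance per (channel, sub-band) pair.
    mean = xg.mean(axis=(0, 2, 3), keepdims=True)
    var = xg.var(axis=(0, 2, 3), keepdims=True)
    xg = (xg - mean) / np.sqrt(var + eps)

    # Restore the original layout.
    return xg.reshape(n, c, f, t)


# Hypothetical usage: the upper half of the frequency axis has a large offset,
# which per-sub-band normalization removes independently of the lower half.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3, 8, 5))
features[:, :, 4:, :] += 10.0
normalized = subspectral_norm(features, num_groups=2)
```

With `num_groups=1` the sketch reduces to a single normalization per channel over the whole frequency axis, which is the baseline that per-group statistics are meant to refine.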