In speech processing, keyword spotting deals with the identification of keywords in utterances.
(Image credit: Simon Grest)
Phoneme boundary detection is an essential first step for a variety of speech processing applications, such as speaker diarization, speech science, and keyword spotting.
With the rise of low-power speech-enabled devices, there is a growing demand to quickly produce models for recognizing arbitrary sets of keywords.
Chinese keyword spotting is a challenging task because there are no visual blanks separating Chinese words.
In societies with well-developed internet infrastructure, social media is the leading medium of communication on various social issues, especially in breaking news situations.
In many scenarios, detecting keywords from natural language queries is sufficient to understand the intent of the user.
Despite the recent successes of deep neural networks, it remains challenging to achieve high-precision keyword spotting (KWS) on resource-constrained devices.
Used for simple command recognition on devices from smart speakers to mobile phones, keyword spotting systems are everywhere.
Machine learning systems are vulnerable to adversarial attacks and are highly likely to produce incorrect outputs under these attacks.