Search Results for author: Spyros Matsoukas

Found 28 papers, 0 papers with code

Impact of Acoustic Event Tagging on Scene Classification in a Multi-Task Learning Framework

no code implementations 27 Jun 2022 Rahil Parikh, Harshavardhan Sundar, Ming Sun, Chao Wang, Spyros Matsoukas

We conclude that this improvement in ASC performance comes from the regularization effect of using AET and not from the network's improved ability to discern between acoustic events.

Acoustic Scene Classification Multi-Task Learning +1

Federated Self-Supervised Learning for Acoustic Event Classification

no code implementations 22 Mar 2022 Meng Feng, Chieh-Chi Kao, Qingming Tang, Ming Sun, Viktor Rozgic, Spyros Matsoukas, Chao Wang

Standard acoustic event classification (AEC) solutions require large-scale collection of data from client devices for model optimization.

Classification Continual Learning +3

Neural model robustness for skill routing in large-scale conversational AI systems: A design choice exploration

no code implementations 4 Mar 2021 Han Li, Sunghyun Park, Aswarth Dara, Jinseok Nam, Sungjin Lee, Young-Bum Kim, Spyros Matsoukas, Ruhi Sarikaya

Ensuring model robustness or resilience in the skill routing component is an important problem since skills may dynamically change their subscription in the ontology after the skill routing model has been deployed to production.

Automatic Speech Recognition (ASR) +3

A scalable framework for learning from implicit user feedback to improve natural language understanding in large-scale conversational AI systems

no code implementations EMNLP 2021 Sunghyun Park, Han Li, Ameen Patel, Sidharth Mudgal, Sungjin Lee, Young-Bum Kim, Spyros Matsoukas, Ruhi Sarikaya

Natural Language Understanding (NLU) is an established component within a conversational AI or digital assistant system, and it is responsible for producing semantic understanding of a user request.

Natural Language Understanding

Towards Data-efficient Modeling for Wake Word Spotting

no code implementations 13 Oct 2020 Yixin Gao, Yuriy Mishchenko, Anish Shah, Spyros Matsoukas, Shiv Vitaladevuni

Wake word (WW) spotting is challenging in far-field not only because of the interference in signal transmission but also the complexity in acoustic environments.

Data Augmentation

Data Augmentation for Training Dialog Models Robust to Speech Recognition Errors

no code implementations WS 2020 Longshaokan Wang, Maryam Fazel-Zarandi, Aditya Tiwari, Spyros Matsoukas, Lazaros Polymenakos

Speech-based virtual assistants, such as Amazon Alexa, Google Assistant, and Apple Siri, typically convert users' audio signals to text data through automatic speech recognition (ASR) and feed the text to downstream dialog models for natural language understanding and response generation.

Automatic Speech Recognition (ASR) +4

Multi-domain Conversation Quality Evaluation via User Satisfaction Estimation

no code implementations 18 Nov 2019 Praveen Kumar Bodigutla, Lazaros Polymenakos, Spyros Matsoukas

To address these gaps, we created a new Response Quality annotation scheme, introduced five new domain-independent feature sets and experimented with six machine learning models to estimate User Satisfaction at both turn and dialogue level.

Dialogue Management

Compression of Acoustic Event Detection Models With Quantized Distillation

no code implementations 1 Jul 2019 Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang

Acoustic Event Detection (AED), aiming at detecting categories of events based on audio signals, has found application in many intelligent systems.

Event Detection Knowledge Distillation +1

Compression of Acoustic Event Detection Models with Low-rank Matrix Factorization and Quantization Training

no code implementations NIPS Workshop CDNNRIA 2018 Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang

In this paper, we present a compression approach based on the combination of low-rank matrix factorization and quantization training, to reduce complexity for neural network based acoustic event detection (AED) models.

Event Detection Quantization

Parsing Coordination for Spoken Language Understanding

no code implementations 26 Oct 2018 Sanchit Agarwal, Rahul Goel, Tagyoung Chung, Abhishek Sethi, Arindam Mandal, Spyros Matsoukas

Typical spoken language understanding systems provide narrow semantic parses using a domain-specific ontology.

Spoken Language Understanding

A Re-ranker Scheme for Integrating Large Scale NLU models

no code implementations 25 Sep 2018 Chengwei Su, Rahul Gupta, Shankar Ananthakrishnan, Spyros Matsoukas

An ideal re-ranker will exhibit the following two properties: (a) it should prefer the most relevant hypothesis for the given input as the top hypothesis and, (b) the interpretation scores corresponding to each hypothesis produced by the re-ranker should be calibrated.

Natural Language Understanding

Device-directed Utterance Detection

no code implementations 7 Aug 2018 Sri Harish Mallidi, Roland Maas, Kyle Goehner, Ariya Rastrow, Spyros Matsoukas, Björn Hoffmeister

In this work, we propose a classifier for distinguishing device-directed queries from background speech in the context of interactions with voice assistants.

Automatic Speech Recognition (ASR) +1

Max-Pooling Loss Training of Long Short-Term Memory Networks for Small-Footprint Keyword Spotting

no code implementations 5 May 2017 Ming Sun, Anirudh Raju, George Tucker, Sankaran Panchapagesan, Geng-Shen Fu, Arindam Mandal, Spyros Matsoukas, Nikko Strom, Shiv Vitaladevuni

Finally, the max-pooling loss trained LSTM initialized with a cross-entropy pre-trained network shows the best performance, which yields $67.6\%$ relative reduction compared to baseline feed-forward DNN in Area Under the Curve (AUC) measure.

Small-Footprint Keyword Spotting
