no code implementations • 6 Jun 2023 • Abishek Komma, Nagesh Panyam Chandrasekarasastry, Timothy Leffel, Anuj Goyal, Angeliki Metallinou, Spyros Matsoukas, Aram Galstyan
Measurement of interaction quality is a critical task for the improvement of spoken dialog systems.
no code implementations • 27 Jun 2022 • Rahil Parikh, Harshavardhan Sundar, Ming Sun, Chao Wang, Spyros Matsoukas
We conclude that this improvement in ASC performance comes from the regularization effect of using AET and not from the network's improved ability to discern between acoustic events.
no code implementations • 22 Mar 2022 • Meng Feng, Chieh-Chi Kao, Qingming Tang, Ming Sun, Viktor Rozgic, Spyros Matsoukas, Chao Wang
Standard acoustic event classification (AEC) solutions require large-scale collection of data from client devices for model optimization.
no code implementations • 4 Mar 2021 • Han Li, Sunghyun Park, Aswarth Dara, Jinseok Nam, Sungjin Lee, Young-Bum Kim, Spyros Matsoukas, Ruhi Sarikaya
Ensuring model robustness or resilience in the skill routing component is an important problem since skills may dynamically change their subscription in the ontology after the skill routing model has been deployed to production.
Automatic Speech Recognition (ASR)
no code implementations • 12 Feb 2021 • Mao Li, Bo Yang, Joshua Levy, Andreas Stolcke, Viktor Rozgic, Spyros Matsoukas, Constantinos Papayiannis, Daniel Bone, Chao Wang
Speech emotion recognition (SER) is a key technology to enable more natural human-machine communication.
Ranked #4 on Speech Emotion Recognition on MSP-Podcast (Dominance) (using extra training data)
no code implementations • EMNLP 2021 • Sunghyun Park, Han Li, Ameen Patel, Sidharth Mudgal, Sungjin Lee, Young-Bum Kim, Spyros Matsoukas, Ruhi Sarikaya
Natural Language Understanding (NLU) is an established component within a conversational AI or digital assistant system, and it is responsible for producing semantic understanding of a user request.
no code implementations • 13 Oct 2020 • Yixin Gao, Yuriy Mishchenko, Anish Shah, Spyros Matsoukas, Shiv Vitaladevuni
Wake word (WW) spotting is challenging in far-field conditions, not only because of interference in signal transmission but also because of the complexity of acoustic environments.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Praveen Kumar Bodigutla, Aditya Tiwari, Josep Valls Vargas, Lazaros Polymenakos, Spyros Matsoukas
Dialogue-level quality estimation is vital for optimizing data-driven dialogue management.
no code implementations • WS 2020 • Longshaokan Wang, Maryam Fazel-Zarandi, Aditya Tiwari, Spyros Matsoukas, Lazaros Polymenakos
Speech-based virtual assistants, such as Amazon Alexa, Google Assistant, and Apple Siri, typically convert users' audio signals to text data through automatic speech recognition (ASR) and feed the text to downstream dialog models for natural language understanding and response generation.
Automatic Speech Recognition (ASR)
no code implementations • 21 Feb 2020 • Bowen Shi, Ming Sun, Krishna C. Puvvada, Chieh-Chi Kao, Spyros Matsoukas, Chao Wang
We study few-shot acoustic event detection (AED) in this paper.
no code implementations • 18 Nov 2019 • Praveen Kumar Bodigutla, Lazaros Polymenakos, Spyros Matsoukas
To address these gaps, we created a new Response Quality annotation scheme, introduced five new domain-independent feature sets and experimented with six machine learning models to estimate User Satisfaction at both turn and dialogue level.
no code implementations • 8 Nov 2019 • Maryam Fazel-Zarandi, Longshaokan Wang, Aditya Tiwari, Spyros Matsoukas
Training dialog policies for speech-based virtual assistants requires a plethora of conversational data.
Automatic Speech Recognition (ASR)
no code implementations • 19 Aug 2019 • Praveen Kumar Bodigutla, Longshaokan Wang, Kate Ridgeway, Joshua Levy, Swanand Joshi, Alborz Geramifard, Spyros Matsoukas
An automated metric to evaluate dialogue quality is vital for optimizing data-driven dialogue management.
no code implementations • 1 Jul 2019 • Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang
Acoustic Event Detection (AED), aiming at detecting categories of events based on audio signals, has found application in many intelligent systems.
no code implementations • NIPS Workshop CDNNRIA 2018 • Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang
In this paper, we present a compression approach that combines low-rank matrix factorization with quantization training to reduce the complexity of neural-network-based acoustic event detection (AED) models.
no code implementations • 29 Apr 2019 • Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang
This paper presents our work on training acoustic event detection (AED) models using an unlabeled dataset.
no code implementations • 26 Oct 2018 • Sanchit Agarwal, Rahul Goel, Tagyoung Chung, Abhishek Sethi, Arindam Mandal, Spyros Matsoukas
Typical spoken language understanding systems provide narrow semantic parses using a domain-specific ontology.
no code implementations • 19 Oct 2018 • Yuriy Mishchenko, Yusuf Goren, Ming Sun, Chris Beauchene, Spyros Matsoukas, Oleg Rybakov, Shiv Naga Prasad Vitaladevuni
We investigate low-bit quantization to reduce the computational cost of deep neural network (DNN) based keyword spotting (KWS).
no code implementations • NAACL 2019 • Stanislav Peshterliev, John Kearney, Abhyuday Jagannatha, Imre Kiss, Spyros Matsoukas
We explore active learning (AL) for improving the accuracy of new domains in a natural language understanding (NLU) system.
no code implementations • 25 Sep 2018 • Chengwei Su, Rahul Gupta, Shankar Ananthakrishnan, Spyros Matsoukas
An ideal re-ranker will exhibit the following two properties: (a) it should prefer the most relevant hypothesis for the given input as the top hypothesis, and (b) the interpretation scores corresponding to each hypothesis produced by the re-ranker should be calibrated.
no code implementations • 7 Aug 2018 • Sri Harish Mallidi, Roland Maas, Kyle Goehner, Ariya Rastrow, Spyros Matsoukas, Björn Hoffmeister
In this work, we propose a classifier for distinguishing device-directed queries from background speech in the context of interactions with voice assistants.
Automatic Speech Recognition (ASR)
no code implementations • NAACL 2018 • Thomas Kollar, Danielle Berry, Lauren Stuart, Karolina Owczarzak, Tagyoung Chung, Lambert Mathias, Michael Kayser, Bradford Snow, Spyros Matsoukas
This paper introduces a meaning representation for spoken language understanding.
no code implementations • NAACL 2018 • Anuj Goyal, Angeliki Metallinou, Spyros Matsoukas
Fast expansion of natural language functionality of intelligent virtual agents is critical for achieving engaging and informative interactions.
no code implementations • 5 May 2017 • Ming Sun, Anirudh Raju, George Tucker, Sankaran Panchapagesan, Geng-Shen Fu, Arindam Mandal, Spyros Matsoukas, Nikko Strom, Shiv Vitaladevuni
Finally, the LSTM trained with max-pooling loss and initialized from a cross-entropy pre-trained network shows the best performance, yielding a $67.6\%$ relative reduction compared to the baseline feed-forward DNN in the Area Under the Curve (AUC) measure.