no code implementations • NAACL 2016 • Paul Crook, Alex Marin, Vipul Agarwal, Khushboo Aggarwal, Tasos Anastasakos, Ravi Bikkula, Daniel Boies, Asli Celikyilmaz, Senthilkumar Chandramohan, Zhaleh Feizollahi, Roman Holenstein, Minwoo Jeong, Omar Khan, Young-Bum Kim, Elizabeth Krawczyk, Xiaohu Liu, Danko Panic, Vasiliy Radostev, Nikhil Ramesh, Jean-Philippe Robichaud, Alexandre Rochette, Logan Stromberg, Ruhi Sarikaya
no code implementations • COLING 2016 • Young-Bum Kim, Karl Stratos, Ruhi Sarikaya
In many applications such as personal digital assistants, there is a constant need for new domains to increase the system's coverage of user queries.
no code implementations • COLING 2016 • Young-Bum Kim, Karl Stratos, Ruhi Sarikaya
Popular techniques for domain adaptation such as the feature augmentation method of Daumé III (2009) have mostly been considered for sparse binary-valued features, but not for dense real-valued features such as those used in neural networks.
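The feature augmentation method mentioned above duplicates each feature vector into a shared copy plus one copy per domain, zeroing out the copies for non-matching domains. A minimal sketch (the function name and domain labels are illustrative, not from the paper):

```python
import numpy as np

def augment(x, domain, domains):
    """Feature augmentation a la Daume III: concatenate a shared copy of
    the feature vector with one per-domain block, where only the block
    for the example's own domain is non-zero."""
    blocks = [x]  # shared copy, active for every domain
    for d in domains:
        blocks.append(x if d == domain else np.zeros_like(x))
    return np.concatenate(blocks)

x = np.array([1.0, 2.0])
# shared block + "music" block populated, "weather" block zeroed
aug = augment(x, "music", ["music", "weather"])
```

A shared weight learns domain-general behavior while the per-domain blocks absorb domain-specific deviations.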
no code implementations • ACL 2017 • Young-Bum Kim, Karl Stratos, Dongchan Kim
When given domain K + 1, our model uses a weighted combination of the K domain experts' feedback along with its own opinion to make predictions on the new domain.
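The weighted combination described above can be sketched as an attention-weighted mixture over the experts' label distributions plus the model's own; the softmax weighting and the score inputs here are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def expert_combined_prediction(expert_probs, own_probs, scores):
    """Mix K domain experts' label distributions with the new domain's
    own prediction, weighting each by a (learned) attention score.
    `scores` has K + 1 entries: one per expert plus one for self."""
    weights = softmax(np.asarray(scores))
    stacked = np.vstack(expert_probs + [own_probs])
    return weights @ stacked  # still a valid probability distribution

expert_probs = [np.array([0.7, 0.3]), np.array([0.2, 0.8])]
own = np.array([0.5, 0.5])
combined = expert_combined_prediction(expert_probs, own, [1.0, 0.0, 2.0])
```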
no code implementations • ACL 2017 • Young-Bum Kim, Karl Stratos, Dongchan Kim
Both cause a distribution mismatch between training and evaluation, leading to a model that overfits the flawed training data and performs poorly on the test data.
no code implementations • EMNLP 2017 • Joo-Kyung Kim, Young-Bum Kim, Ruhi Sarikaya, Eric Fosler-Lussier
Evaluating on POS datasets from 14 languages in the Universal Dependencies corpus, we show that the proposed transfer learning model improves the POS tagging performance of the target languages without exploiting any linguistic knowledge between the source language and the target language.
no code implementations • 29 Nov 2017 • Young-Bum Kim, Sungjin Lee, Ruhi Sarikaya
In multi-turn dialogs, natural language understanding models can make avoidable errors when they are blind to contextual information.
no code implementations • 16 Jan 2018 • Young-Bum Kim, Sungjin Lee, Karl Stratos
In practice, most spoken language understanding systems process user input in a pipelined manner: first the domain is predicted, then the intent and semantic slots are inferred according to the semantic frames of the predicted domain.
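The pipelined flow described above can be sketched as follows; the classifier objects, domain names, and intent labels are hypothetical stand-ins for trained models:

```python
def pipeline(utterance, domain_clf, intent_clfs, slot_taggers):
    """Pipelined SLU: predict the domain first, then run only that
    domain's intent classifier and slot tagger on the utterance."""
    domain = domain_clf(utterance)
    intent = intent_clfs[domain](utterance)
    slots = slot_taggers[domain](utterance)
    return {"domain": domain, "intent": intent, "slots": slots}

# toy stand-ins for trained models
domain_clf = lambda u: "music" if "play" in u else "weather"
intent_clfs = {"music": lambda u: "PlayMusic", "weather": lambda u: "GetWeather"}
slot_taggers = {"music": lambda u: {"song": u.split()[-1]}, "weather": lambda u: {}}

result = pipeline("play yesterday", domain_clf, intent_clfs, slot_taggers)
```

A known weakness of this design, which motivates joint modeling, is that a domain misprediction propagates: the wrong intent classifier and slot tagger are consulted downstream.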
no code implementations • NAACL 2018 • Young-Bum Kim, Dongchan Kim, Joo-Kyung Kim, Ruhi Sarikaya
Intelligent personal digital assistants (IPDAs), a popular real-life application with spoken language understanding capabilities, can cover potentially thousands of overlapping domains for natural language understanding, and the task of finding the best domain to handle an utterance becomes a challenging problem on a large scale.
no code implementations • 22 Apr 2018 • Young-Bum Kim, Dongchan Kim, Anjishnu Kumar, Ruhi Sarikaya
In this paper, we explore the task of mapping spoken language utterances to one of thousands of natural language understanding domains in intelligent personal digital assistants (IPDAs).
no code implementations • COLING 2018 • Chanhee Lee, Young-Bum Kim, Dongyub Lee, Heuiseok Lim
Generating character-level features is an important step for achieving good results in various natural language processing tasks.
no code implementations • COLING 2018 • Andrew Matteson, Chanhee Lee, Young-Bum Kim, Heuiseok Lim
Because Korean is a highly agglutinative, character-rich language, previous work on Korean morphological analysis typically employs sub-character features known as graphemes or otherwise relies on comprehensive prior linguistic knowledge (i.e., a dictionary of known morphological transformation forms, or actions).
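The grapheme features mentioned above come from decomposing each precomposed Hangul syllable into its constituent jamo, which follows directly from the Unicode Hangul composition formula (this decomposition is standard Unicode arithmetic; whether the paper uses exactly this routine is an assumption):

```python
# Jamo inventories in Unicode composition order
LEADS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")
VOWELS = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")
TAILS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

def to_graphemes(syllable):
    """Decompose one precomposed Hangul syllable (U+AC00..U+D7A3) into
    its lead consonant, vowel, and optional tail consonant."""
    code = ord(syllable) - 0xAC00
    lead, rest = divmod(code, 588)   # 588 = 21 vowels * 28 tail slots
    vowel, tail = divmod(rest, 28)
    return LEADS[lead] + VOWELS[vowel] + TAILS[tail]
```

For example, the syllable 한 decomposes into the three graphemes ㅎ, ㅏ, ㄴ, giving a model sub-character units to work with.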
no code implementations • 29 Jun 2018 • Joo-Kyung Kim, Young-Bum Kim
In domain classification for spoken dialog systems, correct detection of out-of-domain (OOD) utterances is crucial because it reduces confusion and unnecessary interaction costs between users and the systems.
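A common baseline for the OOD detection task described above (not necessarily the method this paper proposes) is to reject an utterance as out-of-domain when the top in-domain confidence falls below a threshold:

```python
def route(utterance, classify, threshold=0.7):
    """Confidence-threshold OOD detection: accept the top-scoring
    domain only if its probability clears the threshold, otherwise
    label the utterance out-of-domain."""
    probs = classify(utterance)  # dict mapping domain -> probability
    best, p = max(probs.items(), key=lambda kv: kv[1])
    return best if p >= threshold else "OOD"

# toy classifier stand-in
classify = lambda u: {"music": 0.9, "weather": 0.1} if "play" in u else {"music": 0.4, "weather": 0.45}
```

The threshold trades off false accepts (OOD utterances handled by some domain) against false rejects (in-domain utterances bounced back to the user).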
no code implementations • ACL 2018 • Young-Bum Kim, Dongchan Kim, Anjishnu Kumar, Ruhi Sarikaya
In this paper, we explore the task of mapping spoken language utterances to one of thousands of natural language understanding domains in intelligent personal digital assistants (IPDAs).
no code implementations • 13 Dec 2018 • Jihwan Lee, Dongchan Kim, Ruhi Sarikaya, Young-Bum Kim
Our proposed model learns the vector representation of intents based on the slots tied to these intents by aggregating the representations of the slots.
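The slot-based intent representation described above can be sketched as pooling the embeddings of an intent's slots; mean pooling and the slot names here are illustrative choices, not necessarily the paper's aggregation function:

```python
import numpy as np

def intent_embedding(slot_names, slot_vectors):
    """Represent an intent as the mean of the embeddings of the slots
    tied to it, so intents sharing slots land near each other."""
    return np.mean([slot_vectors[s] for s in slot_names], axis=0)

slot_vectors = {
    "artist": np.array([1.0, 0.0]),
    "song":   np.array([0.0, 1.0]),
}
play_music = intent_embedding(["artist", "song"], slot_vectors)
```

A practical appeal of this construction is that a new intent gets a usable vector immediately from its slot schema, before any utterance-level training data exists for it.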
no code implementations • EMNLP 2018 • Joo-Kyung Kim, Young-Bum Kim
The attention weights are explicitly encouraged to be similar to the corresponding elements of the ground-truth's one-hot vector by supervised attention, and the attention information of the other enabled domains is leveraged through self-distillation.
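The supervised-attention objective described above can be expressed as a cross-entropy term pulling the attention distribution toward the ground-truth one-hot vector; this loss form is an illustrative sketch, and the self-distillation part is omitted:

```python
import numpy as np

def supervised_attention_loss(attn, gold_index):
    """Cross-entropy between the attention distribution and the
    one-hot ground truth, which reduces to the negative log of the
    attention weight on the gold element."""
    return -np.log(attn[gold_index] + 1e-12)

sharp = np.array([0.90, 0.05, 0.05])   # attention concentrated on gold
flat  = np.array([0.34, 0.33, 0.33])   # attention spread out
```

Minimizing this term pushes the gold element's attention weight toward 1, i.e., toward the one-hot target.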
no code implementations • NAACL 2019 • Jihwan Lee, Ruhi Sarikaya, Young-Bum Kim
In this paper, we introduce an approach for leveraging available data across multiple locales sharing the same language to 1) improve domain classification model accuracy in Spoken Language Understanding and user experience even if new locales do not have sufficient data and 2) reduce the cost of scaling the domain classifier to a large number of locales.
no code implementations • NAACL 2019 • Han Li, Jihwan Lee, Sidharth Mudgal, Ruhi Sarikaya, Young-Bum Kim
This is a major component in mainstream IPDAs in industry.
no code implementations • 8 Mar 2020 • Joo-Kyung Kim, Young-Bum Kim
In large-scale domain classification, an utterance can be handled by multiple domains with overlapped capabilities.
no code implementations • 29 May 2020 • Dookun Park, Hao Yuan, Dongmin Kim, Yinglei Zhang, Spyros Matsoukas, Young-Bum Kim, Ruhi Sarikaya, Edward Guo, Yuan Ling, Kevin Quinn, Pham Hung, Benjamin Yao, Sungjin Lee
A widely used approach to tackle this is to collect human annotation data and use it for evaluation or modeling.
no code implementations • NAACL 2021 • Mohammad Kachuee, Hao Yuan, Young-Bum Kim, Sungjin Lee
Moreover, a powerful satisfaction model can be used as an objective function that a conversational agent continuously optimizes for.
no code implementations • EMNLP 2021 • Sunghyun Park, Han Li, Ameen Patel, Sidharth Mudgal, Sungjin Lee, Young-Bum Kim, Spyros Matsoukas, Ruhi Sarikaya
Natural Language Understanding (NLU) is an established component within a conversational AI or digital assistant system, and it is responsible for producing semantic understanding of a user request.
no code implementations • 1 Mar 2021 • Ziming Li, Dookun Park, Julia Kiseleva, Young-Bum Kim, Sungjin Lee
Digital assistants are experiencing rapid growth due to their ability to assist users with day-to-day tasks, where most dialogues span multiple turns.
no code implementations • 4 Mar 2021 • Han Li, Sunghyun Park, Aswarth Dara, Jinseok Nam, Sungjin Lee, Young-Bum Kim, Spyros Matsoukas, Ruhi Sarikaya
Ensuring model robustness or resilience in the skill routing component is an important problem since skills may dynamically change their subscription in the ontology after the skill routing model has been deployed to production.
no code implementations • 26 Apr 2021 • Cheng Wang, Sun Kim, Taiwoo Park, Sajal Choudhary, Sunghyun Park, Young-Bum Kim, Ruhi Sarikaya, Sungjin Lee
Conversational AI systems such as Siri and Alexa have demonstrated their usefulness, directly impacting our daily lives.
no code implementations • Findings (ACL) 2021 • Cheng Wang, Sungjin Lee, Sunghyun Park, Han Li, Young-Bum Kim, Ruhi Sarikaya
Real-world machine learning systems achieve remarkable performance on coarse-grained metrics such as overall accuracy and F1 score.
1 code implementation • ACL 2021 • Xinnuo Xu, Guoyin Wang, Young-Bum Kim, Sungjin Lee
Natural Language Generation (NLG) is a key component in a task-oriented dialogue system, which converts a structured meaning representation (MR) into natural language.
no code implementations • 25 Sep 2021 • Joo-Kyung Kim, Guoyin Wang, Sungjin Lee, Young-Bum Kim
A large-scale conversational agent can suffer from understanding user utterances with various ambiguities such as ASR ambiguity, intent ambiguity, and hypothesis ambiguity.