no code implementations • 11 Jul 2022 • Muqiao Yang, Ian Lane, Shinji Watanabe
Continual Learning, also known as Lifelong Learning, aims to continually learn from new data as it becomes available.
Automatic Speech Recognition (ASR)
2 code implementations • 6 Jul 2022 • Yifan Peng, Siddharth Dalmia, Ian Lane, Shinji Watanabe
Conformer has proven to be effective in many speech processing tasks.
no code implementations • WS 2019 • Guan-Lin Chao, Abhinav Rastogi, Semih Yavuz, Dilek Hakkani-Tür, Jindong Chen, Ian Lane
Understanding and conversing about dynamic scenes is one of the key capabilities of AI agents that navigate the environment and convey useful information to humans.
1 code implementation • 5 Jul 2019 • Guan-Lin Chao, Ian Lane
We focus on a specific condition, where the ontology is unknown to the state tracker, but the target slot value (except for none and dontcare), possibly unseen during training, can be found as a word segment in the dialogue context.
no code implementations • 13 Jun 2019 • Guan-Lin Chao, William Chan, Ian Lane
Speech recognition in cocktail-party environments remains a significant challenge for state-of-the-art speech recognition systems, as it is extremely difficult to extract an acoustic signal of an individual speaker from a background of overlapping speech with similar frequency and temporal characteristics.
no code implementations • 27 Nov 2018 • Tae Jin Park, Kyu Han, Ian Lane, Panayiotis Georgiou
This work presents a novel approach to leverage lexical information for speaker diarization.
no code implementations • 7 Oct 2018 • Ming Zeng, Haoxiang Gao, Tong Yu, Ole J. Mengshoel, Helge Langseth, Ian Lane, Xiaobing Liu
To address these issues, we propose two attention models for human activity recognition: temporal attention and sensor attention.
no code implementations • NAACL 2018 • Bing Liu, Ian Lane
In this thesis proposal, we address the limitations of conventional pipeline design of task-oriented dialog systems and propose end-to-end learning solutions.
no code implementations • WS 2018 • Bing Liu, Ian Lane
We further discuss the covariate shift problem in online adversarial dialog learning and show how we can address that with partial access to user feedback.
no code implementations • 22 Jan 2018 • Ming Zeng, Tong Yu, Xiao Wang, Le T. Nguyen, Ole J. Mengshoel, Ian Lane
Labeled data used for training activity recognition classifiers are usually limited in terms of size and diversity.
no code implementations • 29 Dec 2017 • Kyu J. Han, Akshay Chandrashekaran, Jungsuk Kim, Ian Lane
This method was applied with the CallHome training corpus and improved individual system performance by 6.1% on average (relative) on the CallHome portion of the evaluation set, with no performance loss on the Switchboard portion.
no code implementations • 30 Nov 2017 • Bing Liu, Ian Lane
A model that produces such shared representations can be combined with models trained on individual domain SLU data to reduce the number of training samples required for developing a new domain.
no code implementations • 22 Nov 2017 • Bing Liu, Tong Yu, Ian Lane, Ole J. Mengshoel
Moreover, we report encouraging response selection performance of the proposed neural bandit model using the Recall@k metric for a small set of online training samples.
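As a rough illustration of the Recall@k metric mentioned above, the following sketch assumes each query has exactly one correct response and that the model returns its candidates ranked best first. The function name `recall_at_k` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def recall_at_k(
    ranked_candidates: Sequence[Sequence[str]],
    gold_responses: Sequence[str],
    k: int,
) -> float:
    """Fraction of queries whose gold response appears in the top-k candidates.

    Assumes one gold response per query and candidates sorted best first.
    """
    if len(ranked_candidates) != len(gold_responses):
        raise ValueError("need one ranked candidate list per gold response")
    if not gold_responses:
        raise ValueError("at least one query is required")
    hits = sum(
        1
        for candidates, gold in zip(ranked_candidates, gold_responses)
        if gold in candidates[:k]
    )
    return hits / len(gold_responses)


# Toy example: two queries, three candidate responses each (hypothetical data).
rankings = [["a", "b", "c"], ["b", "a", "c"]]
gold = ["a", "a"]
print(recall_at_k(rankings, gold, k=1))  # 0.5
print(recall_at_k(rankings, gold, k=2))  # 1.0
```

With a single correct response per query, Recall@k reduces to a top-k hit rate; other conventions exist when several responses count as correct.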
no code implementations • 18 Sep 2017 • Bing Liu, Ian Lane
In this paper, we present a deep reinforcement learning (RL) framework for iterative dialog policy optimization in end-to-end task-oriented dialog systems.
no code implementations • 20 Aug 2017 • Bing Liu, Ian Lane
We present a novel end-to-end trainable neural network model for task-oriented dialog systems.
no code implementations • 15 Jan 2017 • Bing Liu, Ian Lane
In this work, we propose contextual language models that incorporate dialog level discourse information into language modeling.
no code implementations • 20 Sep 2016 • Benjamin Elizalde, Ankit Shah, Siddharth Dalmia, Min Hun Lee, Rohan Badlani, Anurag Kumar, Bhiksha Raj, Ian Lane
The audio event detectors are trained on the labeled audio and run on the unlabeled audio downloaded from YouTube.
no code implementations • WS 2016 • Bing Liu, Ian Lane
On SLU tasks, our joint model outperforms the independent task training model by 22.3% on intent detection error rate, with slight degradation on slot filling F1 score.
Ranked #3 on Intent Detection on ATIS
6 code implementations • 6 Sep 2016 • Bing Liu, Ian Lane
Attention-based encoder-decoder neural network models have recently shown promising results in machine translation and speech recognition.
Ranked #2 on Intent Detection on ATIS
no code implementations • 13 Jul 2016 • Sebastian Sager, Benjamin Elizalde, Damian Borth, Christian Schulze, Bhiksha Raj, Ian Lane
One contribution is the previously unavailable documentation of the challenges and implications of collecting audio recordings with these types of labels.
no code implementations • 12 Jul 2016 • Benjamin Elizalde, Guan-Lin Chao, Ming Zeng, Ian Lane
In particular, we present a method to compute and use semantic acoustic features to perform city-identification and the features show semantic evidence of the identification.
no code implementations • 11 Jan 2016 • Suyoun Kim, Bhiksha Raj, Ian Lane
We propose a novel deep neural network architecture for speech recognition that explicitly employs knowledge of the background environmental noise within a deep neural network acoustic model.
no code implementations • 19 Nov 2015 • Suyoun Kim, Ian Lane
Integration of multiple microphone data is one of the key ways to achieve robust speech recognition in noisy environments or when the speaker is located at some distance from the input device.
no code implementations • 7 Apr 2015 • William Chan, Nan Rosemary Ke, Ian Lane
The small DNN trained on the soft RNN alignments achieved a 3.93 WER on the Wall Street Journal (WSJ) eval92 task, compared to a baseline 4.54 WER, a relative improvement of more than 13%.
Automatic Speech Recognition (ASR)
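The relative improvement quoted for the WSJ eval92 result can be checked from the two reported WER figures. The sketch below assumes the usual definition of relative WER reduction, (baseline - new) / baseline; the helper name `relative_improvement` is hypothetical.

```python
def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction, expressed as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# WER figures as reported in the abstract snippet above.
rel = relative_improvement(baseline_wer=4.54, new_wer=3.93)
print(f"{rel:.1%}")  # 13.4%
```

This is consistent with the stated "more than 13%" relative improvement.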
no code implementations • 7 Apr 2015 • William Chan, Ian Lane
We present a novel deep Recurrent Neural Network (RNN) model for acoustic modelling in Automatic Speech Recognition (ASR).
Ranked #8 on Speech Recognition on WSJ eval92