Search Results for author: Ian Lane

Found 28 papers, 3 papers with code

Online Continual Learning of End-to-End Speech Recognition Models

no code implementations 11 Jul 2022 Muqiao Yang, Ian Lane, Shinji Watanabe

Continual Learning, also known as Lifelong Learning, aims to continually learn from new data as it becomes available.

Automatic Speech Recognition (ASR) +3

Learning Question-Guided Video Representation for Multi-Turn Video Question Answering

no code implementations WS 2019 Guan-Lin Chao, Abhinav Rastogi, Semih Yavuz, Dilek Hakkani-Tür, Jindong Chen, Ian Lane

Understanding and conversing about dynamic scenes is one of the key capabilities of AI agents that navigate the environment and convey useful information to humans.

Navigate, Question Answering +2

BERT-DST: Scalable End-to-End Dialogue State Tracking with Bidirectional Encoder Representations from Transformer

1 code implementation 5 Jul 2019 Guan-Lin Chao, Ian Lane

We focus on a specific condition, where the ontology is unknown to the state tracker, but the target slot value (except for none and dontcare), possibly unseen during training, can be found as a word segment in the dialogue context.

Dialogue State Tracking

Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments

no code implementations 13 Jun 2019 Guan-Lin Chao, William Chan, Ian Lane

Speech recognition in cocktail-party environments remains a significant challenge for state-of-the-art speech recognition systems, as it is extremely difficult to extract an acoustic signal of an individual speaker from a background of overlapping speech with similar frequency and temporal characteristics.

Speech Recognition

Speaker Diarization With Lexical Information

no code implementations 27 Nov 2018 Tae Jin Park, Kyu Han, Ian Lane, Panayiotis Georgiou

This work presents a novel approach to leverage lexical information for speaker diarization.

Clustering, Speaker Diarization +1

End-to-End Learning of Task-Oriented Dialogs

no code implementations NAACL 2018 Bing Liu, Ian Lane

In this thesis proposal, we address the limitations of conventional pipeline design of task-oriented dialog systems and propose end-to-end learning solutions.

Multi-Task Learning, Spoken Language Understanding

Adversarial Learning of Task-Oriented Neural Dialog Models

no code implementations WS 2018 Bing Liu, Ian Lane

We further discuss the covariate shift problem in online adversarial dialog learning and show how we can address that with partial access to user feedback.

Dialog Learning, Reinforcement Learning (RL)

The CAPIO 2017 Conversational Speech Recognition System

no code implementations 29 Dec 2017 Kyu J. Han, Akshay Chandrashekaran, Jungsuk Kim, Ian Lane

This method was applied with the CallHome training corpus and improved individual system performances by on average 6.1% (relative) against the CallHome portion of the evaluation set with no performance loss on the Switchboard portion.

Image Classification, Speech Recognition +1

Multi-Domain Adversarial Learning for Slot Filling in Spoken Language Understanding

no code implementations 30 Nov 2017 Bing Liu, Ian Lane

A model that produces such shared representations can be combined with models trained on individual domain SLU data to reduce the amount of training samples required for developing a new domain.

Slot Filling +1

Customized Nonlinear Bandits for Online Response Selection in Neural Conversation Models

no code implementations 22 Nov 2017 Bing Liu, Tong Yu, Ian Lane, Ole J. Mengshoel

Moreover, we report encouraging response selection performance of the proposed neural bandit model using the Recall@k metric for a small set of online training samples.

Multi-Armed Bandits, Response Generation +2

Iterative Policy Learning in End-to-End Trainable Task-Oriented Neural Dialog Models

no code implementations 18 Sep 2017 Bing Liu, Ian Lane

In this paper, we present a deep reinforcement learning (RL) framework for iterative dialog policy optimization in end-to-end task-oriented dialog systems.

Reinforcement Learning (RL)

An End-to-End Trainable Neural Network Model with Belief Tracking for Task-Oriented Dialog

no code implementations 20 Aug 2017 Bing Liu, Ian Lane

We present a novel end-to-end trainable neural network model for task-oriented dialog systems.

Dialog State Tracking

Dialog Context Language Modeling with Recurrent Neural Networks

no code implementations 15 Jan 2017 Bing Liu, Ian Lane

In this work, we propose contextual language models that incorporate dialog level discourse information into language modeling.

Language Modelling

An Approach for Self-Training Audio Event Detectors Using Web Data

no code implementations 20 Sep 2016 Benjamin Elizalde, Ankit Shah, Siddharth Dalmia, Min Hun Lee, Rohan Badlani, Anurag Kumar, Bhiksha Raj, Ian Lane

The audio event detectors are trained on the labeled audio and run on the unlabeled audio downloaded from YouTube.

Event Detection

Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling

6 code implementations 6 Sep 2016 Bing Liu, Ian Lane

Attention-based encoder-decoder neural network models have recently shown promising results in machine translation and speech recognition.

Intent Classification +3

Joint Online Spoken Language Understanding and Language Modeling with Recurrent Neural Networks

no code implementations WS 2016 Bing Liu, Ian Lane

On SLU tasks, our joint model outperforms the independent task training model by 22.3% on intent detection error rate, with slight degradation on slot filling F1 score.

Benchmarking, Intent Detection +4

AudioPairBank: Towards A Large-Scale Tag-Pair-Based Audio Content Analysis

no code implementations 13 Jul 2016 Sebastian Sager, Benjamin Elizalde, Damian Borth, Christian Schulze, Bhiksha Raj, Ian Lane

One contribution is the previously unavailable documentation of the challenges and implications of collecting audio recordings with these types of labels.

TAG

City-Identification of Flickr Videos Using Semantic Acoustic Features

no code implementations 12 Jul 2016 Benjamin Elizalde, Guan-Lin Chao, Ming Zeng, Ian Lane

In particular, we present a method to compute and use semantic acoustic features to perform city-identification and the features show semantic evidence of the identification.

Environmental Noise Embeddings for Robust Speech Recognition

no code implementations 11 Jan 2016 Suyoun Kim, Bhiksha Raj, Ian Lane

We propose a novel deep neural network architecture for speech recognition that explicitly employs knowledge of the background environmental noise within a deep neural network acoustic model.

Management, Multi-Task Learning +2

Recurrent Models for Auditory Attention in Multi-Microphone Distance Speech Recognition

no code implementations 19 Nov 2015 Suyoun Kim, Ian Lane

Integration of multiple microphone data is one of the key ways to achieve robust speech recognition in noisy environments or when the speaker is located at some distance from the input device.

Robust Speech Recognition, Speech Enhancement +1

Deep Recurrent Neural Networks for Acoustic Modelling

no code implementations 7 Apr 2015 William Chan, Ian Lane

We present a novel deep Recurrent Neural Network (RNN) model for acoustic modelling in Automatic Speech Recognition (ASR).

Acoustic Modelling, Automatic Speech Recognition +2

Transferring Knowledge from a RNN to a DNN

no code implementations 7 Apr 2015 William Chan, Nan Rosemary Ke, Ian Lane

The small DNN trained on the soft RNN alignments achieved a 3.93 WER on the Wall Street Journal (WSJ) eval92 task compared to a baseline 4.54 WER or more than 13% relative improvement.

Automatic Speech Recognition (ASR) +1
