Search Results for author: Ian Lane

Found 28 papers, 3 papers with code

Online Continual Learning of End-to-End Speech Recognition Models

no code implementations • 11 Jul 2022 • Muqiao Yang, Ian Lane, Shinji Watanabe

Continual Learning, also known as Lifelong Learning, aims to continually learn from new data as it becomes available.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding

4 code implementations • 6 Jul 2022 • Yifan Peng, Siddharth Dalmia, Ian Lane, Shinji Watanabe

Conformer has proven to be effective in many speech processing tasks.

speech-recognition Speech Recognition +1

7,871

Learning Question-Guided Video Representation for Multi-Turn Video Question Answering

no code implementations • WS 2019 • Guan-Lin Chao, Abhinav Rastogi, Semih Yavuz, Dilek Hakkani-Tür, Jindong Chen, Ian Lane

Understanding and conversing about dynamic scenes is one of the key capabilities of AI agents that navigate the environment and convey useful information to humans.

Navigate Question Answering +2

BERT-DST: Scalable End-to-End Dialogue State Tracking with Bidirectional Encoder Representations from Transformer

1 code implementation • 5 Jul 2019 • Guan-Lin Chao, Ian Lane

We focus on a specific condition, where the ontology is unknown to the state tracker, but the target slot value (except for none and dontcare), possibly unseen during training, can be found as word segment in the dialogue context.

Dialogue State Tracking

102

Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments

no code implementations • 13 Jun 2019 • Guan-Lin Chao, William Chan, Ian Lane

Speech recognition in cocktail-party environments remains a significant challenge for state-of-the-art speech recognition systems, as it is extremely difficult to extract an acoustic signal of an individual speaker from a background of overlapping speech with similar frequency and temporal characteristics.

speech-recognition Speech Recognition

Speaker Diarization With Lexical Information

no code implementations • 27 Nov 2018 • Tae Jin Park, Kyu Han, Ian Lane, Panayiotis Georgiou

This work presents a novel approach to leverage lexical information for speaker diarization.

Clustering speaker-diarization +1

Understanding and Improving Recurrent Networks for Human Activity Recognition by Continuous Attention

no code implementations • 7 Oct 2018 • Ming Zeng, Haoxiang Gao, Tong Yu, Ole J. Mengshoel, Helge Langseth, Ian Lane, Xiaobing Liu

To address these issues, we propose two attention models for human activity recognition: temporal attention and sensor attention.

Human Activity Recognition

End-to-End Learning of Task-Oriented Dialogs

no code implementations • NAACL 2018 • Bing Liu, Ian Lane

In this thesis proposal, we address the limitations of conventional pipeline design of task-oriented dialog systems and propose end-to-end learning solutions.

Multi-Task Learning Spoken Language Understanding

Adversarial Learning of Task-Oriented Neural Dialog Models

no code implementations • WS 2018 • Bing Liu, Ian Lane

We further discuss the covariate shift problem in online adversarial dialog learning and show how we can address that with partial access to user feedback.

Dialog Learning Reinforcement Learning (RL)

Semi-Supervised Convolutional Neural Networks for Human Activity Recognition

no code implementations • 22 Jan 2018 • Ming Zeng, Tong Yu, Xiao Wang, Le T. Nguyen, Ole J. Mengshoel, Ian Lane

Labeled data used for training activity recognition classifiers are usually limited in terms of size and diversity.

Feature Engineering Human Activity Recognition

The CAPIO 2017 Conversational Speech Recognition System

no code implementations • 29 Dec 2017 • Kyu J. Han, Akshay Chandrashekaran, Jungsuk Kim, Ian Lane

This method was applied with the CallHome training corpus and improved individual system performances by on average 6.1% (relative) against the CallHome portion of the evaluation set with no performance loss on the Switchboard portion.

Image Classification speech-recognition +1

Multi-Domain Adversarial Learning for Slot Filling in Spoken Language Understanding

no code implementations • 30 Nov 2017 • Bing Liu, Ian Lane

Model that produces such shared representations can be combined with models trained on individual domain SLU data to reduce the amount of training samples required for developing a new domain.

slot-filling Slot Filling +1

Customized Nonlinear Bandits for Online Response Selection in Neural Conversation Models

no code implementations • 22 Nov 2017 • Bing Liu, Tong Yu, Ian Lane, Ole J. Mengshoel

Moreover, we report encouraging response selection performance of the proposed neural bandit model using the Recall@k metric for a small set of online training samples.

Multi-Armed Bandits Response Generation +2

Iterative Policy Learning in End-to-End Trainable Task-Oriented Neural Dialog Models

no code implementations • 18 Sep 2017 • Bing Liu, Ian Lane

In this paper, we present a deep reinforcement learning (RL) framework for iterative dialog policy optimization in end-to-end task-oriented dialog systems.

Reinforcement Learning (RL)

An End-to-End Trainable Neural Network Model with Belief Tracking for Task-Oriented Dialog

no code implementations • 20 Aug 2017 • Bing Liu, Ian Lane

We present a novel end-to-end trainable neural network model for task-oriented dialog systems.

dialog state tracking

Dialog Context Language Modeling with Recurrent Neural Networks

no code implementations • 15 Jan 2017 • Bing Liu, Ian Lane

In this work, we propose contextual language models that incorporate dialog level discourse information into language modeling.

Language Modelling

An Approach for Self-Training Audio Event Detectors Using Web Data

no code implementations • 20 Sep 2016 • Benjamin Elizalde, Ankit Shah, Siddharth Dalmia, Min Hun Lee, Rohan Badlani, Anurag Kumar, Bhiksha Raj, Ian Lane

The audio event detectors are trained on the labeled audio and run on the unlabeled audio downloaded from YouTube.

Event Detection

Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling

6 code implementations • 6 Sep 2016 • Bing Liu, Ian Lane

Attention-based encoder-decoder neural network models have recently shown promising results in machine translation and speech recognition.

Ranked #3 on Intent Detection on ATIS

intent-classification Intent Classification +3

197

Joint Online Spoken Language Understanding and Language Modeling with Recurrent Neural Networks

no code implementations • WS 2016 • Bing Liu, Ian Lane

On SLU tasks, our joint model outperforms the independent task training model by 22.3% on intent detection error rate, with slight degradation on slot filling F1 score.

Ranked #4 on Intent Detection on ATIS

Benchmarking Intent Detection +4

AudioPairBank: Towards A Large-Scale Tag-Pair-Based Audio Content Analysis

no code implementations • 13 Jul 2016 • Sebastian Sager, Benjamin Elizalde, Damian Borth, Christian Schulze, Bhiksha Raj, Ian Lane

One contribution is the previously unavailable documentation of the challenges and implications of collecting audio recordings with this type of label.

TAG

City-Identification of Flickr Videos Using Semantic Acoustic Features

no code implementations • 12 Jul 2016 • Benjamin Elizalde, Guan-Lin Chao, Ming Zeng, Ian Lane

In particular, we present a method to compute and use semantic acoustic features to perform city-identification and the features show semantic evidence of the identification.

Environmental Noise Embeddings for Robust Speech Recognition

no code implementations • 11 Jan 2016 • Suyoun Kim, Bhiksha Raj, Ian Lane

We propose a novel deep neural network architecture for speech recognition that explicitly employs knowledge of the background environmental noise within a deep neural network acoustic model.

Management Multi-Task Learning +2

Recurrent Models for Auditory Attention in Multi-Microphone Distance Speech Recognition

no code implementations • 19 Nov 2015 • Suyoun Kim, Ian Lane

Integration of multiple microphone data is one of the key ways to achieve robust speech recognition in noisy environments or when the speaker is located at some distance from the input device.

Robust Speech Recognition Speech Enhancement +1

Deep Recurrent Neural Networks for Acoustic Modelling

no code implementations • 7 Apr 2015 • William Chan, Ian Lane

We present a novel deep Recurrent Neural Network (RNN) model for acoustic modelling in Automatic Speech Recognition (ASR).

Ranked #10 on Speech Recognition on WSJ eval92

Acoustic Modelling Automatic Speech Recognition +2

Transferring Knowledge from a RNN to a DNN

no code implementations • 7 Apr 2015 • William Chan, Nan Rosemary Ke, Ian Lane

The small DNN trained on the soft RNN alignments achieved a 3.93 WER on the Wall Street Journal (WSJ) eval92 task compared to a baseline 4.54 WER or more than 13% relative improvement.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Situated Language Understanding at 25 Miles per Hour

no code implementations • WS 2014 • Teruhisa Misu, Antoine Raux, Rakesh Gupta, Ian Lane

HRItk: The Human-Robot Interaction ToolKit Rapid Development of Speech-Centric Interactive Systems in ROS

no code implementations • WS 2012 • Ian Lane, Vinay Prasad, Gaurav Sinha, Arlette Umuhoza, Shangyu Luo, Akshay Chandrashekaran, Antoine Raux

Gesture Recognition Object Recognition +2

A Simulation-based Framework for Spoken Language Understanding and Action Selection in Situated Interaction

no code implementations • WS 2012 • David Cohen, Ian Lane

Spoken Language Understanding
