WavLLM: Towards Robust and Adaptive Speech Large Language Model

no code implementations • 31 Mar 2024 • Shujie Hu, Long Zhou, Shujie Liu, Sanyuan Chen, Hongkun Hao, Jing Pan, Xunying Liu, Jinyu Li, Sunit Sivasankaran, Linquan Liu, Furu Wei

In this work, we introduce WavLLM, a robust and adaptive speech large language model with dual encoders, and a prompt-aware LoRA weight adapter, optimized by a two-stage curriculum learning approach.

Language Modelling Large Language Model

COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning

no code implementations • 3 Nov 2023 • Jing Pan, Jian Wu, Yashesh Gaur, Sunit Sivasankaran, Zhuo Chen, Shujie Liu, Jinyu Li

With fewer than 20M trainable parameters and as little as 450 hours of English speech data for SQA generation, COSMIC exhibits emergent instruction-following and in-context learning capabilities in speech-to-text tasks.

Domain Adaptation In-Context Learning +4

E-Branchformer: Branchformer with Enhanced merging for speech recognition

1 code implementation • 30 Sep 2022 • Kwangyoun Kim, Felix Wu, Yifan Peng, Jing Pan, Prashant Sridhar, Kyu J. Han, Shinji Watanabe

Conformer, combining convolution and self-attention sequentially to capture both local and global information, has shown remarkable performance and is currently regarded as the state-of-the-art for automatic speech recognition (ASR).

Automatic Speech Recognition (ASR) +1

SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition

no code implementations • 11 Oct 2021 • Jing Pan, Tao Lei, Kwangyoun Kim, Kyu Han, Shinji Watanabe

The Transformer architecture has been well adopted as a dominant architecture in most sequence transduction tasks including automatic speech recognition (ASR), since its attention mechanism excels in capturing long-range dependencies.

Automatic Speech Recognition (ASR) +4

Sensoring and Application of Multimodal Data for the Detection of Freezing of Gait in Parkinson's Disease

no code implementations • 9 Oct 2021 • Wei Zhang, Debin Huang, Hantao Li, Lipeng Wang, Yanzhao Wei, Kang Pan, Lin Ma, Huanhuan Feng, Jing Pan, Yuzhu Guo

The accurate and reliable detection or prediction of freezing of gait (FOG) is important for fall prevention in Parkinson's Disease (PD) and for studying the physiological transitions during the occurrence of FOG.

EEG valid

Leveraging Pre-trained Language Model for Speech Sentiment Analysis

no code implementations • 11 Jun 2021 • Suwon Shon, Pablo Brusco, Jing Pan, Kyu J. Han, Shinji Watanabe

In this paper, we explore the use of pre-trained language models to learn sentiment information of written texts for speech sentiment analysis.

Automatic Speech Recognition (ASR) +4

Multistream CNN for Robust Acoustic Modeling

no code implementations • 21 May 2020 • Kyu J. Han, Jing Pan, Venkata Krishna Naveen Tadala, Tao Ma, Dan Povey

When combined with self-attentive SRU LM rescoring, multistream CNN contributes to ASAPP achieving the best WER of 1.75% on test-clean in LibriSpeech.

Data Augmentation speech-recognition +1

ASAPP-ASR: Multistream CNN and Self-Attentive SRU for SOTA Speech Recognition

no code implementations • 21 May 2020 • Jing Pan, Joshua Shapiro, Jeremy Wohlwend, Kyu J. Han, Tao Lei, Tao Ma

In this paper we present state-of-the-art (SOTA) performance on the LibriSpeech corpus with two novel neural network architectures, a multistream CNN for acoustic modeling and a self-attentive simple recurrent unit (SRU) for language modeling.

Data Augmentation Language Modelling +2

Benchmark Tests of Convolutional Neural Network and Graph Convolutional Network on HorovodRunner Enabled Spark Clusters

1 code implementation • 12 May 2020 • Jing Pan, Wendao Liu, Jing Zhou

The freedom of fast iterations of distributed deep learning tasks is crucial for smaller companies to gain competitive advantages and market shares from big tech giants.

Adversarial Validation Approach to Concept Drift Problem in User Targeting Automation Systems at Uber

no code implementations • 7 Apr 2020 • Jing Pan, Vincent Pham, Mohan Dorairaj, Huigang Chen, Jeong-Yoon Lee

Here, we introduce an adversarial validation approach to concept drift problems in user targeting automation systems.

Order Matters at Fanatics Recommending Sequentially Ordered Products by LSTM Embedded with Word2Vec

no code implementations • 22 Nov 2019 • Jing Pan, Weian Sheng, Santanu Dey

A unique challenge for e-commerce recommendation is that customers are often interested in products that are more advanced than those they have already purchased, but not the reverse.

Recommendation Systems

Moving Object Detection in Video Using Saliency Map and Subspace Learning

no code implementations • 30 Sep 2015 • Yanwei Pang, Li Ye, Xuelong Li, Jing Pan

Many moving object detection algorithms suffer from undesirable false alarms and missed alarms.

Moving Object Detection object-detection

