Search Results for author: Muqiao Yang

Found 18 papers, 8 papers with code

Evaluating and Improving Continual Learning in Spoken Language Understanding

no code implementations • 16 Feb 2024 • Muqiao Yang, Xiang Li, Umberto Cappellazzo, Shinji Watanabe, Bhiksha Raj

In this work, we propose an evaluation methodology that provides a unified evaluation on stability, plasticity, and generalizability in continual learning.

Continual Learning Spoken Language Understanding

Continual Contrastive Spoken Language Understanding

no code implementations • 4 Oct 2023 • Umberto Cappellazzo, Enrico Fini, Muqiao Yang, Daniele Falavigna, Alessio Brutti, Bhiksha Raj

In this paper, we investigate the problem of learning sequence-to-sequence models for spoken language understanding in a class-incremental learning (CIL) setting and we propose COCONUT, a CIL method that relies on the combination of experience replay and contrastive learning.

Class Incremental Learning Contrastive Learning +2

uSee: Unified Speech Enhancement and Editing with Conditional Diffusion Models

no code implementations • 2 Oct 2023 • Muqiao Yang, Chunlei Zhang, Yong Xu, Zhongweiyang Xu, Heming Wang, Bhiksha Raj, Dong Yu

Speech enhancement aims to improve the quality and intelligibility of speech signals, and speech editing refers to the process of editing speech according to specific user needs.

Denoising Self-Supervised Learning +2

Unifying Robustness and Fidelity: A Comprehensive Study of Pretrained Generative Methods for Speech Enhancement in Adverse Conditions

no code implementations • 16 Sep 2023 • Heming Wang, Meng Yu, Hao Zhang, Chunlei Zhang, Zhongweiyang Xu, Muqiao Yang, Yixuan Zhang, Dong Yu

Enhancing speech signal quality in adverse acoustic environments is a persistent challenge in speech processing.

Speech Enhancement

Rethinking Voice-Face Correlation: A Geometry View

no code implementations • 26 Jul 2023 • Xiang Li, Yandong Wen, Muqiao Yang, Jinglu Wang, Rita Singh, Bhiksha Raj

Previous works on voice-face matching and voice-guided face synthesis demonstrate strong correlations between voice and face, but mainly rely on coarse semantic cues such as gender, age, and emotion.

3D Face Reconstruction Face Generation

Sequence-Level Knowledge Distillation for Class-Incremental End-to-End Spoken Language Understanding

1 code implementation • 23 May 2023 • Umberto Cappellazzo, Muqiao Yang, Daniele Falavigna, Alessio Brutti

The ability to learn new concepts sequentially is a major weakness for modern neural networks, which hinders their use in non-stationary environments.

Continual Learning Knowledge Distillation +1

Backdoor Attacks with Input-unique Triggers in NLP

no code implementations • 25 Mar 2023 • Xukun Zhou, Jiwei Li, Tianwei Zhang, Lingjuan Lyu, Muqiao Yang, Jun He

Backdoor attack aims at inducing neural models to make incorrect predictions for poison data while keeping predictions on the clean dataset unchanged, which creates a considerable threat to current natural language processing (NLP) systems.

Backdoor Attack Language Modelling +1

PAAPLoss: A Phonetic-Aligned Acoustic Parameter Loss for Speech Enhancement

2 code implementations • 16 Feb 2023 • Muqiao Yang, Joseph Konan, David Bick, Yunyang Zeng, Shuo Han, Anurag Kumar, Shinji Watanabe, Bhiksha Raj

We can add this criterion as an auxiliary loss to any model that produces speech, to optimize speech outputs to match the values of clean speech in these features.

Speech Enhancement Time Series +1

TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement

2 code implementations • 16 Feb 2023 • Yunyang Zeng, Joseph Konan, Shuo Han, David Bick, Muqiao Yang, Anurag Kumar, Shinji Watanabe, Bhiksha Raj

We propose an objective for perceptual quality based on temporal acoustic parameters.

Speaker Recognition Speech Enhancement

Simulating realistic speech overlaps improves multi-talker ASR

no code implementations • 27 Oct 2022 • Muqiao Yang, Naoyuki Kanda, Xiaofei Wang, Jian Wu, Sunit Sivasankaran, Zhuo Chen, Jinyu Li, Takuya Yoshioka

Multi-talker automatic speech recognition (ASR) has been studied to generate transcriptions of natural conversation including overlapping speech of multiple speakers.

Automatic Speech Recognition (ASR) +2

Online Continual Learning of End-to-End Speech Recognition Models

no code implementations • 11 Jul 2022 • Muqiao Yang, Ian Lane, Shinji Watanabe

Continual Learning, also known as Lifelong Learning, aims to continually learn from new data as it becomes available.

Automatic Speech Recognition (ASR) +3

Improving Speech Enhancement through Fine-Grained Speech Characteristics

1 code implementation • 1 Jul 2022 • Muqiao Yang, Joseph Konan, David Bick, Anurag Kumar, Shinji Watanabe, Bhiksha Raj

We first identify key acoustic parameters that have been found to correlate well with voice quality (e.g., jitter, shimmer, and spectral flux) and then propose objective functions aimed at reducing the difference between clean speech and enhanced speech with respect to these features.

Speech Enhancement

M2Lens: Visualizing and Explaining Multimodal Models for Sentiment Analysis

no code implementations • 17 Jul 2021 • Xingbo Wang, Jianben He, Zhihua Jin, Muqiao Yang, Yong Wang, Huamin Qu

Much research focuses on modeling the complex intra- and inter-modal interactions between different communication channels.

Multimodal Sentiment Analysis

Signal Transformer: Complex-valued Attention and Meta-Learning for Signal Recognition

no code implementations • 5 Jun 2021 • Yihong Dong, Ying Peng, Muqiao Yang, Songtao Lu, Qingjiang Shi

Deep neural networks have been shown in recent years to be useful tools for signal recognition, especially for identifying the nonlinear feature structures of signals.

Meta-Learning Time Series +1

Self-supervised Representation Learning with Relative Predictive Coding

1 code implementation • ICLR 2021 • Yao-Hung Hubert Tsai, Martin Q. Ma, Muqiao Yang, Han Zhao, Louis-Philippe Morency, Ruslan Salakhutdinov

This paper introduces Relative Predictive Coding (RPC), a new contrastive representation learning objective that maintains a good balance among training stability, minibatch size sensitivity, and downstream task performance.

Representation Learning Self-Supervised Learning

Improving Lesion Segmentation for Diabetic Retinopathy using Adversarial Learning

1 code implementation • 27 Jul 2020 • Qiqi Xiao, Jiaxu Zou, Muqiao Yang, Alex Gaudio, Kris Kitani, Asim Smailagic, Pedro Costa, Min Xu

Diabetic Retinopathy (DR) is a leading cause of blindness in working age adults.

Generative Adversarial Network Lesion Segmentation +2

Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis

1 code implementation • EMNLP 2020 • Yao-Hung Hubert Tsai, Martin Q. Ma, Muqiao Yang, Ruslan Salakhutdinov, Louis-Philippe Morency

The human language can be expressed through multiple sources of information known as modalities, including tones of voice, facial gestures, and spoken language.

Emotion Recognition Sentiment Analysis

Complex Transformer: A Framework for Modeling Complex-Valued Sequence

1 code implementation • 22 Oct 2019 • Muqiao Yang, Martin Q. Ma, Dongyu Li, Yao-Hung Hubert Tsai, Ruslan Salakhutdinov

While deep learning has received a surge of interest in a variety of fields in recent years, major deep learning models barely use complex numbers.

Ranked #2 on Music Transcription on MusicNet

Music Transcription
