no code implementations • 16 Feb 2024 • Muqiao Yang, Xiang Li, Umberto Cappellazzo, Shinji Watanabe, Bhiksha Raj
In this work, we propose an evaluation methodology that provides a unified evaluation of stability, plasticity, and generalizability in continual learning.
no code implementations • 4 Oct 2023 • Umberto Cappellazzo, Enrico Fini, Muqiao Yang, Daniele Falavigna, Alessio Brutti, Bhiksha Raj
In this paper, we investigate the problem of learning sequence-to-sequence models for spoken language understanding in a class-incremental learning (CIL) setting and we propose COCONUT, a CIL method that relies on the combination of experience replay and contrastive learning.
no code implementations • 2 Oct 2023 • Muqiao Yang, Chunlei Zhang, Yong Xu, Zhongweiyang Xu, Heming Wang, Bhiksha Raj, Dong Yu
Speech enhancement aims to improve speech signals in terms of quality and intelligibility, and speech editing refers to the process of editing the speech according to specific user needs.
no code implementations • 16 Sep 2023 • Heming Wang, Meng Yu, Hao Zhang, Chunlei Zhang, Zhongweiyang Xu, Muqiao Yang, Yixuan Zhang, Dong Yu
Enhancing speech signal quality in adverse acoustic environments is a persistent challenge in speech processing.
no code implementations • 26 Jul 2023 • Xiang Li, Yandong Wen, Muqiao Yang, Jinglu Wang, Rita Singh, Bhiksha Raj
Previous works on voice-face matching and voice-guided face synthesis demonstrate strong correlations between voice and face, but mainly rely on coarse semantic cues such as gender, age, and emotion.
1 code implementation • 23 May 2023 • Umberto Cappellazzo, Muqiao Yang, Daniele Falavigna, Alessio Brutti
Learning new concepts sequentially is a major weakness of modern neural networks, which hinders their use in non-stationary environments.
no code implementations • 25 Mar 2023 • Xukun Zhou, Jiwei Li, Tianwei Zhang, Lingjuan Lyu, Muqiao Yang, Jun He
Backdoor attacks aim to induce neural models to make incorrect predictions on poisoned data while keeping predictions on the clean dataset unchanged, which poses a considerable threat to current natural language processing (NLP) systems.
2 code implementations • 16 Feb 2023 • Muqiao Yang, Joseph Konan, David Bick, Yunyang Zeng, Shuo Han, Anurag Kumar, Shinji Watanabe, Bhiksha Raj
We can add this criterion as an auxiliary loss to any model that produces speech, to optimize speech outputs to match the values of clean speech in these features.
2 code implementations • 16 Feb 2023 • Yunyang Zeng, Joseph Konan, Shuo Han, David Bick, Muqiao Yang, Anurag Kumar, Shinji Watanabe, Bhiksha Raj
We propose an objective for perceptual quality based on temporal acoustic parameters.
no code implementations • 27 Oct 2022 • Muqiao Yang, Naoyuki Kanda, Xiaofei Wang, Jian Wu, Sunit Sivasankaran, Zhuo Chen, Jinyu Li, Takuya Yoshioka
Multi-talker automatic speech recognition (ASR) has been studied to generate transcriptions of natural conversation including overlapping speech of multiple speakers.
Automatic Speech Recognition (ASR)
no code implementations • 11 Jul 2022 • Muqiao Yang, Ian Lane, Shinji Watanabe
Continual Learning, also known as Lifelong Learning, aims to continually learn from new data as it becomes available.
Automatic Speech Recognition (ASR)
1 code implementation • 1 Jul 2022 • Muqiao Yang, Joseph Konan, David Bick, Anurag Kumar, Shinji Watanabe, Bhiksha Raj
We first identify key acoustic parameters that have been found to correlate well with voice quality (e.g., jitter, shimmer, and spectral flux) and then propose objective functions aimed at reducing the difference between clean speech and enhanced speech with respect to these features.
no code implementations • 17 Jul 2021 • Xingbo Wang, Jianben He, Zhihua Jin, Muqiao Yang, Yong Wang, Huamin Qu
Much research focuses on modeling the complex intra- and inter-modal interactions between different communication channels.
no code implementations • 5 Jun 2021 • Yihong Dong, Ying Peng, Muqiao Yang, Songtao Lu, Qingjiang Shi
Deep neural networks have been shown in recent years to be a useful class of tools for signal recognition, especially for identifying the nonlinear feature structures of signals.
1 code implementation • ICLR 2021 • Yao-Hung Hubert Tsai, Martin Q. Ma, Muqiao Yang, Han Zhao, Louis-Philippe Morency, Ruslan Salakhutdinov
This paper introduces Relative Predictive Coding (RPC), a new contrastive representation learning objective that maintains a good balance among training stability, minibatch size sensitivity, and downstream task performance.
1 code implementation • 27 Jul 2020 • Qiqi Xiao, Jiaxu Zou, Muqiao Yang, Alex Gaudio, Kris Kitani, Asim Smailagic, Pedro Costa, Min Xu
Diabetic Retinopathy (DR) is a leading cause of blindness in working age adults.
1 code implementation • EMNLP 2020 • Yao-Hung Hubert Tsai, Martin Q. Ma, Muqiao Yang, Ruslan Salakhutdinov, Louis-Philippe Morency
Human language can be expressed through multiple sources of information known as modalities, including tones of voice, facial gestures, and spoken language.
1 code implementation • 22 Oct 2019 • Muqiao Yang, Martin Q. Ma, Dongyu Li, Yao-Hung Hubert Tsai, Ruslan Salakhutdinov
While deep learning has received a surge of interest in a variety of fields in recent years, major deep learning models barely use complex numbers.
Ranked #2 on Music Transcription on MusicNet