3 code implementations • 29 Sep 2023 • Xiang Li, Jinglu Wang, Xiaohao Xu, Xiulian Peng, Rita Singh, Yan Lu, Bhiksha Raj
We propose a semantic decomposition method based on product quantization, where the multi-source semantics can be decomposed and represented by several disentangled and noise-suppressed single-source semantics.
no code implementations • 26 May 2023 • Yixin Wan, Yuan Zhou, Xiulian Peng, Kai-Wei Chang, Yan Lu
To begin with, we are among the first to comprehensively investigate mainstream KD techniques on DNS models to resolve the two challenges.
no code implementations • 25 Feb 2023 • Chengyu Zheng, Yuan Zhou, Xiulian Peng, Yuan Zhang, Yan Lu
Time-variant factors often occur in real-world full-duplex communication applications.
no code implementations • 21 Feb 2023 • Chengyu Zheng, Yuan Zhou, Xiulian Peng, Yuan Zhang, Yan Lu
For real-time speech enhancement (SE) including noise suppression, dereverberation and acoustic echo cancellation, the time-variance of the audio signals becomes a severe challenge.
no code implementations • 22 Nov 2022 • Xue Jiang, Xiulian Peng, Yuan Zhang, Yan Lu
Recently, end-to-end neural audio/speech coding has shown great potential to outperform traditional signal-analysis-based audio codecs.
no code implementations • 18 Jul 2022 • Xue Jiang, Xiulian Peng, Huaying Xue, Yuan Zhang, Yan Lu
Neural audio/speech coding has recently demonstrated its capability to deliver high quality at much lower bitrates than traditional methods.
no code implementations • 7 Jul 2022 • Xue Jiang, Xiulian Peng, Huaying Xue, Yuan Zhang, Yan Lu
In this paper, we introduce a cross-scale scalable vector quantization scheme (CSVQ), in which multi-scale features are encoded progressively with stepwise feature fusion and refinement.
no code implementations • 4 Jul 2022 • Xiaoyu Wang, Xiangyu Kong, Xiulian Peng, Yan Lu
In this paper, we propose a multi-modal multi-correlation learning framework targeting the task of audio-visual speech separation.
no code implementations • 24 Jan 2022 • Xue Jiang, Xiulian Peng, Chengyu Zheng, Huaying Xue, Yuan Zhang, Yan Lu
Deep-learning-based methods have shown their advantages over traditional ones in audio coding, but limited attention has been paid to real-time communications (RTC).
no code implementations • 8 Apr 2021 • Yajing Liu, Xiulian Peng, Zhiwei Xiong, Yan Lu
Specifically, we propose a phoneme-based distribution regularization (PbDr) for speech enhancement, which incorporates frame-wise phoneme information into the speech enhancement network in a conditional manner.
no code implementations • 17 Dec 2020 • Chengyu Zheng, Xiulian Peng, Yuan Zhang, Sriram Srinivasan, Yan Lu
In this paper, we propose a novel idea to model speech and noise simultaneously in a two-branch convolutional neural network, namely SN-Net.
Ranked #1 on Speech Enhancement on Deep Noise Suppression (DNS) Challenge (SI-SDR-NB metric)
no code implementations • NeurIPS 2018 • Zhenhua Liu, Jizheng Xu, Xiulian Peng, Ruiqin Xiong
Deep convolutional neural networks have demonstrated their power in a variety of applications.
1 code implementation • ICCV 2017 • Boyi Li, Xiulian Peng, Zhangyang Wang, Jizheng Xu, Dan Feng
This paper proposes an image dehazing model built with a convolutional neural network (CNN), called All-in-One Dehazing Network (AOD-Net).
Ranked #20 on Image Dehazing on SOTS Outdoor
no code implementations • 12 Sep 2017 • Boyi Li, Xiulian Peng, Zhangyang Wang, Jizheng Xu, Dan Feng
Furthermore, we build an End-to-End United Video Dehazing and Detection Network (EVDD-Net), which concatenates and jointly trains EVD-Net with a video object detection model.
2 code implementations • 20 Jul 2017 • Boyi Li, Xiulian Peng, Zhangyang Wang, Jizheng Xu, Dan Feng
This paper proposes an image dehazing model built with a convolutional neural network (CNN), called All-in-One Dehazing Network (AOD-Net).