1 code implementation • 5 Feb 2024 • Jisheng Bai, Mou Wang, Haohe Liu, Han Yin, Yafei Jia, Siwei Huang, Yutong Du, Dongzhe Zhang, Dongyuan Shi, Woon-Seng Gan, Mark D. Plumbley, Susanto Rahardja, Bin Xiang, Jianfeng Chen
In addition, considering the abundance of unlabeled acoustic scene data in the real world, it is important to study possible ways to utilize this unlabeled data.
no code implementations • 11 Jan 2024 • Han Yin, Mou Wang, Jisheng Bai, Dongyuan Shi, Woon-Seng Gan, Jianfeng Chen
This paper presents a detailed description of our proposed methods for the ICASSP 2024 Cadenza Challenge.
no code implementations • 23 Nov 2023 • Han Yin, Jisheng Bai, Mou Wang, Dongyuan Shi, Woon-Seng Gan, Jianfeng Chen
In this paper, we first propose an interactive dual-conformer (IDC) module, in which a cross-interaction mechanism is applied to effectively exploit the information from soft labels.
1 code implementation • 21 Nov 2023 • Jisheng Bai, Han Yin, Mou Wang, Dongyuan Shi, Woon-Seng Gan, Jianfeng Chen, Susanto Rahardja
This paper presents AudioLog, a large language model (LLM)-powered audio logging system with hybrid token-semantic contrastive learning.
no code implementations • 8 Jun 2023 • Han Yin, Jisheng Bai, Mou Wang, Siwei Huang, Yafei Jia, Jianfeng Chen
3D speech enhancement can effectively improve the auditory experience and plays a crucial role in augmented reality technology.