1 code implementation • 30 Nov 2023 • Jinhua Liang, Xubo Liu, Wenwu Wang, Mark D. Plumbley, Huy Phan, Emmanouil Benetos
Moreover, we improve the audio language model framework by using interleaved audio-text embeddings as the input sequence.
1 code implementation • 26 Jul 2023 • Xubo Liu, Zhongkai Zhu, Haohe Liu, Yi Yuan, Meng Cui, Qiushi Huang, Jinhua Liang, Yin Cao, Qiuqiang Kong, Mark D. Plumbley, Wenwu Wang
Subjective evaluations demonstrate the potential of WavJourney in crafting engaging storytelling audio content from text.
no code implementations • 28 May 2023 • Jinhua Liang, Xubo Liu, Haohe Liu, Huy Phan, Emmanouil Benetos, Mark D. Plumbley, Wenwu Wang
We present the Treff adapter, a training-efficient adapter for CLAP, to boost zero-shot classification performance by making use of a small set of labelled data.
no code implementations • 7 Mar 2023 • Yi Yuan, Haohe Liu, Jinhua Liang, Xubo Liu, Mark D. Plumbley, Wenwu Wang
Deep neural networks have recently achieved breakthroughs in sound generation with text prompts.
no code implementations • 2 Jul 2020 • Jinhua Liang, Tao Zhang, Guoqing Feng
To compress channels, a novel convolutional construction named compact convolution is proposed, drawing on advances in spatial convolution, channel grouping, and pooling operations.