no code implementations • 7 Jul 2024 • Qiqi He, Xuchen Song, Weituo Hao, Ju-Chiang Wang, Wei-Tsung Lu, Wei Li
For the case where the artist information is available, we extend the audio-based model to take multimodal inputs and develop a framework, called MultiModal Contrastive (MMC) learning, to enhance the training.
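The abstract does not spell out the MMC objective, but contrastive learning over paired modalities is typically an InfoNCE-style loss. As a rough illustration only (all shapes and names hypothetical, not the paper's actual formulation), a symmetric contrastive loss pulling each audio embedding toward its matching artist embedding might look like:

```python
import numpy as np

def info_nce(audio_emb, artist_emb, temperature=0.1):
    """Symmetric InfoNCE-style loss over a batch of paired embeddings.

    Row i of each matrix is a matching (audio, artist) pair; all other
    rows in the batch serve as in-batch negatives.
    """
    # L2-normalise so dot products are cosine similarities
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    b = artist_emb / np.linalg.norm(artist_emb, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (N, N); positives on the diagonal

    def xent(l):
        # cross-entropy of the diagonal (matching) entries
        l = l - l.max(axis=1, keepdims=True)
        log_p = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.diag(log_p).mean()

    # average both directions: audio -> artist and artist -> audio
    return 0.5 * (xent(logits) + xent(logits.T))
```

With well-aligned pairs the loss approaches zero; shuffling the artist rows drives it up, which is the signal the multimodal training would exploit.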
no code implementations • 8 Sep 2023 • Haoran Xiang, Junyu Dai, Xuchen Song, Furao Shen
Measuring the similarity between artists and music is crucial for music retrieval and recommendation, and addressing the long-tail phenomenon is increasingly important.
no code implementations • 28 Aug 2023 • Bing Han, Junyu Dai, Weituo Hao, Xinyan He, Dong Guo, Jitong Chen, Yuxuan Wang, Yanmin Qian, Xuchen Song
We evaluated InstructME on instrument editing, remixing, and multi-round editing.
1 code implementation • 21 Dec 2022 • Zihao He, Weituo Hao, Wei-Tsung Lu, Changyou Chen, Kristina Lerman, Xuchen Song
Music captioning has gained significant attention in the wake of the rising prominence of streaming media platforms.
no code implementations • 18 Oct 2021 • Ju-Chiang Wang, Jordan B. L. Smith, Wei-Tsung Lu, Xuchen Song
Music structure analysis (MSA) methods traditionally search for musically meaningful patterns in audio: homogeneity, repetition, novelty, and segment-length regularity.
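Of the cues listed, novelty is the most algorithmically concrete: the classic approach (Foote's checkerboard kernel, not specific to this paper) slides a signed kernel along the diagonal of a self-similarity matrix, peaking where two homogeneous segments meet. A minimal sketch, with hypothetical frame features:

```python
import numpy as np

def checkerboard_kernel(half_width):
    """Gaussian-tapered checkerboard kernel for novelty detection."""
    idx = np.arange(2 * half_width) - half_width + 0.5
    sign = np.sign(idx)                       # -1 before centre, +1 after
    taper = np.exp(-0.5 * (idx / (half_width / 2)) ** 2)
    return np.outer(sign, sign) * np.outer(taper, taper)

def novelty_curve(features, half_width=4):
    """Novelty from a self-similarity matrix: correlate a checkerboard
    kernel along the diagonal; peaks mark segment boundaries."""
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-9)
    ssm = f @ f.T                             # cosine self-similarity
    kernel = checkerboard_kernel(half_width)
    pad = np.pad(ssm, half_width, mode="edge")
    n = 2 * half_width
    return np.array([np.sum(kernel * pad[i:i + n, i:i + n])
                     for i in range(len(ssm))])
```

On a feature sequence made of two homogeneous blocks, the curve is flat inside each block and peaks at the join, which is exactly the boundary cue MSA methods search for.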
no code implementations • 26 Mar 2021 • Ju-Chiang Wang, Jordan B. L. Smith, Jitong Chen, Xuchen Song, Yuxuan Wang
This paper presents a novel supervised approach to detecting the chorus segments in popular music.
no code implementations • 26 Mar 2021 • Jiawen Huang, Ju-Chiang Wang, Jordan B. L. Smith, Xuchen Song, Yuxuan Wang
A music mashup combines audio elements from two or more songs to create a new work.
3 code implementations • 5 Oct 2020 • Qiuqiang Kong, Bochen Li, Xuchen Song, Yuan Wan, Yuxuan Wang
In addition, previous AMT systems are sensitive to misaligned onset and offset labels in audio recordings.
Ranked #4 on Music Transcription on MAESTRO
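One common way to blunt that label-misalignment sensitivity is to replace the single binary onset frame with a continuous target that decays with distance from the annotated onset time, so nearby frames remain informative even when the label is a few milliseconds off. A hedged sketch of such target construction (the hop size and decay width `J` are illustrative, not the paper's values):

```python
import numpy as np

def onset_regression_targets(onset_times, n_frames, hop=0.01, J=5):
    """Continuous per-frame onset targets.

    Each annotated onset contributes a value that decays linearly from 1.0
    at the exact onset time to 0.0 at J frames away, instead of a single
    binary 1 on the nearest frame.
    """
    frame_times = np.arange(n_frames) * hop
    targets = np.zeros(n_frames)
    for t in onset_times:
        decay = 1.0 - np.abs(frame_times - t) / (J * hop)
        targets = np.maximum(targets, np.clip(decay, 0.0, 1.0))
    return targets
```

A model trained to regress these values can then recover onset times at finer-than-frame resolution by interpolating around the predicted peak.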
no code implementations • 26 May 2020 • Dongyang Dai, Li Chen, Yu-Ping Wang, Mu Wang, Rui Xia, Xuchen Song, Zhiyong Wu, Yuxuan Wang
First, the speech synthesis model is pre-trained on both multi-speaker clean data and noisy augmented data; the pre-trained model is then adapted on the new speaker's noisy, low-resource data; finally, by setting the clean-speech condition, the model can synthesize the new speaker's clean voice.
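The three-stage schedule above can be sketched as plain control flow. Everything here is hypothetical scaffolding (the stub model and its `fit`/`synthesize` API stand in for a real conditioned TTS model); only the staging and the condition flags mirror the description:

```python
from dataclasses import dataclass, field

@dataclass
class TTSStub:
    """Stand-in for a noise-conditioned speech-synthesis model."""
    log: list = field(default_factory=list)

    def fit(self, batch, condition):
        self.log.append(("fit", batch, condition))

    def synthesize(self, text, condition):
        return f"{text}|{condition}"

def train_and_adapt(model, clean_data, noisy_aug_data, new_speaker_noisy):
    # Stage 1: pre-train on multi-speaker clean data plus noisy augmented
    # data, tagging each batch with its noise-condition label.
    for batch in clean_data:
        model.fit(batch, condition="clean")
    for batch in noisy_aug_data:
        model.fit(batch, condition="noisy")
    # Stage 2: adapt the pre-trained model on the new speaker's noisy,
    # low-resource recordings (labelled noisy, since that is what they are).
    for batch in new_speaker_noisy:
        model.fit(batch, condition="noisy")
    # Stage 3: at inference, force the clean condition so the adapted
    # voice is synthesized without the noise it was recorded with.
    return lambda text: model.synthesize(text, condition="clean")
```

The key design point is that the noise condition is an explicit input: the model sees noisy data only ever tagged as noisy, so at inference the clean tag disentangles the new speaker's identity from the recording noise.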