no code implementations • 1 Apr 2025 • Shuyu Li, Shulei Ji, ZiHao Wang, Songruoyao Wu, Jiaxing Yu, Kejun Zhang
Multi-modal music generation, using multiple modalities like text, images, and video alongside musical scores and audio as guidance, is an emerging research area with broad applications.
no code implementations • 18 Feb 2025 • Shulei Ji, Songruoyao Wu, ZiHao Wang, Shuyu Li, Kejun Zhang
The burgeoning growth of video-to-music generation can be attributed to the ascendancy of multimodal generative models.
no code implementations • 24 Dec 2024 • Jiaxing Yu, Xinda Wu, Yunfei Xu, Tieyao Zhang, Songruoyao Wu, Le Ma, Kejun Zhang
In this paper, we propose SongGLM, a lyric-to-melody generation system that leverages 2D alignment encoding and multi-task pre-training based on the General Language Model (GLM) to guarantee the alignment and harmony between lyrics and melodies.
1 code implementation • Information Security Conference 2024 • Kejun Zhang, Yutuo Song, Shaofei Xu, Pengcheng Li, Rong Qian, Pengzhi Han, Lingyun Xu
We propose a framework for backdoor attacks in the context of text dataset distillation, termed Text Backdoor Attack under Dataset Distillation (TBADD).
no code implementations • 5 Sep 2024 • Haoxuan Liu, ZiHao Wang, HaoRong Hong, Youwei Feng, Jiaxin Yu, Han Diao, Yunfei Xu, Kejun Zhang
This paper introduces MetaBGM, a groundbreaking framework for generating background music that adapts to dynamic scenes and real-time user interactions.
1 code implementation • 10 Jul 2024 • ZiHao Wang, Le Ma, Yongsheng Feng, Xin Pan, Yuhang Jin, Kejun Zhang
Singing voice conversion (SVC) aims to convert a singer's voice to another singer's from a reference audio while keeping the original semantics.
no code implementations • 3 Jul 2024 • ZiHao Wang, Haoxuan Liu, Jiaxing Yu, Tao Zhang, Yan Liu, Kejun Zhang
This task is aimed at bridging the gap between colloquial language understanding and auditory expression within an AI model, with the ultimate goal of creating songs that accurately satisfy human auditory expectations and structurally align with musical norms.
1 code implementation • 15 Feb 2024 • ZiHao Wang, Shuyu Li, Tao Zhang, Qi Wang, Pengfei Yu, Jinyang Luo, Yan Liu, Ming Xi, Kejun Zhang
To this end, we present MuChin, the first open-source music description benchmark in Chinese colloquial language, designed to evaluate the performance of multimodal LLMs in understanding and describing music.
2 code implementations • 11 Jan 2024 • Yue Liu, Shihao Zhu, Jun Xia, Yingwei Ma, Jian Ma, Xinwang Liu, Shengju Yu, Kejun Zhang, Wenliang Zhong
Concretely, we encode user behavior sequences and initialize the cluster centers (latent intents) as learnable neurons.
1 code implementation • 19 Sep 2023 • Xinda Wu, Zhijie Huang, Kejun Zhang, Jiaxing Yu, Xu Tan, Tieyao Zhang, ZiHao Wang, Lingyun Sun
In particular, subjective evaluations show that, on the melody continuation task, MelodyGLM gains average improvements of 0.82, 0.87, 0.78, and 0.94 in consistency, rhythmicity, structure, and overall quality, respectively.
1 code implementation • 14 May 2023 • ZiHao Wang, Le Ma, Chen Zhang, Bo Han, Yunfei Xu, Yikai Wang, Xinyi Chen, HaoRong Hong, Wenbo Liu, Xinda Wu, Kejun Zhang
Music as an emotional intervention medium has important applications in scenarios such as music therapy, games, and movies.
1 code implementation • 11 Jan 2023 • Kejun Zhang, Xinda Wu, Tieyao Zhang, Zhijie Huang, Xu Tan, Qihao Liang, Songruoyao Wu, Lingyun Sun
Although deep learning has revolutionized music generation, existing methods for structured melody generation follow an end-to-end left-to-right note-by-note generative paradigm and treat each note equally.
no code implementations • 13 Sep 2022 • ZiHao Wang, Qihao Liang, Kejun Zhang, Yuxing Wang, Chen Zhang, Pengfei Yu, Yongsheng Feng, Wenbo Liu, Yikai Wang, Yuntai Bao, Yiheng Yang
In this paper, we propose SongDriver, a real-time music accompaniment generation system without logical latency or exposure bias.
no code implementations • Findings (ACL) 2022 • Fenfei Guo, Chen Zhang, Zhirui Zhang, Qixin He, Kejun Zhang, Jun Xie, Jordan Boyd-Graber
This paper develops automatic song translation (AST) for tonal languages and addresses the unique challenge of aligning words' tones with the melody of a song in addition to conveying the original meaning.
1 code implementation • 21 Feb 2022 • Hang Zhao, Chen Zhang, Belei Zhu, Zejun Ma, Kejun Zhang
To our knowledge, S3T is the first method combining the Swin Transformer with a self-supervised learning method for music classification.
1 code implementation • 20 Sep 2021 • Zeqian Ju, Peiling Lu, Xu Tan, Rui Wang, Chen Zhang, Songruoyao Wu, Kejun Zhang, Xiangyang Li, Tao Qin, Tie-Yan Liu
In this paper, we develop TeleMelody, a two-stage lyric-to-melody generation system with music template (e.g., tonality, chord progression, rhythm pattern, and cadence) to bridge the gap between lyrics and melodies (i.e., the system consists of a lyric-to-template module and a template-to-melody module).
no code implementations • 16 Sep 2021 • Chen Zhang, Jiaxing Yu, LuChin Chang, Xu Tan, Jiawei Chen, Tao Qin, Kejun Zhang
Considering that there is a large amount of ASR training data, a straightforward method is to leverage ASR data to enhance ALT training.
Automatic Lyrics Transcription
Automatic Speech Recognition
no code implementations • 17 Dec 2020 • Chen Zhang, Yi Ren, Xu Tan, Jinglin Liu, Kejun Zhang, Tao Qin, Sheng Zhao, Tie-Yan Liu
In DenoiSpeech, we handle real-world noisy speech by modeling the fine-grained frame-level noise with a noise condition module, which is jointly trained with the TTS model.
1 code implementation • 22 Aug 2020 • Guanghao Yin, Shou-qian Sun, Dian Yu, Dejian Li, Kejun Zhang
In this paper, we attempt to fuse the subject's individual EDA features with the external evoked music features.