no code implementations • 12 Apr 2025 • Moyang Liu, Kaiying Yan, Yukun Liu, Ruibo Fu, Zhengqi Wen, Xuefei Liu, Chenxing Li
The rapid growth of social media has led to the widespread dissemination of fake news across multiple content forms, including text, images, audio, and video.
no code implementations • 12 Apr 2025 • Moyang Liu, Kaiying Yan, Yukun Liu, Ruibo Fu, Zhengqi Wen, Xuefei Liu, Chenxing Li
Compared to unimodal fake news detection, multimodal fake news detection benefits from the increased availability of information across multiple modalities.
1 code implementation • 13 Jan 2025 • Junhao Zheng, Chengming Shi, Xidi Cai, Qiuke Li, Duzhen Zhang, Chenxing Li, Dong Yu, Qianli Ma
This survey is the first to systematically summarize the potential techniques for incorporating lifelong learning into LLM-based agents.
1 code implementation • 11 Jan 2025 • Yuankun Xie, Xiaopeng Wang, Zhiyong Wang, Ruibo Fu, Zhengqi Wen, Songjun Cao, Long Ma, Chenxing Li, Haonan Cheng, Long Ye
Current research in audio deepfake detection is gradually transitioning from binary classification to multi-class tasks, referred to as the audio deepfake source tracing task.
no code implementations • 24 Nov 2024 • Haojie Zhang, Zhihao Liang, Ruibo Fu, Zhengqi Wen, Xuefei Liu, Chenxing Li, JianHua Tao, Yaling Liang
Then we propose a suitable solution according to the modality differences of image, audio, and video generation.
1 code implementation • 18 Nov 2024 • Duzhen Zhang, Yahan Yu, Chenxing Li, Jiahua Dong, Dong Yu
In a more realistic scenario, local clients receive new entity types continuously, while new local clients collecting novel data may irregularly join the global FNER training.
no code implementations • 23 Sep 2024 • Yuchen Hu, Yu Gu, Chenxing Li, Rilin Chen, Dong Yu
With recent advances in AIGC, video generation has gained a surge of research interest in both academia and industry (e.g., Sora).
no code implementations • 18 Sep 2024 • Xin Qi, Ruibo Fu, Zhengqi Wen, Tao Wang, Chunyu Qiang, JianHua Tao, Chenxing Li, Yi Lu, Shuchen Shi, Zhiyong Wang, Xiaopeng Wang, Yuankun Xie, Yukun Liu, Xuefei Liu, Guanjun Li
In recent years, speech diffusion models have advanced rapidly.
no code implementations • 17 Sep 2024 • Jiarui Hai, Yong Xu, Hao Zhang, Chenxing Li, Helin Wang, Mounya Elhilali, Dong Yu
Latent diffusion models have shown promising results in text-to-audio (T2A) generation tasks, yet previous models have encountered difficulties in generation quality, computational cost, diffusion sampling, and data preparation.
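Latent diffusion T2A systems of this kind typically denoise a compressed audio latent under text conditioning and then decode it back to audio. Below is a minimal sketch of that sampling loop, assuming placeholder modules (`text_encoder`, `denoiser`, `latent_decoder`) and an arbitrary noise schedule; it illustrates the general paradigm, not this paper's model.

```python
# Minimal DDPM-style sampling loop for latent text-to-audio generation.
# All modules (text_encoder, denoiser, latent_decoder) and schedule values
# are placeholders for illustration only.
import torch

@torch.no_grad()
def sample_t2a(text_encoder, denoiser, latent_decoder, prompt,
               steps=50, latent_shape=(1, 8, 256)):
    cond = text_encoder(prompt)                      # text embedding used as conditioning
    z = torch.randn(latent_shape)                    # start from Gaussian noise in latent space
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    for t in reversed(range(steps)):
        eps = denoiser(z, torch.tensor([t]), cond)   # predict the noise added at step t
        a_t, ab_t = alphas[t], alpha_bars[t]
        z = (z - (1 - a_t) / torch.sqrt(1 - ab_t) * eps) / torch.sqrt(a_t)
        if t > 0:                                    # re-inject noise except at the final step
            z = z + torch.sqrt(betas[t]) * torch.randn_like(z)

    return latent_decoder(z)                         # decode the latent back to audio
```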
no code implementations • 14 Sep 2024 • Manjie Xu, Chenxing Li, Xinyi Tu, Yong Ren, Ruibo Fu, Wei Liang, Dong Yu
We introduce Diffusion-based Audio Captioning (DAC), a non-autoregressive diffusion model tailored for diverse and efficient audio captioning.
no code implementations • 14 Sep 2024 • Chenxu Xiong, Ruibo Fu, Shuchen Shi, Zhengqi Wen, JianHua Tao, Tao Wang, Chenxing Li, Chunyu Qiang, Yuankun Xie, Xin Qi, Guanjun Li, Zizheng Yang
Additionally, the Sound Event Reference Style Transfer Dataset (SERST) is introduced for the proposed target style audio generation task, enabling dual-prompt audio generation using both text and audio references.
no code implementations • 10 Jul 2024 • Manjie Xu, Chenxing Li, Xinyi Tu, Yong Ren, Rilin Chen, Yu Gu, Wei Liang, Dong Yu
In this work, we aim to offer insights into the video-to-audio generation paradigm, focusing on three crucial aspects: vision encoders, auxiliary embeddings, and data augmentation techniques.
1 code implementation • 17 Jun 2024 • Di Wang, Meiqi Hu, Yao Jin, Yuchun Miao, Jiaqi Yang, Yichu Xu, Xiaolei Qin, Jiaqi Ma, Lingyu Sun, Chenxing Li, Chuan Fu, Hongruixuan Chen, Chengxi Han, Naoto Yokoya, Jing Zhang, Minqiang Xu, Lin Liu, Lefei Zhang, Chen Wu, Bo Du, DaCheng Tao, Liangpei Zhang
Accurate hyperspectral image (HSI) interpretation is critical for providing valuable insights into various earth observation-related applications such as urban planning, precision agriculture, and environmental monitoring.
no code implementations • 11 May 2024 • Manjie Xu, Chenxing Li, Duzhen Zhang, Dan Su, Wei Liang, Dong Yu
Audio editing involves the arbitrary manipulation of audio content through precise control.
no code implementations • 24 Jan 2024 • Duzhen Zhang, Yahan Yu, Jiahua Dong, Chenxing Li, Dan Su, Chenhui Chu, Dong Yu
In the past year, MultiModal Large Language Models (MM-LLMs) have undergone substantial advancements, augmenting off-the-shelf LLMs to support MM inputs or outputs via cost-effective training strategies.
1 code implementation • 14 Jul 2021 • Lu Zhang, Chenxing Li, Feng Deng, Xiaorui Wang
In detail, the proposed model follows a two-stage pipeline, which separates the three types of audio signals and then performs signal compensation separately.
Ranked #1 on Multi-task Audio Source Separation on MTASS
Audio Source Separation • Multi-task Audio Source Separation • +3
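A minimal sketch of the separate-then-compensate idea described in the entry above: a shared network emits coarse mask-based estimates for three sources from a mixture spectrogram, and a small per-source module then applies residual compensation. Layer types, sizes, and the residual refinement are illustrative assumptions, not the MTASS model's actual architecture.

```python
# Hedged sketch of a two-stage "separate, then compensate" pipeline for
# three-source audio separation. Sizes and modules are placeholders.
import torch
import torch.nn as nn

class TwoStageSeparator(nn.Module):
    def __init__(self, freq_bins=257, hidden=256, num_sources=3):
        super().__init__()
        # Stage 1: shared encoder that predicts one mask per source.
        self.encoder = nn.LSTM(freq_bins, hidden, num_layers=2, batch_first=True)
        self.mask_head = nn.Linear(hidden, freq_bins * num_sources)
        # Stage 2: one small compensation network per source to refine its estimate.
        self.compensators = nn.ModuleList(
            nn.Sequential(nn.Linear(freq_bins, hidden), nn.ReLU(),
                          nn.Linear(hidden, freq_bins))
            for _ in range(num_sources)
        )
        self.num_sources = num_sources

    def forward(self, mix_mag):                   # mix_mag: (batch, time, freq)
        h, _ = self.encoder(mix_mag)
        masks = torch.sigmoid(self.mask_head(h))  # (batch, time, freq * num_sources)
        masks = masks.view(*mix_mag.shape[:2], self.num_sources, -1)
        coarse = masks * mix_mag.unsqueeze(2)     # coarse per-source estimates
        refined = [coarse[:, :, i] + comp(coarse[:, :, i])   # residual compensation
                   for i, comp in enumerate(self.compensators)]
        return torch.stack(refined, dim=2)        # (batch, time, num_sources, freq)

# Usage: estimates = TwoStageSeparator()(torch.randn(2, 100, 257))
```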
no code implementations • 25 Jun 2018 • Chenxing Li, Tieqiang Wang, Shuang Xu, Bo Xu
In this paper, we propose a single-channel speech dereverberation system (DeReGAT) based on convolutional, bidirectional long short-term memory and deep feed-forward neural network (CBLDNN) with generative adversarial training (GAT).
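As a rough illustration of the CBLDNN stack named above (convolutional front-end, bidirectional LSTM, deep feed-forward output), here is a hedged PyTorch sketch; the layer counts and dimensions are assumptions, and generative adversarial training would additionally pair this generator with a discriminator scoring enhanced versus clean speech.

```python
# Illustrative CBLDNN-style generator for spectrogram dereverberation:
# convolutional front-end -> bidirectional LSTM -> deep feed-forward output.
# Sizes are assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn

class CBLDNNGenerator(nn.Module):
    def __init__(self, freq_bins=257, conv_ch=32, lstm_hidden=256):
        super().__init__()
        self.conv = nn.Sequential(                       # convolutional front-end over (time, freq)
            nn.Conv2d(1, conv_ch, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(conv_ch, conv_ch, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.blstm = nn.LSTM(conv_ch * freq_bins, lstm_hidden,
                             batch_first=True, bidirectional=True)
        self.dnn = nn.Sequential(                        # deep feed-forward output stack
            nn.Linear(2 * lstm_hidden, 512), nn.ReLU(),
            nn.Linear(512, freq_bins),
        )

    def forward(self, reverb_mag):                       # (batch, time, freq)
        x = self.conv(reverb_mag.unsqueeze(1))           # (batch, ch, time, freq)
        b, c, t, f = x.shape
        x = x.permute(0, 2, 1, 3).reshape(b, t, c * f)   # flatten channels into features
        x, _ = self.blstm(x)
        return self.dnn(x)                               # enhanced magnitude estimate
```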
1 code implementation • 10 May 2018 • Chenxing Li, Peilun Li, Dong Zhou, Wei Xu, Fan Long
The Conflux consensus protocol represents relationships between blocks as a directed acyclic graph and achieves consensus on a total order of the blocks.
Distributed, Parallel, and Cluster Computing
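To illustrate the general idea of linearizing DAG-structured blocks into one agreed sequence (not Conflux's actual pivot-chain ordering rule), here is a toy topological sort with a deterministic tie-break:

```python
# Toy illustration: derive a deterministic total order over a block DAG.
# This is NOT the Conflux ordering rule; it is a plain topological sort
# with a deterministic tie-break by block id.
import heapq

def total_order(blocks):
    """blocks: dict mapping block_id -> list of parent block_ids (edges point to ancestors)."""
    pending = {b: len(parents) for b, parents in blocks.items()}   # unprocessed ancestors per block
    children = {b: [] for b in blocks}
    for b, parents in blocks.items():
        for p in parents:
            children[p].append(b)

    ready = [b for b, n in pending.items() if n == 0]              # blocks with no remaining ancestors
    heapq.heapify(ready)                                           # deterministic tie-break by id
    order = []
    while ready:
        b = heapq.heappop(ready)
        order.append(b)
        for child in children[b]:
            pending[child] -= 1
            if pending[child] == 0:
                heapq.heappush(ready, child)
    return order

# Example: genesis "a" is referenced by "b" and "c"; "d" references both "b" and "c".
print(total_order({"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}))
# -> ['a', 'b', 'c', 'd']
```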