no code implementations • 22 May 2025 • Junjie Zheng, Zihao Chen, Chaofan Ding, Yunming Liang, Yihan Fan, Huan Yang, Lei Xie, Xinhan Di
Current movie dubbing technology can produce the desired speech using a reference voice and input video, maintaining perfect synchronization with the visuals while effectively conveying the intended emotions.
no code implementations • 31 Mar 2025 • Junjie Zheng, Zihao Chen, Chaofan Ding, Xinhan Di
First, it utilizes multimodal Chain-of-Thought (CoT) reasoning methods on visual inputs to understand dubbing styles and fine-grained attributes.
no code implementations • 28 Mar 2025 • Haomin Zhang, Chang Liu, Junjie Zheng, Zihao Chen, Chaofan Ding, Xinhan Di
However, in real-world scenarios, speech and audio often coexist in videos, and the end-to-end generation of synchronous speech and audio given video and text conditions is not well studied.
no code implementations • 12 Dec 2024 • Zihao Chen, Haomin Zhang, Xinhan Di, Haoyu Wang, Sizhe Shan, Junjie Zheng, Yunming Liang, Yihan Fan, Xinfa Zhu, Wenjie Tian, Yihua Wang, Chaofan Ding, Lei Xie
Generating sound effects for product-level videos, where only a small amount of labeled data is available for diverse scenes, requires the production of high-quality sounds in few-shot settings.
no code implementations • 1 Aug 2024 • Xinhan Di, Zihao Chen, Yunming Liang, Junjie Zheng, Yihua Wang, Chaofan Ding
Large-scale text-to-speech (TTS) models have made significant progress recently. However, they still fall short in the generation of Chinese dialectal speech.
no code implementations • 27 Jun 2015 • Yi-Lun Wang, Zhiqiang Li, Yifeng Wang, Xiaona Wang, Junjie Zheng, Xujuan Duan, Huafu Chen
Feature selection is among the most important components because it not only helps enhance classification accuracy but also, perhaps more importantly, enables the discovery of potential biomarkers.
no code implementations • 7 Jun 2015 • Yi-Lun Wang, Sheng Zhang, Junjie Zheng, Heng Chen, Huafu Chen
In this paper, we focus on how to use neuroimaging data to locate the relevant or discriminative brain regions related to an external stimulus or a certain mental disease, a task also called support identification.
no code implementations • 17 Oct 2014 • Yi-Lun Wang, Junjie Zheng, Sheng Zhang, Xujun Duan, Huafu Chen
In this paper, we consider voxel selection for functional Magnetic Resonance Imaging (fMRI) brain data with the aim of finding a more complete set of probably correlated discriminative voxels, thus improving interpretation of the discovered potential biomarkers.