Search Results for author: Junjie Zheng

Found 8 papers, 0 papers with code

MM-MovieDubber: Towards Multi-Modal Learning for Multi-Modal Movie Dubbing

no code implementations • 22 May 2025 • Junjie Zheng, Zihao Chen, Chaofan Ding, Yunming Liang, Yihan Fan, Huan Yang, Lei Xie, Xinhan Di

Current movie dubbing technology can produce the desired speech using a reference voice and input video, maintaining perfect synchronization with the visuals while effectively conveying the intended emotions.

Language Modeling

DeepDubber-V1: Towards High Quality and Dialogue, Narration, Monologue Adaptive Movie Dubbing Via Multi-Modal Chain-of-Thoughts Reasoning Guidance

no code implementations • 31 Mar 2025 • Junjie Zheng, Zihao Chen, Chaofan Ding, Xinhan Di

First, it utilizes multimodal Chain-of-Thought (CoT) reasoning methods on visual inputs to understand dubbing styles and fine-grained attributes.

Large Language Model

DeepAudio-V1: Towards Multi-Modal Multi-Stage End-to-End Video to Speech and Audio Generation

no code implementations • 28 Mar 2025 • Haomin Zhang, Chang Liu, Junjie Zheng, Zihao Chen, Chaofan Ding, Xinhan Di

However, in real-world scenarios, speech and audio often coexist in videos, and the end-to-end generation of synchronous speech and audio given video and text conditions is not well studied.

Audio Generation, Audio-Visual Synchronization (+2 more)

YingSound: Video-Guided Sound Effects Generation with Multi-modal Chain-of-Thought Controls

no code implementations • 12 Dec 2024 • Zihao Chen, Haomin Zhang, Xinhan Di, Haoyu Wang, Sizhe Shan, Junjie Zheng, Yunming Liang, Yihan Fan, Xinfa Zhu, Wenjie Tian, Yihua Wang, Chaofan Ding, Lei Xie

Generating sound effects for product-level videos, where only a small amount of labeled data is available for diverse scenes, requires the production of high-quality sounds in few-shot settings.

Audio Generation

Bailing-TTS: Chinese Dialectal Speech Synthesis Towards Human-like Spontaneous Representation

no code implementations • 1 Aug 2024 • Xinhan Di, Zihao Chen, Yunming Liang, Junjie Zheng, Yihua Wang, Chaofan Ding

Large-scale text-to-speech (TTS) models have made significant progress recently. However, they still fall short in the generation of Chinese dialectal speech.

Representation Learning, Speech Synthesis (+2 more)

A Novel Approach for Stable Selection of Informative Redundant Features from High Dimensional fMRI Data

no code implementations • 27 Jun 2015 • Yi-Lun Wang, Zhiqiang Li, Yifeng Wang, Xiaona Wang, Junjie Zheng, Xujuan Duan, Huafu Chen

Feature selection is among the most important components because it not only helps enhance classification accuracy but also, perhaps more importantly, enables the discovery of potential biomarkers.

feature selection

Randomized Structural Sparsity based Support Identification with Applications to Locating Activated or Discriminative Brain Areas: A Multi-center Reproducibility Study

no code implementations • 7 Jun 2015 • Yi-Lun Wang, Sheng Zhang, Junjie Zheng, Heng Chen, Huafu Chen

In this paper, we focus on how to locate the relevant or discriminative brain regions related to an external stimulus or a certain mental disease, a task also called support identification, based on neuroimaging data.

feature selection

Randomized Structural Sparsity via Constrained Block Subsampling for Improved Sensitivity of Discriminative Voxel Identification

no code implementations • 17 Oct 2014 • Yi-Lun Wang, Junjie Zheng, Sheng Zhang, Xujun Duan, Huafu Chen

In this paper, we consider voxel selection for functional Magnetic Resonance Imaging (fMRI) brain data with the aim of finding a more complete set of probably correlated discriminative voxels, thus improving interpretation of the discovered potential biomarkers.

feature selection
