Search Results for author: Luchuan Song

Found 17 papers, 9 papers with code

GestureLSM: Latent Shortcut based Co-Speech Gesture Generation with Spatial-Temporal Modeling

1 code implementation 31 Jan 2025 Pinxin Liu, Luchuan Song, Junhua Huang, Haiyang Liu, Chenliang Xu

To overcome the suboptimal performance of the flow matching baseline, we propose latent shortcut learning and beta distribution time stamp sampling during training to enhance gesture synthesis quality and accelerate inference.

Denoising, Gesture Generation

Generative AI for Cel-Animation: A Survey

1 code implementation 8 Jan 2025 Yunlong Tang, Junjia Guo, Pinxin Liu, Zhiyuan Wang, Hang Hua, Jia-Xing Zhong, Yunzhong Xiao, Chao Huang, Luchuan Song, Susan Liang, Yizhi Song, Liu He, Jing Bi, Mingqian Feng, Xinyang Li, Zeliang Zhang, Chenliang Xu

The traditional celluloid (cel) animation production pipeline encompasses multiple essential steps, including storyboarding, layout design, keyframe animation, inbetweening, and colorization, which demand substantial manual effort, technical expertise, and significant time investment.

Colorization, Layout Design, +1 more

Free-viewpoint Human Animation with Pose-correlated Reference Selection

no code implementations 23 Dec 2024 Fa-Ting Hong, Zhan Xu, Haiyang Liu, Qinjie Lin, Luchuan Song, Zhixin Shu, Yang Zhou, Duygu Ceylan, Dan Xu

Diffusion-based human animation aims to animate a human character based on a source human image as well as driving signals such as a sequence of poses.

Human Animation

EAGLE: Egocentric AGgregated Language-video Engine

no code implementations 26 Sep 2024 Jing Bi, Yunlong Tang, Luchuan Song, Ali Vosoughi, Nguyen Nguyen, Chenliang Xu

The rapid evolution of egocentric video analysis brings new insights into understanding human activities and intentions from a first-person perspective.

Action Recognition, Language Modeling, +6 more

TextToon: Real-Time Text Toonify Head Avatar from Single Video

no code implementations 23 Sep 2024 Luchuan Song, Lele Chen, Celong Liu, Pinxin Liu, Chenliang Xu

Given a short monocular video sequence and a written instruction about the avatar style, our model can generate a high-fidelity toonified avatar that can be driven in real-time by another video with arbitrary identities.

Contrastive Learning

Adaptive Super Resolution For One-Shot Talking-Head Generation

1 code implementation 23 Mar 2024 Luchuan Song, Pinxin Liu, Guojun Yin, Chenliang Xu

In this work, we propose an adaptive high-quality talking-head video generation method, which synthesizes high-resolution video without additional pre-trained modules.

Decoder, Super-Resolution, +2 more

GaussianStyle: Gaussian Head Avatar via StyleGAN

1 code implementation 1 Feb 2024 Pinxin Liu, Luchuan Song, Daoan Zhang, Hang Hua, Yunlong Tang, Huaijin Tu, Jiebo Luo, Chenliang Xu

Existing methods like Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) have made significant strides in facial attribute control, such as facial animation and component editing, yet they struggle with fine-grained representation and scalability in dynamic head modeling.

3DGS, Attribute, +4 more

Tri$^{2}$-plane: Thinking Head Avatar via Feature Pyramid

1 code implementation 17 Jan 2024 Luchuan Song, Pinxin Liu, Lele Chen, Guojun Yin, Chenliang Xu

Recent years have witnessed considerable achievements in facial avatar reconstruction with neural volume rendering.

Video Understanding with Large Language Models: A Survey

1 code implementation 29 Dec 2023 Yunlong Tang, Jing Bi, Siting Xu, Luchuan Song, Susan Liang, Teng Wang, Daoan Zhang, Jie An, Jingyang Lin, Rongyi Zhu, Ali Vosoughi, Chao Huang, Zeliang Zhang, Pinxin Liu, Mingqian Feng, Feng Zheng, JianGuo Zhang, Ping Luo, Jiebo Luo, Chenliang Xu

With the burgeoning growth of online video platforms and the escalating volume of video content, the demand for proficient video understanding tools has intensified markedly.

Survey, Video Understanding

IDRNet: Intervention-Driven Relation Network for Semantic Segmentation

1 code implementation NeurIPS 2023 Zhenchao Jin, Xiaowei Hu, Lingting Zhu, Luchuan Song, Li Yuan, Lequan Yu

Next, a deletion diagnostics procedure is conducted to model the relations among these semantic-level representations by perceiving the network outputs, and the extracted relations are used to guide the semantic-level representations to interact with each other.

Relation, Relation Network, +1 more

Emotional Listener Portrait: Neural Listener Head Generation with Emotion

no code implementations ICCV 2023 Luchuan Song, Guojun Yin, Zhenchao Jin, Xiaoyi Dong, Chenliang Xu

Listener head generation centers on generating non-verbal behaviors (e.g., smile) of a listener in reference to the information delivered by a speaker.

You Should Look at All Objects

1 code implementation 16 Jul 2022 Zhenchao Jin, Dongdong Yu, Luchuan Song, Zehuan Yuan, Lequan Yu

The feature pyramid network (FPN) is one of the key components of object detectors.

All

Cascaded Residual Density Network for Crowd Counting

no code implementations 29 Jul 2021 Kun Zhao, Luchuan Song, Bin Liu, Qi Chu, Nenghai Yu

Crowd counting is a challenging task due to issues such as scale variation and perspective variation in real crowd scenes.

Crowd Counting

Abnormal Behavior Detection Based on Target Analysis

no code implementations 29 Jul 2021 Luchuan Song, Bin Liu, Huihui Zhu, Qi Chu, Nenghai Yu

To this end, we propose a multivariate fusion method that analyzes each target through three branches: object, action, and motion.

Object

ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

2 code implementations CVPR 2021 Yinan He, Bei Gan, Siyu Chen, Yichun Zhou, Guojun Yin, Luchuan Song, Lu Sheng, Jing Shao, Ziwei Liu

To counter this emerging threat, we construct the ForgeryNet dataset, an extremely large face forgery dataset with unified annotations in image- and video-level data across four tasks: 1) Image Forgery Classification, including two-way (real / fake), three-way (real / fake with identity-replaced forgery approaches / fake with identity-remained forgery approaches), and n-way (real and 15 respective forgery approaches) classification.

Benchmarking, Classification, +2 more
