Search Results for author: Yuhao Cheng

Found 12 papers, 6 papers with code

AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation

1 code implementation • 3 Jun 2024 • Junhao Cheng, Xi Lu, Hanhui Li, Khun Loun Zai, Baiqiao Yin, Yuhao Cheng, Yiqiang Yan, Xiaodan Liang

As cutting-edge Text-to-Image (T2I) generation models already excel at producing remarkable single images, an even more challenging task, i.e., multi-turn interactive image generation, begins to attract the attention of related research communities.

Image Generation

Topo4D: Topology-Preserving Gaussian Splatting for High-Fidelity 4D Head Capture

no code implementations • 1 Jun 2024 • Xuanchen Li, Yuhao Cheng, Xingyu Ren, Haozhe Jia, Di Xu, Wenhan Zhu, Yichao Yan

To simplify this process, we propose Topo4D, a novel framework for automatic geometry and texture generation, which optimizes densely aligned 4D heads and 8K texture maps directly from calibrated multi-view time-series images.

8k, Face Reconstruction, +2

Rethink Predicting the Optical Flow with the Kinetics Perspective

1 code implementation • 21 May 2024 • Yuhao Cheng, Siru Zhang, Yiqiang Yan

Furthermore, comprehensive experiments and ablation studies show that the proposed approach to predicting optical flow achieves performance comparable to state-of-the-art methods, and on some metrics it outperforms the correlation-based method, especially in scenes containing occlusion and fast motion.

Optical Flow Estimation

TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation

1 code implementation • 29 Apr 2024 • Junhao Cheng, Baiqiao Yin, Kaixin Cai, Minbin Huang, Hanhui Li, Yuxin He, Xi Lu, Yue Li, Yifei Li, Yuhao Cheng, Yiqiang Yan, Xiaodan Liang

To address this issue, we introduce TheaterGen, a training-free framework that integrates large language models (LLMs) and text-to-image (T2I) models to provide the capability of multi-turn image generation.

Denoising, Image Generation, +2

ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving

1 code implementation • 25 Apr 2024 • Jiehui Huang, Xiao Dong, Wenhui Song, Hanhui Li, Jun Zhou, Yuhao Cheng, Shutao Liao, Long Chen, Yiqiang Yan, Shengcai Liao, Xiaodan Liang

ConsistentID comprises two key components: a multimodal facial prompt generator that combines facial features, corresponding facial descriptions and the overall facial context to enhance precision in facial details, and an ID-preservation network optimized through the facial attention localization strategy, aimed at preserving ID consistency in facial regions.


3D-Aware Face Editing via Warping-Guided Latent Direction Learning

no code implementations • CVPR 2024 • Yuhao Cheng, Zhuo Chen, Xingyu Ren, Wenhan Zhu, Zhengqin Xu, Di Xu, Changpeng Yang, Yichao Yan

To address the problem of distortion caused by tri-plane warping, we train a warp-aware encoder to project the warped face onto a standardized latent space.

Attribute, Facial Editing

LSCD: A Large-Scale Screen Content Dataset for Video Compression

no code implementations • 18 Aug 2023 • Yuhao Cheng, Siru Zhang, Yiqiang Yan, Rong Chen, Yun Zhang

Multimedia compression allows us to watch videos, view pictures, and hear sounds within a limited bandwidth, which has helped the internet flourish.

Video Compression

GANHead: Towards Generative Animatable Neural Head Avatars

no code implementations • CVPR 2023 • Sijing Wu, Yichao Yan, Yunhao Li, Yuhao Cheng, Wenhan Zhu, Ke Gao, Xiaobo Li, Guangtao Zhai

To bring digital avatars into people's lives, there is strong demand for efficiently generating complete, realistic, and animatable head avatars.

Head3D: Complete 3D Head Generation via Tri-plane Feature Distillation

no code implementations • 28 Mar 2023 • Yuhao Cheng, Yichao Yan, Wenhan Zhu, Ye Pan, Bowen Pan, Xiaokang Yang

Head generation with diverse identities is an important task in computer vision and computer graphics, widely used in multimedia applications.

Semantic Role Labeling with Associated Memory Network

1 code implementation • NAACL 2019 • Chaoyu Guan, Yuhao Cheng, Hai Zhao

Semantic role labeling (SRL) is the task of recognizing all the predicate-argument pairs of a sentence; its performance has hit a bottleneck after a series of recent works.

Semantic Role Labeling, Sentence
