Search Results for author: Yuwei Guo

Found 13 papers, 5 papers with code

HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation

1 code implementation 24 Jul 2024 Zhenzhi Wang, Yixuan Li, Yanhong Zeng, Youqing Fang, Yuwei Guo, Wenran Liu, Jing Tan, Kai Chen, Tianfan Xue, Bo Dai, Dahua Lin

Moreover, these approaches prioritize 2D human motion and overlook the significance of camera motions in videos, leading to limited control and unstable video generation.

Benchmarking Image Animation +1

Fast and Efficient: Mask Neural Fields for 3D Scene Segmentation

no code implementations 1 Jul 2024 Zihan Gao, Lingling Li, Licheng Jiao, Fang Liu, Xu Liu, Wenping Ma, Yuwei Guo, Shuyuan Yang

Recent advancements in distilling 2D vision-language foundation models into neural fields, like NeRF and 3DGS, enable open-vocabulary segmentation of 3D scenes from 2D multi-view images without the need for precise 3D annotations.

Scene Segmentation

Multiplane Prior Guided Few-Shot Aerial Scene Rendering

no code implementations CVPR 2024 Zihan Gao, Licheng Jiao, Lingling Li, Xu Liu, Fang Liu, Puhua Chen, Yuwei Guo

By investigating NeRF's and Multiplane Image (MPI)'s behavior, we propose to guide the training process of NeRF with a Multiplane Prior.

Image Comprehension SSIM

Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing

no code implementations 7 May 2024 Yi Zuo, Lingling Li, Licheng Jiao, Fang Liu, Xu Liu, Wenping Ma, Shuyuan Yang, Yuwei Guo

In the first training stage, we focus on learning the spatial features (the features of object content) and breaking down the temporal relationships in the video frames by shuffling them.

Object Video Editing

Transferring Modality-Aware Pedestrian Attentive Learning for Visible-Infrared Person Re-identification

no code implementations 12 Dec 2023 Yuwei Guo, WenHao Zhang, Licheng Jiao, Shuang Wang, Shuo Wang, Fang Liu

Visible-infrared person re-identification (VI-ReID) aims to search for the same pedestrian of interest across visible and infrared modalities.

Data Augmentation Person Re-Identification

SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models

1 code implementation 28 Nov 2023 Yuwei Guo, Ceyuan Yang, Anyi Rao, Maneesh Agrawala, Dahua Lin, Bo Dai

The development of text-to-video (T2V), i.e., generating videos with a given text prompt, has been significantly advanced in recent years.

Video Generation

LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models

2 code implementations 26 Sep 2023 Yaohui Wang, Xinyuan Chen, Xin Ma, Shangchen Zhou, Ziqi Huang, Yi Wang, Ceyuan Yang, Yinan He, Jiashuo Yu, Peiqing Yang, Yuwei Guo, Tianxing Wu, Chenyang Si, Yuming Jiang, Cunjian Chen, Chen Change Loy, Bo Dai, Dahua Lin, Yu Qiao, Ziwei Liu

To this end, we propose LaVie, an integrated video generation framework that operates on cascaded video latent diffusion models, comprising a base T2V model, a temporal interpolation model, and a video super-resolution model.

Text-to-Video Generation Video Generation +1

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning

7 code implementations 10 Jul 2023 Yuwei Guo, Ceyuan Yang, Anyi Rao, Zhengyang Liang, Yaohui Wang, Yu Qiao, Maneesh Agrawala, Dahua Lin, Bo Dai

Once trained, the motion module can be inserted into a personalized T2I model to form a personalized animation generator.

Image Animation

Dynamic Storyboard Generation in an Engine-based Virtual Environment for Video Production

no code implementations 30 Jan 2023 Anyi Rao, Xuekun Jiang, Yuwei Guo, Linning Xu, Lei Yang, Libiao Jin, Dahua Lin, Bo Dai

Amateurs working on mini-films and short-form videos usually spend lots of time and effort on the multi-round complicated process of setting and adjusting scenes, plots, and cameras to deliver satisfying video shots.

Temporal and Contextual Transformer for Multi-Camera Editing of TV Shows

no code implementations 17 Oct 2022 Anyi Rao, Xuekun Jiang, Sichen Wang, Yuwei Guo, Zihao Liu, Bo Dai, Long Pang, Xiaoyu Wu, Dahua Lin, Libiao Jin

The ability to choose an appropriate camera view among multiple cameras plays a vital role in TV show delivery.

More Separable and Easier to Segment: A Cluster Alignment Method for Cross-Domain Semantic Segmentation

no code implementations 7 May 2021 Shuang Wang, Dong Zhao, Yi Li, Chi Zhang, Yuwei Guo, Qi Zang, Biao Hou, Licheng Jiao

Feature alignment between domains is one of the mainstream methods for Unsupervised Domain Adaptation (UDA) semantic segmentation.

Clustering Segmentation +2
