3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting

no code implementations 28 May 2024 Qihang Zhang, Yinghao Xu, Chaoyang Wang, Hsin-Ying Lee, Gordon Wetzstein, Bolei Zhou, Ceyuan Yang

This results in a lack of a unified approach to effectively control and manipulate scenes at the 3D level with different levels of granularity.

Real-time 3D-aware Portrait Editing from a Single Image

no code implementations 21 Feb 2024 Qingyan Bai, Zifan Shi, Yinghao Xu, Hao Ouyang, Qiuyu Wang, Ceyuan Yang, Xuan Wang, Gordon Wetzstein, Yujun Shen, Qifeng Chen

This work presents 3DPE, a practical method that can efficiently edit a face image following given prompts, like reference images or text descriptions, in a 3D-aware manner.

BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation

no code implementations 4 Dec 2023 Qihang Zhang, Yinghao Xu, Yujun Shen, Bo Dai, Bolei Zhou, Ceyuan Yang

Large-scale 3D scene generation cannot simply reuse existing 3D object synthesis techniques, since 3D scenes usually have complex spatial configurations and consist of many objects at varying scales.

SMaRt: Improving GANs with Score Matching Regularity

no code implementations 30 Nov 2023 Mengfei Xia, Yujun Shen, Ceyuan Yang, Ran Yi, Wenping Wang, Yong-Jin Liu

In this work, we revisit the mathematical foundations of GANs and show theoretically that the native adversarial loss for GAN training is insufficient to fix the problem that subsets of the generated data manifold with positive Lebesgue measure lie outside the real data manifold.


SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models

1 code implementation 28 Nov 2023 Yuwei Guo, Ceyuan Yang, Anyi Rao, Maneesh Agrawala, Dahua Lin, Bo Dai

The development of text-to-video (T2V), i.e., generating videos with a given text prompt, has been significantly advanced in recent years.

LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models

2 code implementations 26 Sep 2023 Yaohui Wang, Xinyuan Chen, Xin Ma, Shangchen Zhou, Ziqi Huang, Yi Wang, Ceyuan Yang, Yinan He, Jiashuo Yu, Peiqing Yang, Yuwei Guo, Tianxing Wu, Chenyang Si, Yuming Jiang, Cunjian Chen, Chen Change Loy, Bo Dai, Dahua Lin, Yu Qiao, Ziwei Liu

To this end, we propose LaVie, an integrated video generation framework that operates on cascaded video latent diffusion models, comprising a base T2V model, a temporal interpolation model, and a video super-resolution model.

Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis

1 code implementation 7 Sep 2023 Jiapeng Zhu, Ceyuan Yang, Kecheng Zheng, Yinghao Xu, Zifan Shi, Yujun Shen

Due to the difficulty in scaling up, generative adversarial networks (GANs) seem to be falling from grace on the task of text-conditioned image synthesis.

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning

4 code implementations 10 Jul 2023 Yuwei Guo, Ceyuan Yang, Anyi Rao, Zhengyang Liang, Yaohui Wang, Yu Qiao, Maneesh Agrawala, Dahua Lin, Bo Dai

Once trained, the motion module can be inserted into a personalized T2I model to form a personalized animation generator.

Spatial Steerability of GANs via Self-Supervision from Discriminator

no code implementations 20 Jan 2023 Jianyuan Wang, Lalit Bhagat, Ceyuan Yang, Yinghao Xu, Yujun Shen, Hongdong Li, Bolei Zhou

In this work, we propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space or requiring extra annotations.

GH-Feat: Learning Versatile Generative Hierarchical Features from GANs

no code implementations 12 Jan 2023 Yinghao Xu, Yujun Shen, Jiapeng Zhu, Ceyuan Yang, Bolei Zhou

In this work, we show that such a generative feature learned from image synthesis exhibits great potential in solving a wide range of computer vision tasks, including both generative ones and, more importantly, discriminative ones.

LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis

no code implementations ICCV 2023 Jiapeng Zhu, Ceyuan Yang, Yujun Shen, Zifan Shi, Bo Dai, Deli Zhao, Qifeng Chen

This work presents an easy-to-use regularizer for GAN training, which helps explicitly link some axes of the latent space to a set of pixels in the synthesized image.

Towards Smooth Video Composition

1 code implementation 14 Dec 2022 Qihang Zhang, Ceyuan Yang, Yujun Shen, Yinghao Xu, Bolei Zhou

Video generation requires synthesizing consistent and persistent frames with dynamic content over time.

GLeaD: Improving GANs with A Generator-Leading Task

1 code implementation CVPR 2023 Qingyan Bai, Ceyuan Yang, Yinghao Xu, Xihui Liu, Yujiu Yang, Yujun Shen

A generative adversarial network (GAN) is formulated as a two-player game between a generator (G) and a discriminator (D), where D is asked to differentiate whether an image comes from real data or is produced by G. Under such a formulation, D acts as the rule maker and hence tends to dominate the competition.

Improving GANs with A Dynamic Discriminator

no code implementations 20 Sep 2022 Ceyuan Yang, Yujun Shen, Yinghao Xu, Deli Zhao, Bo Dai, Bolei Zhou

Two capacity adjusting schemes are developed for training GANs under different data regimes: i) given a sufficient amount of training data, the discriminator benefits from a progressively increased learning capacity, and ii) when the training data is limited, gradually decreasing the layer width mitigates the over-fitting issue of the discriminator.

Accelerating Diffusion Models via Early Stop of the Diffusion Process

1 code implementation 25 May 2022 Zhaoyang Lyu, Xudong Xu, Ceyuan Yang, Dahua Lin, Bo Dai

By modeling the reverse process of gradually diffusing the data distribution into a Gaussian distribution, generating a sample in DDPMs can be regarded as iteratively denoising a randomly sampled Gaussian noise.

3D-aware Image Synthesis via Learning Structural and Textural Representations

1 code implementation CVPR 2022 Yinghao Xu, Sida Peng, Ceyuan Yang, Yujun Shen, Bolei Zhou

The feature field is further accumulated into a 2D feature map as the textural representation, followed by a neural renderer for appearance synthesis.

Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition

no code implementations CVPR 2022 Yinghao Xu, Fangyun Wei, Xiao Sun, Ceyuan Yang, Yujun Shen, Bo Dai, Bolei Zhou, Stephen Lin

Typically in recent work, the pseudo-labels are obtained by training a model on the labeled data, and then using confident predictions from the model to teach itself.

Improving GAN Equilibrium by Raising Spatial Awareness

1 code implementation CVPR 2022 Jianyuan Wang, Ceyuan Yang, Yinghao Xu, Yujun Shen, Hongdong Li, Bolei Zhou

We further propose to align the spatial awareness of G with the attention map induced from D. In this way, we effectively reduce the information gap between D and G. Extensive results show that our method pushes the two-player game in GANs closer to the equilibrium, leading to better synthesis performance.

One-Shot Generative Domain Adaptation

no code implementations ICCV 2023 Ceyuan Yang, Yujun Shen, Zhiyi Zhang, Yinghao Xu, Jiapeng Zhu, Zhirong Wu, Bolei Zhou

We then equip the well-learned discriminator backbone with an attribute classifier to ensure that the generator captures the appropriate characteristics from the reference.

Improving Out-of-Distribution Robustness of Classifiers Through Interpolated Generative Models

no code implementations 29 Sep 2021 Haoyue Bai, Ceyuan Yang, Yinghao Xu, S.-H. Gary Chan, Bolei Zhou

In this paper, we employ interpolated generative models to generate OoD samples at training time via data augmentation.

Data-Efficient Instance Generation from Instance Discrimination

1 code implementation NeurIPS 2021 Ceyuan Yang, Yujun Shen, Yinghao Xu, Bolei Zhou

Meanwhile, the instance discrimination capability learned by the discriminator is in turn exploited to encourage the generator to produce diverse outputs.

Instance Localization for Self-supervised Detection Pretraining

1 code implementation CVPR 2021 Ceyuan Yang, Zhirong Wu, Bolei Zhou, Stephen Lin

The pretext task is to predict the instance category given the composited images as well as the foreground bounding boxes.

Generative Hierarchical Features from Synthesizing Images

1 code implementation CVPR 2021 Yinghao Xu, Yujun Shen, Jiapeng Zhu, Ceyuan Yang, Bolei Zhou

Generative Adversarial Networks (GANs) have recently advanced image synthesis by learning the underlying distribution of the observed data.

Unsupervised Landmark Learning from Unpaired Data

1 code implementation 29 Jun 2020 Yinghao Xu, Ceyuan Yang, Ziwei Liu, Bo Dai, Bolei Zhou

Recent attempts at unsupervised landmark learning leverage synthesized image pairs that are similar in appearance but different in pose.

Video Representation Learning with Visual Tempo Consistency

1 code implementation 28 Jun 2020 Ceyuan Yang, Yinghao Xu, Bo Dai, Bolei Zhou

Visual tempo, which describes how fast an action goes, has shown its potential in supervised action recognition.

InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs

2 code implementations 18 May 2020 Yujun Shen, Ceyuan Yang, Xiaoou Tang, Bolei Zhou

In this work, we propose a framework called InterFaceGAN to interpret the disentangled face representation learned by the state-of-the-art GAN models and study the properties of the facial semantics encoded in the latent space.

Temporal Pyramid Network for Action Recognition

3 code implementations CVPR 2020 Ceyuan Yang, Yinghao Xu, Jianping Shi, Bo Dai, Bolei Zhou

Previous works often capture the visual tempo through sampling raw videos at multiple rates and constructing an input-level frame pyramid, which usually requires a costly multi-branch network to handle.

Semantic Hierarchy Emerges in Deep Generative Representations for Scene Synthesis

2 code implementations 21 Nov 2019 Ceyuan Yang, Yujun Shen, Bolei Zhou

Despite the success of Generative Adversarial Networks (GANs) in image synthesis, there is still insufficient understanding of what generative models have learned inside the deep generative representations and how photo-realistic images can be composed from the layer-wise stochasticity introduced in recent GANs.

Learning Where to Focus for Efficient Video Object Detection

1 code implementation ECCV 2020 Zhengkai Jiang, Yu Liu, Ceyuan Yang, Jihao Liu, Peng Gao, Qian Zhang, Shiming Xiang, Chunhong Pan

Transferring existing image-based detectors to video is non-trivial since frame quality is always degraded by partial occlusion, rare poses, and motion blur.

Semantic Hierarchy Emerges in the Deep Generative Representations for Scene Synthesis

no code implementations 25 Sep 2019 Ceyuan Yang, Yujun Shen, Bolei Zhou

Despite the success of Generative Adversarial Networks (GANs) in image synthesis, there is still insufficient understanding of what networks have learned inside the deep generative representations and how photo-realistic images can be composed from random noise.

Penalizing Top Performers: Conservative Loss for Semantic Segmentation Adaptation

no code implementations ECCV 2018 Xinge Zhu, Hui Zhou, Ceyuan Yang, Jianping Shi, Dahua Lin

Due to the expensive and time-consuming annotations (e.g., segmentation) for real-world images, recent works in computer vision resort to synthetic data.

Pose Guided Human Video Generation

no code implementations ECCV 2018 Ceyuan Yang, Zhe Wang, Xinge Zhu, Chen Huang, Jianping Shi, Dahua Lin

Human pose, on the other hand, can represent motion patterns intrinsically and interpretably, and impose the geometric constraints regardless of appearance.

