Search Results for author: Fan Bao

Found 22 papers, 19 papers with code

ControlVideo: Conditional Control for One-shot Text-driven Video Editing and Beyond

1 code implementation 26 May 2023 Min Zhao, Rongzhen Wang, Fan Bao, Chongxuan Li, Jun Zhu

This paper presents \emph{ControlVideo} for text-driven video editing -- generating a video that aligns with a given text while preserving the structure of the source video.

Text-to-Video Editing Video Editing

ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation

2 code implementations NeurIPS 2023 Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, Jun Zhu

In comparison, VSD works well with various CFG weights as ancestral sampling from diffusion models and simultaneously improves the diversity and sample quality with a common CFG weight (i.e., $7.5$).
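The snippet above contrasts VSD with classifier-free guidance (CFG) at weight $7.5$. For context, here is a minimal sketch of how such a CFG weight combines conditional and unconditional noise predictions; the function name and tensor shapes are illustrative, not from the paper.

```python
import numpy as np

def cfg_noise_prediction(eps_cond, eps_uncond, guidance_weight=7.5):
    """Mix conditional and unconditional noise predictions.

    One common CFG form: eps = eps_uncond + w * (eps_cond - eps_uncond),
    with w = 7.5 being the "common CFG weight" mentioned above.
    """
    return eps_uncond + guidance_weight * (eps_cond - eps_uncond)

# Toy usage with random stand-ins for the two network outputs.
eps_c = np.random.randn(2, 64, 64, 3)
eps_u = np.random.randn(2, 64, 64, 3)
guided = cfg_noise_prediction(eps_c, eps_u, guidance_weight=7.5)
```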

3D Generation Text to 3D

A Closer Look at Parameter-Efficient Tuning in Diffusion Models

1 code implementation 31 Mar 2023 Chendong Xiang, Fan Bao, Chongxuan Li, Hang Su, Jun Zhu

Large-scale diffusion models like Stable Diffusion are powerful and find various real-world applications while customizing such models by fine-tuning is both memory and time inefficient.

Efficient Diffusion Personalization Position

One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale

3 code implementations 12 Mar 2023 Fan Bao, Shen Nie, Kaiwen Xue, Chongxuan Li, Shi Pu, Yaole Wang, Gang Yue, Yue Cao, Hang Su, Jun Zhu

Inspired by the unified view, UniDiffuser learns all distributions simultaneously with a minimal modification to the original diffusion model -- perturbs data in all modalities instead of a single modality, inputs individual timesteps in different modalities, and predicts the noise of all modalities instead of a single modality.
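To make the sentence above concrete, here is a hedged toy sketch of the described training step: each modality is perturbed with its own independent timestep and a single model predicts the noise of both. The schedule, shapes, and function names are placeholders, not the paper's actual implementation.

```python
import numpy as np

def perturb(x, t, T=1000):
    """Toy forward perturbation at timestep t with a linear schedule
    (the paper's actual noise schedule may differ)."""
    alpha_bar = 1.0 - t / T
    noise = np.random.randn(*x.shape)
    return np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * noise, noise

def joint_training_step(image, text_emb, joint_noise_model, T=1000):
    # Independent timesteps for each modality, as described above.
    t_img, t_txt = np.random.randint(0, T, size=2)
    noisy_img, eps_img = perturb(image, t_img, T)
    noisy_txt, eps_txt = perturb(text_emb, t_txt, T)
    # One model predicts the noise of all modalities at once.
    pred_img, pred_txt = joint_noise_model(noisy_img, t_img, noisy_txt, t_txt)
    return np.mean((pred_img - eps_img) ** 2) + np.mean((pred_txt - eps_txt) ** 2)

# Toy usage: a dummy "model" returning zeros of the right shapes.
dummy = lambda xi, ti, xt, tt: (np.zeros_like(xi), np.zeros_like(xt))
loss = joint_training_step(np.random.randn(8, 32), np.random.randn(8, 16), dummy)
```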

Text-to-Image Generation

Diffusion Models and Semi-Supervised Learners Benefit Mutually with Few Labels

2 code implementations NeurIPS 2023 Zebin You, Yong Zhong, Fan Bao, Jiacheng Sun, Chongxuan Li, Jun Zhu

In an effort to further advance semi-supervised generative and classification tasks, we propose a simple yet effective training strategy called dual pseudo training (DPT), built upon strong semi-supervised learners and diffusion models.
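The snippet names the strategy but not its stages. Below is a rough, hedged sketch of dual pseudo training as commonly summarized: train a semi-supervised classifier on the few labels, pseudo-label the remaining data, train a conditional diffusion model on the pseudo-labeled data, then augment the classifier with generated samples. All function names are hypothetical placeholders standing in for full training pipelines.

```python
import numpy as np

def train_ssl_classifier(x_labeled, y_labeled, x_unlabeled):
    """Stage 1 (placeholder): fit a semi-supervised classifier on few labels."""
    return lambda x: np.zeros(len(x), dtype=int)  # dummy predictor

def train_conditional_diffusion(x, labels):
    """Stage 2 (placeholder): fit a conditional diffusion model on pseudo-labeled data."""
    return lambda y, n: np.random.randn(n, x.shape[1])  # dummy class-conditional sampler

def dual_pseudo_training(x_l, y_l, x_u, n_classes=10, n_synth_per_class=4):
    clf = train_ssl_classifier(x_l, y_l, x_u)
    pseudo_y = clf(x_u)  # pseudo-label the unlabeled data
    gen = train_conditional_diffusion(np.concatenate([x_l, x_u]),
                                      np.concatenate([y_l, pseudo_y]))
    # Stage 3: generate pseudo samples and re-train the classifier on the union.
    x_synth = np.concatenate([gen(c, n_synth_per_class) for c in range(n_classes)])
    y_synth = np.repeat(np.arange(n_classes), n_synth_per_class)
    return train_ssl_classifier(np.concatenate([x_l, x_synth]),
                                np.concatenate([y_l, y_synth]), x_u)

clf = dual_pseudo_training(np.random.randn(20, 8), np.random.randint(0, 10, 20),
                           np.random.randn(100, 8))
```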

Classification

Revisiting Discriminative vs. Generative Classifiers: Theory and Implications

1 code implementation 5 Feb 2023 Chenyu Zheng, Guoqiang Wu, Fan Bao, Yue Cao, Chongxuan Li, Jun Zhu

Theoretically, the paper considers the surrogate loss instead of the zero-one loss in analyses and generalizes the classical results from binary cases to multiclass ones.

Few-Shot Learning Image Classification +1

Why Are Conditional Generative Models Better Than Unconditional Ones?

no code implementations 1 Dec 2022 Fan Bao, Chongxuan Li, Jiacheng Sun, Jun Zhu

Extensive empirical evidence demonstrates that conditional generative models are easier to train and perform better than unconditional ones by exploiting the labels of data.

DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models

1 code implementation 2 Nov 2022 Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, Jun Zhu

The commonly-used fast sampler for guided sampling is DDIM, a first-order diffusion ODE solver that generally needs 100 to 250 steps for high-quality samples.
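For context on the baseline mentioned above, here is a minimal sketch of one deterministic DDIM update, i.e. a first-order step of the diffusion ODE; variable names and the toy usage are illustrative.

```python
import numpy as np

def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic (eta = 0) DDIM update: estimate x0 from the predicted
    noise, then move to the previous noise level. Repeating this roughly
    100-250 times is the regime the snippet above refers to."""
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps_pred

# Toy usage with random stand-ins for the state and the predicted noise.
x_prev = ddim_step(np.random.randn(1, 32, 32, 3), np.random.randn(1, 32, 32, 3),
                   alpha_bar_t=0.5, alpha_bar_prev=0.6)
```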

Text-to-Image Generation

Equivariant Energy-Guided SDE for Inverse Molecular Design

2 code implementations 30 Sep 2022 Fan Bao, Min Zhao, Zhongkai Hao, Peiyao Li, Chongxuan Li, Jun Zhu

Inverse molecular design is critical in material science and drug discovery, where the generated molecules should satisfy certain desirable properties.

3D Molecule Generation Drug Discovery

All are Worth Words: A ViT Backbone for Diffusion Models

3 code implementations CVPR 2023 Fan Bao, Shen Nie, Kaiwen Xue, Yue Cao, Chongxuan Li, Hang Su, Jun Zhu

We evaluate U-ViT in unconditional and class-conditional image generation, as well as text-to-image generation tasks, where U-ViT is comparable if not superior to a CNN-based U-Net of a similar size.

Conditional Image Generation Text-to-Image Generation

Deep Generative Modeling on Limited Data with Regularization by Nontransferable Pre-trained Models

1 code implementation 30 Aug 2022 Yong Zhong, Hongtao Liu, Xiaodong Liu, Fan Bao, Weiran Shen, Chongxuan Li

Deep generative models (DGMs) are data-eager because learning a complex model on limited data suffers from a large variance and easily overfits.

EGSDE: Unpaired Image-to-Image Translation via Energy-Guided Stochastic Differential Equations

1 code implementation 14 Jul 2022 Min Zhao, Fan Bao, Chongxuan Li, Jun Zhu

Further, we provide an alternative explanation of the EGSDE as a product of experts, where each of the three experts (corresponding to the SDE and two feature extractors) solely contributes to faithfulness or realism.
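A generic product-of-experts reading of the sentence above, with the SDE expert and two feature-based energy experts (one for realism, one for faithfulness); the notation and the $\lambda$ weights are illustrative, not taken from the paper.

```latex
p(\mathbf{x} \mid \mathbf{y}) \;\propto\;
p_{\mathrm{SDE}}(\mathbf{x} \mid \mathbf{y})\,
\exp\!\big(-\lambda_{r}\, E_{r}(\mathbf{x}, \mathbf{y})\big)\,
\exp\!\big(-\lambda_{f}\, E_{f}(\mathbf{x}, \mathbf{y})\big)
```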

Image-to-Image Translation Translation

Maximum Likelihood Training for Score-Based Diffusion ODEs by High-Order Denoising Score Matching

1 code implementation 16 Jun 2022 Cheng Lu, Kaiwen Zheng, Fan Bao, Jianfei Chen, Chongxuan Li, Jun Zhu

To fill up this gap, we show that the negative likelihood of the ODE can be bounded by controlling the first, second, and third-order score matching errors; and we further present a novel high-order denoising score matching method to enable maximum likelihood training of score-based diffusion ODEs.
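For reference, the standard first-order denoising score matching objective that the higher-order errors mentioned above generalize; the notation is generic, not necessarily the paper's.

```latex
\mathcal{L}_{1}(\theta) \;=\;
\mathbb{E}_{q(\mathbf{x}_0)\, q_t(\mathbf{x}_t \mid \mathbf{x}_0)}
\big\| s_\theta(\mathbf{x}_t, t) - \nabla_{\mathbf{x}_t} \log q_t(\mathbf{x}_t \mid \mathbf{x}_0) \big\|_2^2
```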

Denoising

Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models

1 code implementation 15 Jun 2022 Fan Bao, Chongxuan Li, Jiacheng Sun, Jun Zhu, Bo Zhang

Thus, the generation performance on a subset of timesteps is crucial, which is greatly influenced by the covariance design in DPMs.

Computational Efficiency

Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models

2 code implementations ICLR 2022 Fan Bao, Chongxuan Li, Jun Zhu, Bo Zhang

In this work, we present a surprising result that both the optimal reverse variance and the corresponding optimal KL divergence of a DPM have analytic forms w.r.t. its score function.

Stability and Generalization of Bilevel Programming in Hyperparameter Optimization

1 code implementation NeurIPS 2021 Fan Bao, Guoqiang Wu, Chongxuan Li, Jun Zhu, Bo Zhang

Our results can explain some mysterious behaviours of the bilevel programming in practice, for instance, overfitting to the validation set.
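For readers unfamiliar with the setup, this is the generic bilevel form of hyperparameter optimization referred to above: the outer problem tunes hyperparameters $\lambda$ on validation data (hence the possibility of overfitting to the validation set), subject to inner optimality on the training set. Notation is illustrative.

```latex
\min_{\lambda}\; \mathcal{L}_{\mathrm{val}}\big(\theta^{*}(\lambda)\big)
\quad \text{s.t.} \quad
\theta^{*}(\lambda) \in \arg\min_{\theta}\; \mathcal{L}_{\mathrm{train}}(\theta, \lambda)
```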

Hyperparameter Optimization

Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models

1 code implementation NeurIPS Workshop ICBINB 2020 Fan Bao, Kun Xu, Chongxuan Li, Lanqing Hong, Jun Zhu, Bo Zhang

The learning and evaluation of energy-based latent variable models (EBLVMs) without any structural assumptions are highly challenging, because the true posteriors and the partition functions in such models are generally intractable.
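In standard notation (mine, not necessarily the paper's), an EBLVM over visible variables $\mathbf{v}$ and latent variables $\mathbf{h}$, with the partition function and posterior that the snippet calls intractable:

```latex
p_\theta(\mathbf{v}, \mathbf{h}) = \frac{\exp\!\big(-E_\theta(\mathbf{v}, \mathbf{h})\big)}{Z_\theta},
\qquad
Z_\theta = \int \exp\!\big(-E_\theta(\mathbf{v}, \mathbf{h})\big)\,\mathrm{d}\mathbf{v}\,\mathrm{d}\mathbf{h},
\qquad
p_\theta(\mathbf{h} \mid \mathbf{v}) = \frac{\exp\!\big(-E_\theta(\mathbf{v}, \mathbf{h})\big)}{\int \exp\!\big(-E_\theta(\mathbf{v}, \mathbf{h})\big)\,\mathrm{d}\mathbf{h}}
```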

Bi-level Score Matching for Learning Energy-based Latent Variable Models

1 code implementation NeurIPS 2020 Fan Bao, Chongxuan Li, Kun Xu, Hang Su, Jun Zhu, Bo Zhang

This paper presents a bi-level score matching (BiSM) method to learn EBLVMs with general structures by reformulating SM as a bi-level optimization problem.

Rolling Shutter Correction Stochastic Optimization

Boosting Generative Models by Leveraging Cascaded Meta-Models

1 code implementation 11 May 2019 Fan Bao, Hang Su, Jun Zhu

Besides, our framework can be extended to semi-supervised boosting, where the boosted model learns a joint distribution of data and labels.

Towards Interpretable Deep Neural Networks by Leveraging Adversarial Examples

no code implementations 25 Jan 2019 Yinpeng Dong, Fan Bao, Hang Su, Jun Zhu

3) We propose to improve the consistency of neurons on the adversarial example subset by an adversarial training algorithm with a consistent loss.

Towards Interpretable Deep Neural Networks by Leveraging Adversarial Examples

no code implementations 18 Aug 2017 Yinpeng Dong, Hang Su, Jun Zhu, Fan Bao

We find that: (1) the neurons in DNNs do not truly detect semantic objects/parts, but respond to objects/parts only as recurrent discriminative patches; (2) deep visual representations are not robust distributed codes of visual concepts because the representations of adversarial images are largely not consistent with those of real images, although they have similar visual appearance, both of which are different from previous findings.
