Search Results for author: Deli Zhao

Found 42 papers, 16 papers with code

VideoComposer: Compositional Video Synthesis with Motion Controllability

no code implementations 3 Jun 2023 Xiang Wang, Hangjie Yuan, Shiwei Zhang, Dayou Chen, Jiuniu Wang, Yingya Zhang, Yujun Shen, Deli Zhao, Jingren Zhou

The pursuit of controllability as a higher standard of visual content creation has yielded remarkable progress in customizable image synthesis.

Image Generation

Cones 2: Customizable Image Synthesis with Multiple Subjects

no code implementations 30 May 2023 Zhiheng Liu, Yifei Zhang, Yujun Shen, Kecheng Zheng, Kai Zhu, Ruili Feng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao

Synthesizing images with user-specified subjects has received growing attention due to its practical applications.

Image Generation

MoLo: Motion-augmented Long-short Contrastive Learning for Few-shot Action Recognition

1 code implementation CVPR 2023 Xiang Wang, Shiwei Zhang, Zhiwu Qing, Changxin Gao, Yingya Zhang, Deli Zhao, Nong Sang

To address these issues, we develop a Motion-augmented Long-short Contrastive Learning (MoLo) method that contains two crucial components, including a long-short contrastive objective and a motion autodecoder.

Contrastive Learning Few-Shot action recognition +1

VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation

1 code implementation CVPR 2023 Zhengxiong Luo, Dayou Chen, Yingya Zhang, Yan Huang, Liang Wang, Yujun Shen, Deli Zhao, Jingren Zhou, Tieniu Tan

A diffusion probabilistic model (DPM), which constructs a forward diffusion process by gradually adding noise to data points and learns the reverse denoising process to generate new samples, has been shown to handle complex data distribution.

Denoising Image Generation +2

Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos

1 code implementation 15 Mar 2023 Yulin Pan, Xiangteng He, Biao Gong, Yiliang Lv, Yujun Shen, Yuxin Peng, Deli Zhao

Video temporal grounding aims to pinpoint a video segment that matches the query description.

ViM: Vision Middleware for Unified Downstream Transferring

no code implementations 13 Mar 2023 Yutong Feng, Biao Gong, Jianwen Jiang, Yiliang Lv, Yujun Shen, Deli Zhao, Jingren Zhou

ViM consists of a zoo of lightweight plug-in modules, each of which is independently learned on a midstream dataset with a shared frozen backbone.

Cones: Concept Neurons in Diffusion Models for Customized Generation

2 code implementations 9 Mar 2023 Zhiheng Liu, Ruili Feng, Kai Zhu, Yifei Zhang, Kecheng Zheng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao

Concatenating multiple clusters of concept neurons can vividly generate all related concepts in a single image.

CLIP-guided Prototype Modulating for Few-shot Action Recognition

1 code implementation 6 Mar 2023 Xiang Wang, Shiwei Zhang, Jun Cen, Changxin Gao, Yingya Zhang, Deli Zhao, Nong Sang

Learning from large-scale contrastive language-image pre-training like CLIP has shown remarkable success in a wide range of downstream tasks recently, but it is still under-explored on the challenging few-shot action recognition (FSAR) task.

Few-Shot action recognition Few Shot Action Recognition

Rethinking Efficient Tuning Methods from a Unified Perspective

no code implementations 1 Mar 2023 Zeyinzi Jiang, Chaojie Mao, Ziyuan Huang, Yiliang Lv, Deli Zhao, Jingren Zhou

The U-Tuning framework can simultaneously encompass existing methods and derive new approaches for parameter-efficient transfer learning, which prove to achieve on-par or better performances on CIFAR-100 and FGVC datasets when compared with existing PETL methods.

Transfer Learning

Composer: Creative and Controllable Image Synthesis with Composable Conditions

2 code implementations 20 Feb 2023 Lianghua Huang, Di Chen, Yu Liu, Yujun Shen, Deli Zhao, Jingren Zhou

Recent large-scale generative models learned on big data are capable of synthesizing incredible images yet suffer from limited controllability.

Image Colorization Image-to-Image Translation +3

UKnow: A Unified Knowledge Protocol for Common-Sense Reasoning and Vision-Language Pre-training

1 code implementation 14 Feb 2023 Biao Gong, Xiaoying Xie, Yutong Feng, Yiliang Lv, Yujun Shen, Deli Zhao

This work presents a unified knowledge protocol, called UKnow, which facilitates knowledge-based studies from the perspective of data.

Common Sense Reasoning

The Devil is in the Wrongly-classified Samples: Towards Unified Open-set Recognition

1 code implementation 8 Feb 2023 Jun Cen, Di Luan, Shiwei Zhang, Yixuan Pei, Yingya Zhang, Deli Zhao, Shaojie Shen, Qifeng Chen

Recently, Unified Open-set Recognition (UOSR) has been proposed to reject not only unknown samples but also known but wrongly classified samples, which tends to be more practical in real-world applications.

Open Set Learning

LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis

no code implementations 11 Jan 2023 Jiapeng Zhu, Ceyuan Yang, Yujun Shen, Zifan Shi, Deli Zhao, Qifeng Chen

(1) Any image region can be linked to the latent space, even if the region is pre-selected before training and fixed for all instances.

Image Generation

RA-CLIP: Retrieval Augmented Contrastive Language-Image Pre-Training

no code implementations CVPR 2023 Chen-Wei Xie, Siyang Sun, Xiong Xiong, Yun Zheng, Deli Zhao, Jingren Zhou

This process can be considered as an open-book exam: with the reference set as a cheat sheet, the proposed method doesn't need to memorize all visual concepts in the training data.

Classification Image Classification +5

Dimensionality-Varying Diffusion Process

no code implementations CVPR 2023 Han Zhang, Ruili Feng, Zhantao Yang, Lianghua Huang, Yu Liu, Yifei Zhang, Yujun Shen, Deli Zhao, Jingren Zhou, Fan Cheng

Diffusion models, which learn to reverse a signal destruction process to generate new data, typically require the signal at each step to have the same dimension.

Image Generation

Neural Dependencies Emerging from Learning Massive Categories

no code implementations CVPR 2023 Ruili Feng, Kecheng Zheng, Kai Zhu, Yujun Shen, Jian Zhao, Yukun Huang, Deli Zhao, Jingren Zhou, Michael Jordan, Zheng-Jun Zha

Through investigating the properties of the problem solution, we confirm that neural dependency is guaranteed by a redundant logit covariance matrix, which condition is easily met given massive categories, and that neural dependency is highly sparse, implying that one category correlates to only a few others.

Image Classification

Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator

no code implementations 30 Sep 2022 Zifan Shi, Yinghao Xu, Yujun Shen, Deli Zhao, Qifeng Chen, Dit-yan Yeung

We argue that, considering the two-player game in the formulation of GANs, only making the generator 3D-aware is not enough.

3D-Aware Image Synthesis Novel View Synthesis

Improving GANs with A Dynamic Discriminator

no code implementations 20 Sep 2022 Ceyuan Yang, Yujun Shen, Yinghao Xu, Deli Zhao, Bo Dai, Bolei Zhou

Two capacity adjusting schemes are developed for training GANs under different data regimes: i) given a sufficient amount of training data, the discriminator benefits from a progressively increased learning capacity, and ii) when the training data is limited, gradually decreasing the layer width mitigates the over-fitting issue of the discriminator.

3D-Aware Image Synthesis Data Augmentation

Rank Diminishing in Deep Neural Networks

no code implementations 13 Jun 2022 Ruili Feng, Kecheng Zheng, Yukun Huang, Deli Zhao, Michael Jordan, Zheng-Jun Zha

By virtue of our numerical tools, we provide the first empirical analysis of the per-layer behavior of network rank in practical settings, i.e., ResNets, deep MLPs, and Transformers on ImageNet.

Principled Knowledge Extrapolation with GANs

no code implementations 21 May 2022 Ruili Feng, Jie Xiao, Kecheng Zheng, Deli Zhao, Jingren Zhou, Qibin Sun, Zheng-Jun Zha

Humans can extrapolate well, generalize daily knowledge to unseen scenarios, and raise and answer counterfactual questions.

Region-Based Semantic Factorization in GANs

1 code implementation 19 Feb 2022 Jiapeng Zhu, Yujun Shen, Yinghao Xu, Deli Zhao, Qifeng Chen

Despite the rapid advancement of semantic discovery in the latent space of Generative Adversarial Networks (GANs), existing approaches either are limited to finding global attributes or rely on a number of segmentation masks to identify local attributes.

Low-Rank Subspaces in GANs

1 code implementation NeurIPS 2021 Jiapeng Zhu, Ruili Feng, Yujun Shen, Deli Zhao, ZhengJun Zha, Jingren Zhou, Qifeng Chen

Concretely, given an arbitrary image and a region of interest (e.g., eyes of face images), we manage to relate the latent space to the image region with the Jacobian matrix and then use low-rank factorization to discover steerable latent subspaces.

On Noise Injection in Generative Adversarial Networks

2 code implementations 10 Jun 2020 Ruili Feng, Deli Zhao, ZhengJun Zha

Noise injection has proven to be one of the key technical advances in generating high-fidelity images.

Image Generation

In-Domain GAN Inversion for Real Image Editing

2 code implementations ECCV 2020 Jiapeng Zhu, Yujun Shen, Deli Zhao, Bolei Zhou

A common practice of feeding a real image to a trained GAN generator is to invert it back to a latent code.

Image Reconstruction

Perceptual Image Super-Resolution with Progressive Adversarial Network

no code implementations 8 Mar 2020 Lone Wong, Deli Zhao, Shaohua Wan, Bo Zhang

Progressive growing enhances image resolution gradually, thereby preserving the precision of the recovered image.

Image Super-Resolution

Latent Variables on Spheres for Autoencoders in High Dimensions

no code implementations 21 Dec 2019 Deli Zhao, Jiapeng Zhu, Bo Zhang

Variational Auto-Encoder (VAE) has been widely applied as a fundamental generative model in machine learning.

Vocal Bursts Intensity Prediction

LIA: Latently Invertible Autoencoder with Adversarial Learning

no code implementations 25 Sep 2019 Jiapeng Zhu, Deli Zhao, Bolei Zhou, Bo Zhang

A two-stage stochasticity-free training scheme is designed to train LIA via adversarial learning, in the sense that the decoder of LIA is first trained as a standard GAN with the invertible network and then the partial encoder is learned from an autoencoder by detaching the invertible network from LIA.

Variational Inference

Latent Variables on Spheres for Sampling and Inference

no code implementations 25 Sep 2019 Deli Zhao, Jiapeng Zhu, Bo Zhang

Variational inference is a fundamental problem in Variational AutoEncoder (VAE).

Variational Inference

Curriculum Learning for Deep Generative Models with Clustering

no code implementations 27 Jun 2019 Deli Zhao, Jiapeng Zhu, Zhenfang Guo, Bo Zhang

The experiments on cat and human-face data validate that our algorithm is able to learn the optimal generative models (e.g., ProGAN) with respect to specified quality metrics for noisy data.

Clustering

Disentangled Inference for GANs with Latently Invertible Autoencoder

3 code implementations 19 Jun 2019 Jiapeng Zhu, Deli Zhao, Bo Zhang, Bolei Zhou

In this paper, we show that the entanglement of the latent space for the VAE/GAN framework poses the main challenge for encoder learning.

DeepExposure: Learning to Expose Photos with Asynchronously Reinforced Adversarial Learning

no code implementations NeurIPS 2018 Runsheng Yu, Wenyu Liu, Yasen Zhang, Zhi Qu, Deli Zhao, Bo Zhang

Based on these sub-images, a local exposure for each sub-image is learned automatically and sequentially by a policy network, while the learning reward is designed globally to balance the overall exposures.

Few Shot Learning with Simplex

no code implementations 27 Jul 2018 Bowen Zhang, Xifan Zhang, Fan Cheng, Deli Zhao

During testing, a new simplex is formed from the test sample combined with the points in the class.

Few-Shot Learning

Sparse Coding and Dictionary Learning With Linear Dynamical Systems

no code implementations CVPR 2016 Wenbing Huang, Fuchun Sun, Lele Cao, Deli Zhao, Huaping Liu, Mehrtash Harandi

To enhance the performance of LDSs, in this paper, we address the challenging issue of performing sparse coding on the space of LDSs, where both data and dictionary atoms are LDSs.

Dictionary Learning Video Classification

Zeta Hull Pursuits: Learning Nonconvex Data Hulls

no code implementations NeurIPS 2014 Yuanjun Xiong, Wei Liu, Deli Zhao, Xiaoou Tang

Selecting a small informative subset from a given dataset, also called column sampling, has drawn much attention in machine learning.

Image Classification

Errata: Distant Supervision for Relation Extraction with Matrix Completion

no code implementations 17 Nov 2014 Miao Fan, Deli Zhao, Qiang Zhou, Zhiyuan Liu, Thomas Fang Zheng, Edward Y. Chang

The essence of distantly supervised relation extraction is that it is an incomplete multi-label classification problem with sparse and noisy features.

Classification General Classification +3

Homophilic Clustering by Locally Asymmetric Geometry

no code implementations 5 Jul 2014 Deli Zhao, Xiaoou Tang

Clustering is indispensable for data analysis in many scientific disciplines.

Clustering

Graph Degree Linkage: Agglomerative Clustering on a Directed Graph

2 code implementations 25 Aug 2012 Wei Zhang, Xiaogang Wang, Deli Zhao, Xiaoou Tang

We explore the different roles of two fundamental concepts in graph theory, indegree and outdegree, in the context of clustering.

 Ranked #1 on Image Clustering on Coil-20 (Accuracy metric)

Clustering Image Clustering

Cyclizing Clusters via Zeta Function of a Graph

no code implementations NeurIPS 2008 Deli Zhao, Xiaoou Tang

A mathematical tool, Zeta function of a graph, is introduced for the integration of all cycles, leading to a structural descriptor of the cluster in determinantal form.

Clustering
