Search Results for author: Chen Change Loy

Found 263 papers, 177 papers with code

Efficient Diffusion Model for Image Restoration by Residual Shifting

1 code implementation 12 Mar 2024 Zongsheng Yue, Jianyi Wang, Chen Change Loy

While diffusion-based image restoration (IR) methods have achieved remarkable success, they are still limited by the low inference speed attributed to the necessity of executing hundreds or even thousands of sampling steps.

Blind Face Restoration Image Inpainting +2

Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition

1 code implementation 3 Mar 2024 Kun-Yu Lin, Henghui Ding, Jiaming Zhou, Yi-Xing Peng, Zhilin Zhao, Chen Change Loy, Wei-Shi Zheng

To answer this, we establish a CROSS-domain Open-Vocabulary Action recognition benchmark named XOV-Action, and conduct a comprehensive evaluation of five state-of-the-art CLIP-based video learners under various types of domain gaps.

Open Vocabulary Action Recognition

Control Color: Multimodal Diffusion-based Interactive Image Colorization

no code implementations 16 Feb 2024 Zhexin Liang, Zhaochen Li, Shangchen Zhou, Chongyi Li, Chen Change Loy

We also introduce a novel module based on self-attention and a content-guided deformable autoencoder to address the long-standing issues of color overflow and inaccurate coloring.

Colorization Color Manipulation +1

OMG-Seg: Is One Model Good Enough For All Segmentation?

1 code implementation 18 Jan 2024 Xiangtai Li, Haobo Yuan, Wei Li, Henghui Ding, Size Wu, Wenwei Zhang, Yining Li, Kai Chen, Chen Change Loy

In this work, we address various segmentation tasks, each traditionally tackled by distinct or partially unified models.

Interactive Segmentation Panoptic Segmentation +3

Towards Language-Driven Video Inpainting via Multimodal Large Language Models

no code implementations 18 Jan 2024 Jianzong Wu, Xiangtai Li, Chenyang Si, Shangchen Zhou, Jingkang Yang, Jiangning Zhang, Yining Li, Kai Chen, Yunhai Tong, Ziwei Liu, Chen Change Loy

We introduce a new task -- language-driven video inpainting, which uses natural language instructions to guide the inpainting process.

Video Inpainting

CLIM: Contrastive Language-Image Mosaic for Region Representation

1 code implementation 18 Dec 2023 Size Wu, Wenwei Zhang, Lumin Xu, Sheng Jin, Wentao Liu, Chen Change Loy

Our experimental results demonstrate that CLIM improves different baseline open-vocabulary object detectors by a large margin on both OV-COCO and OV-LVIS benchmarks.

Object object-detection +1

Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution

no code implementations 11 Dec 2023 Shangchen Zhou, Peiqing Yang, Jianyi Wang, Yihang Luo, Chen Change Loy

Text-based diffusion models have exhibited remarkable success in generation and editing, showing great promise for enhancing visual content with their generative prior.

Video Super-Resolution

EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM

1 code implementation 11 Dec 2023 Chong Zhou, Xiangtai Li, Chen Change Loy, Bo Dai

It is also the first SAM variant that can run at over 30 FPS on an iPhone 14.

Digital Life Project: Autonomous 3D Characters with Social Intelligence

no code implementations 7 Dec 2023 Zhongang Cai, Jianping Jiang, Zhongfei Qing, Xinying Guo, Mingyuan Zhang, Zhengyu Lin, Haiyi Mei, Chen Wei, Ruisi Wang, Wanqi Yin, Xiangyu Fan, Han Du, Liang Pan, Peng Gao, Zhitao Yang, Yang Gao, Jiaqi Li, Tianxiang Ren, Yukun Wei, Xiaogang Wang, Chen Change Loy, Lei Yang, Ziwei Liu

In this work, we present Digital Life Project, a framework utilizing language as the universal medium to build autonomous 3D characters, who are capable of engaging in social interactions and expressing with articulated body motions, thereby simulating life in a digital environment.

Motion Captioning Motion Synthesis

Gaussian3Diff: 3D Gaussian Diffusion for 3D Full Head Synthesis and Editing

no code implementations 5 Dec 2023 Yushi Lan, Feitong Tan, Di Qiu, Qiangeng Xu, Kyle Genova, Zeng Huang, Sean Fanello, Rohit Pandey, Thomas Funkhouser, Chen Change Loy, Yinda Zhang

We present a novel framework for generating photorealistic 3D human heads and subsequently manipulating and reposing them with remarkable flexibility.

Face Model

VideoBooth: Diffusion-based Video Generation with Image Prompts

no code implementations 1 Dec 2023 Yuming Jiang, Tianxing Wu, Shuai Yang, Chenyang Si, Dahua Lin, Yu Qiao, Chen Change Loy, Ziwei Liu

In this paper, we study the task of video generation with image prompts, which provide more accurate and direct content control beyond the text prompts.

Video Generation

When StyleGAN Meets Stable Diffusion: a $\mathscr{W}_+$ Adapter for Personalized Image Generation

1 code implementation 29 Nov 2023 Xiaoming Li, Xinyu Hou, Chen Change Loy

Text-to-image diffusion models have remarkably excelled in producing diverse, high-quality, and photo-realistic images.

Attribute Disentanglement +1

Panoptic Video Scene Graph Generation

3 code implementations CVPR 2023 Jingkang Yang, Wenxuan Peng, Xiangtai Li, Zujin Guo, Liangyu Chen, Bo Li, Zheng Ma, Kaiyang Zhou, Wayne Zhang, Chen Change Loy, Ziwei Liu

PVSG relates to the existing video scene graph generation (VidSGG) problem, which focuses on temporal interactions between humans and objects grounded with bounding boxes in videos.

Graph Generation Panoptic Scene Graph Generation +5

PERF: Panoramic Neural Radiance Field from a Single Panorama

1 code implementation 25 Oct 2023 Guangcong Wang, Peng Wang, Zhaoxi Chen, Wenping Wang, Chen Change Loy, Ziwei Liu

In this paper, we present PERF, a 360-degree novel view synthesis framework that trains a panoramic neural radiance field from a single panorama.

Novel View Synthesis Text to 3D

PaintHuman: Towards High-fidelity Text-to-3D Human Texturing via Denoised Score Distillation

no code implementations 14 Oct 2023 Jianhui Yu, Hao Zhu, Liming Jiang, Chen Change Loy, Weidong Cai, Wayne Wu

We first propose a novel score function, Denoised Score Distillation (DSD), which directly modifies the SDS by introducing negative gradient components to iteratively correct the gradient direction and generate high-quality textures.

Text to 3D text-to-3d-human +1

CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

1 code implementation 2 Oct 2023 Size Wu, Wenwei Zhang, Lumin Xu, Sheng Jin, Xiangtai Li, Wentao Liu, Chen Change Loy

However, when transferring the vision-language alignment of CLIP from global image representation to local region representation for the open-vocabulary dense prediction tasks, CLIP ViTs suffer from the domain shift from full images to local image regions.

Image Classification Image Segmentation +7

DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection

1 code implementation 2 Oct 2023 Shilin Xu, Xiangtai Li, Size Wu, Wenwei Zhang, Yining Li, Guangliang Cheng, Yunhai Tong, Kai Chen, Chen Change Loy

This work presents a simple yet effective strategy that leverages the zero-shot classification ability of pre-trained vision-language models (VLM), such as CLIP, to directly discover proposals of possible novel classes.

Novel Object Detection Object +5

Deep Geometrized Cartoon Line Inbetweening

1 code implementation ICCV 2023 Li SiYao, Tianpei Gu, Weiye Xiao, Henghui Ding, Ziwei Liu, Chen Change Loy

To preserve the precision and detail of the line drawings, we propose a new approach, AnimeInbet, which geometrizes raster line drawings into graphs of endpoints and reframes the inbetweening task as a graph fusion problem with vertex repositioning.

LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models

2 code implementations 26 Sep 2023 Yaohui Wang, Xinyuan Chen, Xin Ma, Shangchen Zhou, Ziqi Huang, Yi Wang, Ceyuan Yang, Yinan He, Jiashuo Yu, Peiqing Yang, Yuwei Guo, Tianxing Wu, Chenyang Si, Yuming Jiang, Cunjian Chen, Chen Change Loy, Bo Dai, Dahua Lin, Yu Qiao, Ziwei Liu

To this end, we propose LaVie, an integrated video generation framework that operates on cascaded video latent diffusion models, comprising a base T2V model, a temporal interpolation model, and a video super-resolution model.

Text-to-Video Generation Video Generation +1

MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation

1 code implementation 22 Sep 2023 Jiahao Xie, Wei Li, Xiangtai Li, Ziwei Liu, Yew Soon Ong, Chen Change Loy

We present MosaicFusion, a simple yet effective diffusion-based data augmentation approach for large vocabulary instance segmentation.

Data Augmentation Instance Segmentation +1

Interpret Vision Transformers as ConvNets with Dynamic Convolutions

no code implementations 19 Sep 2023 Chong Zhou, Chen Change Loy, Bo Dai

There has been a debate over whether vision Transformers or ConvNets are the superior backbone for computer vision models.

PGDiff: Guiding Diffusion Models for Versatile Face Restoration via Partial Guidance

1 code implementation NeurIPS 2023 Peiqing Yang, Shangchen Zhou, Qingyi Tao, Chen Change Loy

When combined with a diffusion prior, this partial guidance can deliver appealing results across a range of restoration tasks.

DeformToon3D: Deformable 3D Toonification from Neural Radiance Fields

1 code implementation 8 Sep 2023 Junzhe Zhang, Yushi Lan, Shuai Yang, Fangzhou Hong, Quan Wang, Chai Kiat Yeo, Ziwei Liu, Chen Change Loy

In this paper, we address the challenging problem of 3D toonification, which involves transferring the style of an artistic domain onto a target 3D face with stylized geometry and texture.

Audio-Driven Dubbing for User Generated Contents via Style-Aware Semi-Parametric Synthesis

no code implementations 31 Aug 2023 Linsen Song, Wayne Wu, Chaoyou Fu, Chen Change Loy, Ran He

Existing automated dubbing methods are usually designed for Professionally Generated Content (PGC) production, which requires massive training data and training time to learn a person-specific audio-video mapping.

PointHPS: Cascaded 3D Human Pose and Shape Estimation from Point Clouds

no code implementations 28 Aug 2023 Zhongang Cai, Liang Pan, Chen Wei, Wanqi Yin, Fangzhou Hong, Mingyuan Zhang, Chen Change Loy, Lei Yang, Ziwei Liu

To tackle these challenges, we propose a principled framework, PointHPS, for accurate 3D HPS from point clouds captured in real-world settings, which iteratively refines point features through a cascaded architecture.

3D human pose and shape estimation

Scenimefy: Learning to Craft Anime Scene via Semi-Supervised Image-to-Image Translation

1 code implementation ICCV 2023 Yuxin Jiang, Liming Jiang, Shuai Yang, Chen Change Loy

The challenges of this task lie in the complexity of the scenes, the unique features of anime style, and the lack of high-quality datasets to bridge the domain gap.

Image-to-Image Translation

MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions

1 code implementation ICCV 2023 Henghui Ding, Chang Liu, Shuting He, Xudong Jiang, Chen Change Loy

To investigate the feasibility of using motion expressions to ground and segment objects in videos, we propose a large-scale dataset called MeViS, which contains numerous motion expressions to indicate target objects in complex environments.

Motion Expressions Guided Video Segmentation Object +6

ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting

1 code implementation NeurIPS 2023 Zongsheng Yue, Jianyi Wang, Chen Change Loy

Diffusion-based image super-resolution (SR) methods are mainly limited by the low inference speed due to the requirements of hundreds or even thousands of sampling steps.

Image Super-Resolution

Adaptive Window Pruning for Efficient Local Motion Deblurring

no code implementations 25 Jun 2023 Haoying Li, Jixin Zhao, Shangchen Zhou, Huajun Feng, Chongyi Li, Chen Change Loy

Existing image deblurring methods predominantly focus on global deblurring, inadvertently affecting the sharpness of backgrounds in locally blurred images and wasting unnecessary computation on sharp pixels, especially for high-resolution images.

Deblurring Image Deblurring

Explore In-Context Learning for 3D Point Cloud Understanding

1 code implementation NeurIPS 2023 Zhongbin Fang, Xiangtai Li, Xia Li, Joachim M. Buhmann, Chen Change Loy, Mengyuan Liu

With the rise of large-scale models trained on broad data, in-context learning has become a new learning paradigm that has demonstrated significant potential in natural language processing and computer vision tasks.

In-Context Learning

GP-UNIT: Generative Prior for Versatile Unsupervised Image-to-Image Translation

1 code implementation 7 Jun 2023 Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy

In this paper, we introduce a novel versatile framework, Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT), that improves the quality, applicability and controllability of the existing translation models.

Translation Unsupervised Image-To-Image Translation +1

Flare7K++: Mixing Synthetic and Real Datasets for Nighttime Flare Removal and Beyond

1 code implementation 7 Jun 2023 Yuekun Dai, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Yihang Luo, Chen Change Loy

To address this issue, we additionally provide the annotations of light sources in Flare7K++ and propose a new end-to-end pipeline to preserve the light source while removing lens flares.

Flare Removal

Contextual Object Detection with Multimodal Large Language Models

1 code implementation 29 May 2023 Yuhang Zang, Wei Li, Jun Han, Kaiyang Zhou, Chen Change Loy

Moreover, we present ContextDET, a unified multimodal model that is capable of end-to-end differentiable modeling of visual-language contexts, so as to locate, identify, and associate visual objects with language inputs for human-AI interaction.

Cloze Test Image Captioning +6

Semi-Supervised and Long-Tailed Object Detection with CascadeMatch

no code implementations 24 May 2023 Yuhang Zang, Kaiyang Zhou, Chen Huang, Chen Change Loy

This paper focuses on long-tailed object detection in the semi-supervised learning setting, which poses realistic challenges, but has rarely been studied in the literature.

Long-tailed Object Detection Object +3

Towards Multi-Layered 3D Garments Animation

no code implementations ICCV 2023 Yidi Shao, Chen Change Loy, Bo Dai

In this paper, we propose a novel data-driven method, called LayersNet, to model garment-level animations as particle-wise interactions in a micro physics system.

Exploiting Diffusion Prior for Real-World Image Super-Resolution

3 code implementations 11 May 2023 Jianyi Wang, Zongsheng Yue, Shangchen Zhou, Kelvin C. K. Chan, Chen Change Loy

We present a novel approach to leverage prior knowledge encapsulated in pre-trained text-to-image diffusion models for blind super-resolution (SR).

Blind Super-Resolution Image Super-Resolution

MIPI 2023 Challenge on RGBW Remosaic: Methods and Results

no code implementations 20 Apr 2023 Qianhui Sun, Qingyu Yang, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Yuekun Dai, Wenxiu Sun, Qingpeng Zhu, Chen Change Loy, Jinwei Gu

Developing and integrating advanced image sensors with novel algorithms in camera systems are prevalent with the increasing demand for computational photography and imaging on mobile platforms.

SSIM

MIPI 2023 Challenge on RGBW Fusion: Methods and Results

no code implementations 20 Apr 2023 Qianhui Sun, Qingyu Yang, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Yuekun Dai, Wenxiu Sun, Qingpeng Zhu, Chen Change Loy, Jinwei Gu

Developing and integrating advanced image sensors with novel algorithms in camera systems are prevalent with the increasing demand for computational photography and imaging on mobile platforms.

SSIM

Transformer-Based Visual Segmentation: A Survey

2 code implementations 19 Apr 2023 Xiangtai Li, Henghui Ding, Haobo Yuan, Wenwei Zhang, Jiangmiao Pang, Guangliang Cheng, Kai Chen, Ziwei Liu, Chen Change Loy

Recently, transformers, a type of neural network based on self-attention originally designed for natural language processing, have considerably surpassed previous convolutional or recurrent approaches in various vision processing tasks.

Autonomous Driving Point Cloud Segmentation +1

Text2Performer: Text-Driven Human Video Generation

1 code implementation ICCV 2023 Yuming Jiang, Shuai Yang, Tong Liang Koh, Wayne Wu, Chen Change Loy, Ziwei Liu

In this work, we present Text2Performer to generate vivid human videos with articulated motions from texts.

Video Generation

Generating Aligned Pseudo-Supervision from Non-Aligned Data for Image Restoration in Under-Display Camera

1 code implementation CVPR 2023 Ruicheng Feng, Chongyi Li, Huaijin Chen, Shuai Li, Jinwei Gu, Chen Change Loy

Due to the difficulty in collecting large-scale and perfectly aligned paired training data for Under-Display Camera (UDC) image restoration, previous methods resort to monitor-based image systems or simulation-based methods, sacrificing the realness of the data and introducing domain gaps.

Image Restoration

Siamese DETR

1 code implementation CVPR 2023 Zeren Chen, Gengshi Huang, Wei Li, Jianing Teng, Kun Wang, Jing Shao, Chen Change Loy, Lu Sheng

In this work, we present Siamese DETR, a Siamese self-supervised pretraining approach for the Transformer architecture in DETR.

MULTI-VIEW LEARNING Representation Learning

Iterative Prompt Learning for Unsupervised Backlit Image Enhancement

no code implementations ICCV 2023 Zhexin Liang, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Chen Change Loy

To solve this issue, we devise a prompt learning framework that first learns an initial prompt pair by constraining the text-image similarity between the prompt (negative/positive sample) and the corresponding image (backlit image/well-lit image) in the CLIP latent space.

Image Enhancement Image Manipulation

SparseNeRF: Distilling Depth Ranking for Few-shot Novel View Synthesis

no code implementations ICCV 2023 Guangcong Wang, Zhaoxi Chen, Chen Change Loy, Ziwei Liu

Since coarse depth maps are not strictly scaled to the ground-truth depth maps, we propose a simple yet effective constraint, a local depth ranking method, on NeRFs such that the expected depth ranking of the NeRF is consistent with that of the coarse depth maps in local patches.

Novel View Synthesis

Learning Generative Structure Prior for Blind Text Image Super-resolution

1 code implementation CVPR 2023 Xiaoming Li, WangMeng Zuo, Chen Change Loy

To restrict the generative space of StyleGAN so that it obeys the structure of characters yet remains flexible in handling different font styles, we store the discrete features for each character in a codebook.

Image Super-Resolution

CelebV-Text: A Large-Scale Facial Text-Video Dataset

1 code implementation CVPR 2023 Jianhui Yu, Hao Zhu, Liming Jiang, Chen Change Loy, Weidong Cai, Wayne Wu

This paper presents CelebV-Text, a large-scale, diverse, and high-quality dataset of facial text-video pairs, to facilitate research on facial text-to-video generation tasks.

Text Generation Text-to-Video Generation +1

Correlational Image Modeling for Self-Supervised Visual Pre-Training

1 code implementation CVPR 2023 Wei Li, Jiahao Xie, Chen Change Loy

We introduce Correlational Image Modeling (CIM), a novel and surprisingly effective approach to self-supervised visual pre-training.

Aligning Bag of Regions for Open-Vocabulary Object Detection

1 code implementation CVPR 2023 Size Wu, Wenwei Zhang, Sheng Jin, Wentao Liu, Chen Change Loy

The embeddings of regions in a bag are treated as embeddings of words in a sentence, and they are sent to the text encoder of a VLM to obtain the bag-of-regions embedding, which is learned to be aligned to the corresponding features extracted by a frozen VLM.

Ranked #7 on Open Vocabulary Object Detection on MSCOCO (using extra training data)

Object object-detection +2

Embedding Fourier for Ultra-High-Definition Low-Light Image Enhancement

no code implementations 23 Feb 2023 Chongyi Li, Chun-Le Guo, Man Zhou, Zhexin Liang, Shangchen Zhou, Ruicheng Feng, Chen Change Loy

Our approach is motivated by a few unique characteristics in the Fourier domain: 1) most luminance information concentrates on amplitudes while noise is closely related to phases, and 2) a high-resolution image and its low-resolution version share similar amplitude patterns. Through embedding Fourier into our network, the amplitude and phase of a low-light image are separately processed to avoid amplifying noise when enhancing luminance.

Low-Light Image Enhancement Vocal Bursts Intensity Prediction

Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation

2 code implementations ICCV 2023 Jianzong Wu, Xiangtai Li, Henghui Ding, Xia Li, Guangliang Cheng, Yunhai Tong, Chen Change Loy

Experiments on the COCO dataset under two settings, Open Vocabulary Instance Segmentation (OVIS) and Open Set Panoptic Segmentation (OSPS), demonstrate the superiority of CGG.

Instance Segmentation Panoptic Segmentation +1

DeformToon3D: Deformable Neural Radiance Fields for 3D Toonification

no code implementations ICCV 2023 Junzhe Zhang, Yushi Lan, Shuai Yang, Fangzhou Hong, Quan Wang, Chai Kiat Yeo, Ziwei Liu, Chen Change Loy

In this paper, we address the challenging problem of 3D toonification, which involves transferring the style of an artistic domain onto a target 3D face with stylized geometry and texture.

Reference-based Image and Video Super-Resolution via C2-Matching

1 code implementation 19 Dec 2022 Yuming Jiang, Kelvin C. K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu

To tackle these challenges, we propose C2-Matching in this work, which performs explicit robust matching crossing transformation and resolution.

Image Super-Resolution Reference-based Super-Resolution +2

Correspondence Distillation from NeRF-based GAN

no code implementations 19 Dec 2022 Yushi Lan, Chen Change Loy, Bo Dai

The neural radiance field (NeRF) has shown promising results in preserving the fine details of objects and scenes.

Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion

no code implementations CVPR 2023 Yushi Lan, Xuyi Meng, Shuai Yang, Chen Change Loy, Bo Dai

In this paper, we study the challenging problem of 3D GAN inversion where a latent code is predicted given a single face image to faithfully recover its 3D shapes and detailed textures.

3D Face Reconstruction

DifFace: Blind Face Restoration with Diffused Error Contraction

2 code implementations 13 Dec 2022 Zongsheng Yue, Chen Change Loy

Moreover, the transition distribution can contract the error of the restoration backbone and thus makes our method more robust to unknown degradations.

Blind Face Restoration

BeautyREC: Robust, Efficient, and Content-preserving Makeup Transfer

no code implementations 12 Dec 2022 Qixin Yan, Chunle Guo, Jixin Zhao, Yuekun Dai, Chen Change Loy, Chongyi Li

The key insights of this study are modeling component-specific correspondence for local makeup transfer, capturing long-range dependencies for global makeup transfer, and enabling efficient makeup transfer via a single-path structure.

AnimeRun: 2D Animation Visual Correspondence from Open Source 3D Movies

1 code implementation 10 Nov 2022 Li SiYao, Yuhang Li, Bo Li, Chao Dong, Ziwei Liu, Chen Change Loy

Existing correspondence datasets for two-dimensional (2D) cartoon suffer from simple frame composition and monotonic movements, making them insufficient to simulate real animations.

Optical Flow Estimation

Unified Vision and Language Prompt Learning

1 code implementation 13 Oct 2022 Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy

Prompt tuning, a parameter- and data-efficient transfer learning paradigm that tunes only a small number of parameters in a model's input space, has become a trend in the vision community since the emergence of large vision-language models like CLIP.

Domain Generalization Few-Shot Learning +1

Flare7K: A Phenomenological Nighttime Flare Removal Dataset

1 code implementation 12 Oct 2022 Yuekun Dai, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Chen Change Loy

In this paper, we introduce Flare7K, the first nighttime flare removal dataset, which is generated based on the observation and statistics of real-world nighttime lens flares.

Flare Removal

Deep Fourier Up-Sampling

1 code implementation 11 Oct 2022 Man Zhou, Hu Yu, Jie Huang, Feng Zhao, Jinwei Gu, Chen Change Loy, Deyu Meng, Chongyi Li

Existing convolutional neural networks widely adopt spatial down-/up-sampling for multi-scale modeling.

Image Dehazing Image Segmentation +4

VToonify: Controllable High-Resolution Portrait Video Style Transfer

1 code implementation 22 Sep 2022 Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy

Although a series of successful portrait image toonification models built upon the powerful StyleGAN have been proposed, these image-oriented methods have obvious limitations when applied to videos, such as the fixed frame size, the requirement of face alignment, missing non-facial details and temporal inconsistency.

Face Alignment Style Transfer +2

On-Device Domain Generalization

2 code implementations 15 Sep 2022 Kaiyang Zhou, Yuanhan Zhang, Yuhang Zang, Jingkang Yang, Chen Change Loy, Ziwei Liu

Another interesting observation is that the teacher-student gap on out-of-distribution data is bigger than that on in-distribution data, which highlights the capacity mismatch issue as well as the shortcoming of KD.

Data Augmentation Domain Generalization +2

Mind the Gap in Distilling StyleGANs

1 code implementation 18 Aug 2022 Guodong Xu, Yuenan Hou, Ziwei Liu, Chen Change Loy

To further enhance the semantic consistency between the teacher and student model, we present a latent-direction-based distillation loss that preserves the semantic relations in latent space.

Knowledge Distillation

StyleLight: HDR Panorama Generation for Lighting Estimation and Editing

1 code implementation 29 Jul 2022 Guangcong Wang, Yinuo Yang, Chen Change Loy, Ziwei Liu

To tackle this problem, we propose a coupled dual-StyleGAN panorama synthesis network (StyleLight) that integrates LDR and HDR panorama synthesis into a unified framework.

Lighting Estimation

GLEAN: Generative Latent Bank for Image Super-Resolution and Beyond

1 code implementation 29 Jul 2022 Kelvin C. K. Chan, Xiangyu Xu, Xintao Wang, Jinwei Gu, Chen Change Loy

While most existing perceptual-oriented approaches attempt to generate realistic outputs through learning with adversarial loss, our method, Generative LatEnt bANk (GLEAN), goes beyond existing practices by directly leveraging rich and diverse priors encapsulated in a pre-trained GAN.

Colorization Image Colorization +2

CuDi: Curve Distillation for Efficient and Controllable Exposure Adjustment

no code implementations 28 Jul 2022 Chongyi Li, Chunle Guo, Ruicheng Feng, Shangchen Zhou, Chen Change Loy

Our method inherits the zero-reference learning and curve-based framework from an effective low-light image enhancement method, Zero-DCE, while further speeding up inference, reducing model size, and extending it to controllable exposure adjustment.

Low-Light Image Enhancement

CelebV-HQ: A Large-Scale Video Facial Attributes Dataset

1 code implementation 25 Jul 2022 Hao Zhu, Wayne Wu, Wentao Zhu, Liming Jiang, Siwei Tang, Li Zhang, Ziwei Liu, Chen Change Loy

Large-scale datasets have played indispensable roles in the recent success of face generation/editing and significantly facilitated the advances of emerging research fields.

Attribute Face Generation +1

Transformer with Implicit Edges for Particle-based Physics Simulation

1 code implementation 22 Jul 2022 Yidi Shao, Chen Change Loy, Bo Dai

Consequently, in this paper we propose a novel Transformer-based method, dubbed as Transformer with Implicit Edges (TIE), to capture the rich semantics of particle interactions in an edge-free manner.

Monocular 3D Object Reconstruction with GAN Inversion

1 code implementation 20 Jul 2022 Junzhe Zhang, Daxuan Ren, Zhongang Cai, Chai Kiat Yeo, Bo Dai, Chen Change Loy

Reconstruction is achieved by searching for a latent space in the 3D GAN that best resembles the target mesh in accordance with the single view observation.

3D Object Reconstruction Object

BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis

1 code implementation 20 Jul 2022 Davide Moltisanti, Jinyi Wu, Bo Dai, Chen Change Loy

Estimating human keypoints from these videos is difficult due to the complexity of the dance, as well as the multiple moving cameras recording setup.

Motion Synthesis Pose Estimation

Towards Robust Blind Face Restoration with Codebook Lookup Transformer

1 code implementation 22 Jun 2022 Shangchen Zhou, Kelvin C. K. Chan, Chongyi Li, Chen Change Loy

In this paper, we demonstrate that a learned discrete codebook prior in a small proxy space largely reduces the uncertainty and ambiguity of restoration mapping by casting blind face restoration as a code prediction task, while providing rich visual atoms for generating high-quality faces.

Blind Face Restoration

Masked Frequency Modeling for Self-Supervised Visual Pre-Training

3 code implementations 15 Jun 2022 Jiahao Xie, Wei Li, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy

We present Masked Frequency Modeling (MFM), a unified frequency-domain-based approach for self-supervised pre-training of visual models.

Image Classification Image Restoration +2

Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation

no code implementations CVPR 2022 Yuenan Hou, Xinge Zhu, Yuexin Ma, Chen Change Loy, Yikang Li

This article addresses the problem of distilling knowledge from a large teacher model to a slim student network for LiDAR semantic segmentation.

Ranked #7 on LIDAR Semantic Segmentation on nuScenes (val mIoU metric)

3D Semantic Segmentation Knowledge Distillation +1

Text2Human: Text-Driven Controllable Human Image Generation

2 code implementations 31 May 2022 Yuming Jiang, Shuai Yang, Haonan Qiu, Wayne Wu, Chen Change Loy, Ziwei Liu

In this work, we present a text-driven controllable framework, Text2Human, for high-quality and diverse human generation.

Human Parsing Image Generation

Delving into High-Quality Synthetic Face Occlusion Segmentation Datasets

3 code implementations 12 May 2022 Kenny T. R. Voo, Liming Jiang, Chen Change Loy

This paper performs a comprehensive analysis of datasets for occlusion-aware face segmentation, a task that is crucial for many downstream applications.

Segmentation Synthetic Data Generation +1

StyleGAN-Human: A Data-Centric Odyssey of Human Generation

4 code implementations 25 Apr 2022 Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Chen Qian, Chen Change Loy, Wayne Wu, Ziwei Liu

In addition, a model zoo and human editing applications are demonstrated to facilitate future research in the community.

Image Generation

On the Generalization of BasicVSR++ to Video Deblurring and Denoising

1 code implementation 11 Apr 2022 Kelvin C. K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy

The exploitation of long-term information has been a long-standing problem in video restoration.

Deblurring Denoising +2

Unsupervised Image-to-Image Translation with Generative Prior

1 code implementation CVPR 2022 Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy

In this work, we present a novel framework, Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT), to improve the overall quality and applicability of the translation algorithm.

Translation Unsupervised Image-To-Image Translation

Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory

1 code implementation CVPR 2022 Li SiYao, Weijiang Yu, Tianpei Gu, Chunze Lin, Quan Wang, Chen Qian, Chen Change Loy, Ziwei Liu

With the learned choreographic memory, dance generation is realized on the quantized units that meet high choreography standards, such that the generated dancing sequences are confined within the spatial constraints.

Motion Synthesis

Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer

1 code implementation CVPR 2022 Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy

Recent studies on StyleGAN show high performance on artistic portrait generation by transfer learning with limited data.

Style Transfer Transfer Learning +1

Open-Vocabulary DETR with Conditional Matching

1 code implementation 22 Mar 2022 Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy

To this end, we propose a novel open-vocabulary detector based on DETR -- hence the name OV-DETR -- which, once trained, can detect any object given its class name or an exemplar image.

Language Modelling object-detection +1

Dense Siamese Network for Dense Unsupervised Learning

1 code implementation 21 Mar 2022 Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy

It also extracts a batch of region embeddings that correspond to some sub-regions in the overlapped area to be contrasted for region consistency.

Self-Supervised Learning Unsupervised Semantic Segmentation

Conditional Prompt Learning for Vision-Language Models

7 code implementations CVPR 2022 Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu

With the rise of powerful pre-trained vision-language models like CLIP, it becomes essential to investigate ways to adapt these models to downstream datasets.

Domain Generalization Prompt Engineering

MoCaNet: Motion Retargeting in-the-wild via Canonicalization Networks

no code implementations 19 Dec 2021 Wentao Zhu, Zhuoqian Yang, Ziang Di, Wayne Wu, Yizhou Wang, Chen Change Loy

Trained with the canonicalization operations and the derived regularizations, our method learns to factorize a skeleton sequence into three independent semantic subspaces, i.e., motion, structure, and view angle.

3D Reconstruction Action Analysis +2

Extract Free Dense Labels from CLIP

1 code implementation 2 Dec 2021 Chong Zhou, Chen Change Loy, Bo Dai

Contrastive Language-Image Pre-training (CLIP) has made a remarkable breakthrough in open-vocabulary zero-shot image recognition.

Novel Concepts Open Vocabulary Panoptic Segmentation +5

Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data

2 code implementations NeurIPS 2021 Liming Jiang, Bo Dai, Wayne Wu, Chen Change Loy

Generative adversarial networks (GANs) typically require ample data for training in order to synthesize high-fidelity images.

Monocular 3D Reconstruction of Interacting Hands via Collision-Aware Factorized Refinements

no code implementations 1 Nov 2021 Yu Rong, Jingbo Wang, Ziwei Liu, Chen Change Loy

In this paper, we make the first attempt to reconstruct 3D interacting hands from monocular single RGB images.

3D Reconstruction

A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis

1 code implementation NeurIPS 2021 Xingang Pan, Xudong Xu, Chen Change Loy, Christian Theobalt, Bo Dai

Motivated by the observation that a 3D object should look realistic from multiple viewpoints, these methods introduce a multi-view constraint as regularization to learn valid 3D radiance fields from 2D images.

3D-Aware Image Synthesis 3D Shape Reconstruction +2

Self-Supervised Representation Learning: Introduction, Advances and Challenges

no code implementations 18 Oct 2021 Linus Ericsson, Henry Gouk, Chen Change Loy, Timothy M. Hospedales

Self-supervised representation learning methods aim to provide powerful deep feature learning without the requirement of large annotated datasets, thus alleviating the annotation bottleneck that is one of the main barriers to practical deployment of deep learning today.

Representation Learning

Playing for 3D Human Recovery

no code implementations 14 Oct 2021 Zhongang Cai, Mingyuan Zhang, Jiawei Ren, Chen Wei, Daxuan Ren, Zhengyu Lin, Haiyu Zhao, Lei Yang, Chen Change Loy, Ziwei Liu

Specifically, we contribute GTA-Human, a large-scale 3D human dataset generated with the GTA-V game engine, featuring a highly diverse set of subjects, actions, and scenarios.

MeshInversion: 3D textured mesh reconstruction with generative prior

no code implementations 29 Sep 2021 Junzhe Zhang, Daxuan Ren, Zhongang Cai, Chai Kiat Yeo, Bo Dai, Chen Change Loy

Reconstruction is achieved by searching for a latent space in the 3D GAN that best resembles the target mesh in accordance with the single view observation.

A Comprehensive Overhaul of Distilling Unconditional GANs

no code implementations 29 Sep 2021 Guodong Xu, Yuenan Hou, Ziwei Liu, Chen Change Loy

To further enhance the semantic consistency between the teacher and student model, we present another latent-direction-based distillation loss that preserves the semantic relations in latent space.

Knowledge Distillation

SiT: Simulation Transformer for Particle-based Physics Simulation

no code implementations 29 Sep 2021 Yidi Shao, Chen Change Loy, Bo Dai

However, they force particles to interact with all neighbors without selection, and they fall short in capturing material semantics for different particles, leading to unsatisfactory performance, especially in generalization.

ReconfigISP: Reconfigurable Camera Image Processing Pipeline

1 code implementation ICCV 2021 Ke Yu, Zexian Li, Yue Peng, Chen Change Loy, Jinwei Gu

Image Signal Processor (ISP) is a crucial component in digital cameras that transforms sensor signals into images for us to perceive and understand.

Image Restoration Neural Architecture Search +2

Talk-to-Edit: Fine-Grained Facial Editing via Dialog

1 code implementation ICCV 2021 Yuming Jiang, Ziqi Huang, Xingang Pan, Chen Change Loy, Ziwei Liu

In this work, we propose Talk-to-Edit, an interactive facial editing framework that performs fine-grained attribute manipulation through dialog between the user and the system.

Attribute Facial Editing +1

3D Human Texture Estimation from a Single Image with Transformers

1 code implementation ICCV 2021 Xiangyu Xu, Chen Change Loy

We propose a Transformer-based framework for 3D human texture estimation from a single image.

Garment Reconstruction

Learning to Prompt for Vision-Language Models

10 code implementations 2 Sep 2021 Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu

Large pre-trained vision-language models like CLIP have shown great potential in learning representations that are transferable across a wide range of downstream tasks.

Domain Generalization Few-shot Age Estimation +2

K-Net: Towards Unified Image Segmentation

1 code implementation NeurIPS 2021 Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy

The framework, named K-Net, segments both instances and semantic categories consistently by a group of learnable kernels, where each kernel is responsible for generating a mask for either a potential instance or a stuff class.

Image Segmentation Instance Segmentation +2

Unsupervised Object-Level Representation Learning from Scene Images

1 code implementation NeurIPS 2021 Jiahao Xie, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy

Extensive experiments on COCO show that ORL significantly improves the performance of self-supervised learning on scene images, even surpassing supervised ImageNet pre-training on several downstream tasks.

Object Representation Learning +2

Pareidolia Face Reenactment

no code implementations CVPR 2021 Linsen Song, Wayne Wu, Chaoyou Fu, Chen Qian, Chen Change Loy, Ran He

We present a new application direction named Pareidolia Face Reenactment, which is defined as animating a static illusory face to move in tandem with a human face in the video.

Face Reenactment Texture Synthesis

Robust Reference-based Super-Resolution via C2-Matching

1 code implementation CVPR 2021 Yuming Jiang, Kelvin C. K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu

However, performing local transfer is difficult because of two gaps between input and reference images: the transformation gap (e.g., scale and rotation) and the resolution gap (e.g., HR and LR).

Reference-based Super-Resolution

Semi-Supervised Domain Generalization with Stochastic StyleMatch

2 code implementations 1 Jun 2021 Kaiyang Zhou, Chen Change Loy, Ziwei Liu

We find that the DG methods, which by design are unable to handle unlabeled data, perform poorly with limited labels in SSDG; the SSL methods, especially FixMatch, obtain much better results but are still far away from the basic vanilla model trained using full labels.

Domain Generalization Semi-Supervised Domain Generalization

BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment

3 code implementations CVPR 2022 Kelvin C. K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy

We show that by empowering the recurrent framework with the enhanced propagation and alignment, one can exploit spatiotemporal information across misaligned video frames more effectively.

Analog Video Restoration Video Enhancement +1

Unsupervised 3D Shape Completion through GAN Inversion

no code implementations CVPR 2021 Junzhe Zhang, Xinyi Chen, Zhongang Cai, Liang Pan, Haiyu Zhao, Shuai Yi, Chai Kiat Yeo, Bo Dai, Chen Change Loy

In contrast to previous fully supervised approaches, in this paper we present ShapeInversion, which introduces Generative Adversarial Network (GAN) inversion to shape completion for the first time.

Generative Adversarial Network valid

Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation

1 code implementation CVPR 2021 Hang Zhou, Yasheng Sun, Wayne Wu, Chen Change Loy, Xiaogang Wang, Ziwei Liu

While speech content information can be defined by learning the intrinsic synchronization between audio-visual modalities, we identify that a pose code will be complementarily learned in a modulated convolution-based reconstruction framework.

Talking Face Generation

Low-Light Image and Video Enhancement Using Deep Learning: A Survey

3 code implementations 21 Apr 2021 Chongyi Li, Chunle Guo, Linghao Han, Jun Jiang, Ming-Ming Cheng, Jinwei Gu, Chen Change Loy

Low-light image enhancement (LLIE) aims at improving the perception or interpretability of an image captured in an environment with poor illumination.

Face Detection Low-Light Image Enhancement +1

Removing Diffraction Image Artifacts in Under-Display Camera via Dynamic Skip Connection Network

1 code implementation CVPR 2021 Ruicheng Feng, Chongyi Li, Huaijin Chen, Shuai Li, Chen Change Loy, Jinwei Gu

Recent development of Under-Display Camera (UDC) systems provides a true bezel-less and notch-free viewing experience on smartphones (and TV, laptops, tablets), while allowing images to be captured from the selfie camera embedded underneath.

Image Restoration

Audio-Driven Emotional Video Portraits

1 code implementation CVPR 2021 Xinya Ji, Hang Zhou, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu

In this work, we present Emotional Video Portraits (EVP), a system for synthesizing high-quality video portraits with vivid emotional dynamics driven by audios.

Disentanglement Face Generation

Everything's Talkin': Pareidolia Face Reenactment

1 code implementation 7 Apr 2021 Linsen Song, Wayne Wu, Chaoyou Fu, Chen Qian, Chen Change Loy, Ran He

We present a new application direction named Pareidolia Face Reenactment, which is defined as animating a static illusory face to move in tandem with a human face in the video.

Face Reenactment Texture Synthesis

Deep Animation Video Interpolation in the Wild

1 code implementation CVPR 2021 Li SiYao, Shiyu Zhao, Weijiang Yu, Wenxiu Sun, Dimitris N. Metaxas, Chen Change Loy, Ziwei Liu

In the animation industry, cartoon videos are usually produced at low frame rate since hand drawing of such frames is costly and time-consuming.

Optical Flow Estimation Video Frame Interpolation

Domain Generalization: A Survey

2 code implementations 3 Mar 2021 Kaiyang Zhou, Ziwei Liu, Yu Qiao, Tao Xiang, Chen Change Loy

Generalization to out-of-distribution (OOD) data is a capability natural to humans yet challenging for machines to reproduce.

Action Recognition Data Augmentation +8

Network Pruning via Resource Reallocation

1 code implementation 2 Mar 2021 Yuenan Hou, Zheng Ma, Chunxiao Liu, Zhe Wang, Chen Change Loy

Channel pruning is broadly recognized as an effective approach to obtain a small compact model through eliminating unimportant channels from a large cumbersome network.

Network Pruning

Learning to Enhance Low-Light Image via Zero-Reference Deep Curve Estimation

4 code implementations 1 Mar 2021 Chongyi Li, Chunle Guo, Chen Change Loy

This paper presents a novel method, Zero-Reference Deep Curve Estimation (Zero-DCE), which formulates light enhancement as a task of image-specific curve estimation with a deep network.

Face Detection Image Enhancement

FASA: Feature Augmentation and Sampling Adaptation for Long-Tailed Instance Segmentation

1 code implementation ICCV 2021 Yuhang Zang, Chen Huang, Chen Change Loy

We propose a simple yet effective method, Feature Augmentation and Sampling Adaptation (FASA), that addresses the data scarcity issue by augmenting the feature space especially for rare classes.

Instance Segmentation Segmentation +2

Chasing the Tail in Monocular 3D Human Reconstruction with Prototype Memory

no code implementations 29 Dec 2020 Yu Rong, Ziwei Liu, Chen Change Loy

The reason is that most of the current models perform regression based on a single human prototype, which is similar to common poses while far from the rare poses.

3D Human Reconstruction regression

Exploring Data Augmentation for Multi-Modality 3D Object Detection

8 code implementations 23 Dec 2020 Wenwei Zhang, Zhe Wang, Chen Change Loy

Due to the fact that multi-modality data augmentation must maintain consistency between point cloud and images, recent methods in this field typically use relatively insufficient data augmentation.

3D Object Detection Autonomous Driving +3

Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup

1 code implementation 17 Dec 2020 Guodong Xu, Ziwei Liu, Chen Change Loy

Our goal is to achieve a performance comparable to conventional knowledge distillation with a lower computation cost during training.

Informativeness Knowledge Distillation +2

Positional Encoding as Spatial Inductive Bias in GANs

no code implementations CVPR 2021 Rui Xu, Xintao Wang, Kai Chen, Bolei Zhou, Chen Change Loy

In this work, taking SinGAN and StyleGAN2 as examples, we show that such capability, to a large extent, is brought by the implicit positional encoding when using zero padding in the generators.

Image Manipulation Inductive Bias +1

CARAFE++: Unified Content-Aware ReAssembly of FEatures

no code implementations 7 Dec 2020 Jiaqi Wang, Kai Chen, Rui Xu, Ziwei Liu, Chen Change Loy, Dahua Lin

Feature reassembly, i.e., feature downsampling and upsampling, is a key operation in a number of modern convolutional network architectures, e.g., residual networks and feature pyramids.

Image Inpainting Instance Segmentation +3

GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution

no code implementations CVPR 2021 Kelvin C. K. Chan, Xintao Wang, Xiangyu Xu, Jinwei Gu, Chen Change Loy

We show that pre-trained Generative Adversarial Networks (GANs), e.g., StyleGAN, can be used as a latent bank to improve the restoration quality of large-factor image super-resolution (SR).

Image Super-Resolution

Do 2D GANs Know 3D Shape? Unsupervised 3D shape reconstruction from 2D Image GANs

1 code implementation ICLR 2021 Xingang Pan, Bo Dai, Ziwei Liu, Chen Change Loy, Ping Luo

Through our investigation, we found that such a pre-trained GAN indeed contains rich 3D knowledge and thus can be used to recover 3D shape from a single 2D image in an unsupervised manner.

3D Shape Reconstruction Object

Flexible Piecewise Curves Estimation for Photo Enhancement

no code implementations 26 Oct 2020 Chongyi Li, Chunle Guo, Qiming Ai, Shangchen Zhou, Chen Change Loy

This paper presents a new method, called FlexiCurve, for photo enhancement.

Texture Memory-Augmented Deep Patch-Based Image Inpainting

1 code implementation 28 Sep 2020 Rui Xu, Minghao Guo, Jiaqi Wang, Xiaoxiao Li, Bolei Zhou, Chen Change Loy

By bringing together the best of both paradigms, we propose a new deep inpainting framework where texture generation is guided by a texture memory of patch samples extracted from unmasked regions.

Image Inpainting Retrieval +1

Understanding Deformable Alignment in Video Super-Resolution

no code implementations 15 Sep 2020 Kelvin C. K. Chan, Xintao Wang, Ke Yu, Chao Dong, Chen Change Loy

Aside from the contributions to deformable alignment, our formulation inspires a more flexible approach to introduce offset diversity to flow-based alignment, improving its performance.

Optical Flow Estimation Video Super-Resolution

Delving into Inter-Image Invariance for Unsupervised Visual Representations

2 code implementations 26 Aug 2020 Jiahao Xie, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy

In this work, we present a comprehensive empirical study to better understand the role of inter-image invariance learning from three main constituting components: pseudo-label maintenance, sampling strategy, and decision boundary design.

Contrastive Learning Pseudo Label +1

MessyTable: Instance Association in Multiple Camera Views

no code implementations ECCV 2020 Zhongang Cai, Junzhe Zhang, Daxuan Ren, Cunjun Yu, Haiyu Zhao, Shuai Yi, Chai Kiat Yeo, Chen Change Loy

We present an interesting and challenging dataset that features a large number of scenes with messy tables captured from multiple camera views.

Cross-Scale Internal Graph Neural Network for Image Super-Resolution

1 code implementation NeurIPS 2020 Shangchen Zhou, Jiawei Zhang, WangMeng Zuo, Chen Change Loy

Specifically, we dynamically construct a cross-scale graph by searching k-nearest neighboring patches in the downsampled LR image for each query patch in the LR image.

Image Restoration Image Super-Resolution

Knowledge Distillation Meets Self-Supervision

2 code implementations ECCV 2020 Guodong Xu, Ziwei Liu, Xiaoxiao Li, Chen Change Loy

Knowledge distillation, which involves extracting the "dark knowledge" from a teacher network to guide the learning of a student network, has emerged as an important technique for model compression and transfer learning.

Contrastive Learning Knowledge Distillation +2

Inter-Region Affinity Distillation for Road Marking Segmentation

1 code implementation CVPR 2020 Yuenan Hou, Zheng Ma, Chunxiao Liu, Tak-Wai Hui, Chen Change Loy

We study the problem of distilling knowledge from a large deep teacher network to a much smaller student network for the task of road marking segmentation.

Knowledge Distillation Lane Detection +1

Feature Pyramid Grids

1 code implementation 7 Apr 2020 Kai Chen, Yuhang Cao, Chen Change Loy, Dahua Lin, Christoph Feichtenhofer

Feature pyramid networks have been widely adopted in the object detection literature to improve feature representations for better handling of variations in scale.

Neural Architecture Search object-detection +2

Self-Supervised Scene De-occlusion

2 code implementations CVPR 2020 Xiaohang Zhan, Xingang Pan, Bo Dai, Ziwei Liu, Dahua Lin, Chen Change Loy

This is achieved via Partial Completion Network (PCNet)-mask (M) and -content (C), that learn to recover fractions of object masks and contents, respectively, in a self-supervised manner.

Image Manipulation Scene Understanding

Learning to Cluster Faces via Confidence and Connectivity Estimation

3 code implementations CVPR 2020 Lei Yang, Dapeng Chen, Xiaohang Zhan, Rui Zhao, Chen Change Loy, Dahua Lin

With the vertex confidence and edge connectivity, we can naturally organize more relevant vertices on the affinity graph and group them into clusters.

Clustering Connectivity Estimation +2

TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting

no code implementations CVPR 2020 Zhuoqian Yang, Wentao Zhu, Wayne Wu, Chen Qian, Qiang Zhou, Bolei Zhou, Chen Change Loy

We present a lightweight video motion retargeting approach TransMoMo that is capable of transferring motion of a person in a source video realistically to another video of a target person.

motion retargeting

1st Place Solutions for OpenImage2019 -- Object Detection and Instance Segmentation

2 code implementations 17 Mar 2020 Yu Liu, Guanglu Song, Yuhang Zang, Yan Gao, Enze Xie, Junjie Yan, Chen Change Loy, Xiaogang Wang

Given such good instance bounding box, we further design a simple instance-level semantic segmentation pipeline and achieve the 1st place on the segmentation challenge.

General Classification Instance Segmentation +6

Residual Knowledge Distillation

no code implementations 21 Feb 2020 Mengya Gao, Yujun Shen, Quanquan Li, Chen Change Loy

Knowledge distillation (KD) is one of the most potent ways for model compression.

Knowledge Distillation Model Compression

Real or Not Real, that is the Question

2 code implementations ICLR 2020 Yuanbo Xiangli, Yubin Deng, Bo Dai, Chen Change Loy, Dahua Lin

While generative adversarial networks (GAN) have been widely adopted in various topics, in this paper we generalize the standard GAN to a new perspective by treating realness as a random variable that can be estimated from multiple angles.

Zero-Reference Deep Curve Estimation for Low-Light Image Enhancement

9 code implementations CVPR 2020 Chunle Guo, Chongyi Li, Jichang Guo, Chen Change Loy, Junhui Hou, Sam Kwong, Runmin Cong

The paper presents a novel method, Zero-Reference Deep Curve Estimation (Zero-DCE), which formulates light enhancement as a task of image-specific curve estimation with a deep network.

Color Constancy Face Detection +1

Everybody's Talkin': Let Me Talk as You Want

no code implementations 15 Jan 2020 Linsen Song, Wayne Wu, Chen Qian, Ran He, Chen Change Loy

The audio-translated expression parameters are then used to synthesize a photo-realistic human subject in each video frame, with the movement of the mouth regions precisely mapped to the source audio.

3D Face Reconstruction

EcoNAS: Finding Proxies for Economical Neural Architecture Search

no code implementations CVPR 2020 Dongzhan Zhou, Xinchi Zhou, Wenwei Zhang, Chen Change Loy, Shuai Yi, Xuesen Zhang, Wanli Ouyang

While many methods have been proposed to improve the efficiency of NAS, the search progress is still laborious because training and evaluating plausible architectures over large search space is time-consuming.

Neural Architecture Search

Side-Aware Boundary Localization for More Precise Object Detection

3 code implementations ECCV 2020 Jiaqi Wang, Wenwei Zhang, Yuhang Cao, Kai Chen, Jiangmiao Pang, Tao Gong, Jianping Shi, Chen Change Loy, Dahua Lin

To tackle the difficulty of precise localization in the presence of displacements with large variance, we further propose a two-step localization scheme, which first predicts a range of movement through bucket prediction and then pinpoints the precise position within the predicted bucket.

Object object-detection +2

Learning to Synthesize Fashion Textures

no code implementations 18 Nov 2019 Wu Shi, Tak-Wai Hui, Ziwei Liu, Dahua Lin, Chen Change Loy

Another important observation is that fashion textures are multi-modal.

Robust Multi-Modality Multi-Object Tracking

1 code implementation ICCV 2019 Wenwei Zhang, Hui Zhou, Shuyang Sun, Zhe Wang, Jianping Shi, Chen Change Loy

Multi-sensor perception is crucial to ensure the reliability and accuracy in autonomous driving system, while multi-object tracking (MOT) improves that by tracing sequential movement of dynamic objects.

Autonomous Driving Multi-Object Tracking +2

Delving Deep Into Hybrid Annotations for 3D Human Recovery in the Wild

1 code implementation ICCV 2019 Yu Rong, Ziwei Liu, Cheng Li, Kaidi Cao, Chen Change Loy

Specifically, we focus on the challenging task of in-the-wild 3D human recovery from single images when paired 3D annotations are not fully available.

One-shot Face Reenactment

2 code implementations 5 Aug 2019 Yunxuan Zhang, Siwei Zhang, Yue He, Cheng Li, Chen Change Loy, Ziwei Liu

However, in real-world scenarios, end-users often have only one target face at hand, rendering existing methods inapplicable.

Face Reconstruction Face Reenactment

Learning Lightweight Lane Detection CNNs by Self Attention Distillation

2 code implementations ICCV 2019 Yuenan Hou, Zheng Ma, Chunxiao Liu, Chen Change Loy

Training deep models for lane detection is challenging due to the very subtle and sparse supervisory signals inherent in lane annotations.

Knowledge Distillation Lane Detection +1

Disentangling Content and Style via Unsupervised Geometry Distillation

1 code implementation ICLR Workshop DeepGenStruct 2019 Wayne Wu, Kaidi Cao, Cheng Li, Chen Qian, Chen Change Loy

It is challenging to disentangle an object into two orthogonal spaces of content and style since each can influence the visual observation differently and unpredictably.

Disentanglement

Deep Flow-Guided Video Inpainting

2 code implementations CVPR 2019 Rui Xu, Xiaoxiao Li, Bolei Zhou, Chen Change Loy

Then the synthesized flow field is used to guide the propagation of pixels to fill up the missing regions in the video.

One-shot visual object segmentation Optical Flow Estimation +2

EDVR: Video Restoration with Enhanced Deformable Convolutional Networks

11 code implementations 7 May 2019 Xintao Wang, Kelvin C. K. Chan, Ke Yu, Chao Dong, Chen Change Loy

In this work, we propose a novel Video Restoration framework with Enhanced Deformable networks, termed EDVR, to address these challenges.

Deblurring Video Enhancement +2

CARAFE: Content-Aware ReAssembly of FEatures

3 code implementations ICCV 2019 Jiaqi Wang, Kai Chen, Rui Xu, Ziwei Liu, Chen Change Loy, Dahua Lin

CARAFE introduces little computational overhead and can be readily integrated into modern network architectures.

Feature Upsampling Instance Segmentation +3

Path-Restore: Learning Network Path Selection for Image Restoration

1 code implementation 23 Apr 2019 Ke Yu, Xintao Wang, Chao Dong, Xiaoou Tang, Chen Change Loy

To leverage this, we propose Path-Restore, a multi-path CNN with a pathfinder that can dynamically select an appropriate route for each image region.

Denoising Image Restoration +1

TransGaGa: Geometry-Aware Unsupervised Image-to-Image Translation

no code implementations CVPR 2019 Wayne Wu, Kaidi Cao, Cheng Li, Chen Qian, Chen Change Loy

Extensive experiments demonstrate the superior performance of our method to other state-of-the-art approaches, especially in the challenging near-rigid and non-rigid objects translation tasks.

Translation Unsupervised Image-To-Image Translation

Prime Sample Attention in Object Detection

1 code implementation CVPR 2020 Yuhang Cao, Kai Chen, Chen Change Loy, Dahua Lin

Our experiments demonstrate that it is often more effective to focus on prime samples than hard samples when training a detector.

Object object-detection +1

Learning to Cluster Faces on an Affinity Graph

3 code implementations CVPR 2019 Lei Yang, Xiaohang Zhan, Dapeng Chen, Junjie Yan, Chen Change Loy, Dahua Lin

Face recognition has seen remarkable progress in recent years, and its performance has reached a very high level.

Clustering Face Recognition +1

Self-Supervised Learning via Conditional Motion Propagation

1 code implementation CVPR 2019 Xiaohang Zhan, Xingang Pan, Ziwei Liu, Dahua Lin, Chen Change Loy

Instead of explicitly modeling the motion probabilities, we design the pretext task as a conditional motion propagation problem.

Human Parsing Instance Segmentation +2

Dense Intrinsic Appearance Flow for Human Pose Transfer

1 code implementation CVPR 2019 Yining Li, Chen Huang, Chen Change Loy

Unlike existing methods, we propose to estimate dense and intrinsic 3D appearance flow to better guide the transfer of pixels between poses.

Pose Transfer

A Lightweight Optical Flow CNN -- Revisiting Data Fidelity and Regularization

3 code implementations 15 Mar 2019 Tak-Wai Hui, Xiaoou Tang, Chen Change Loy

Over four decades, the majority addresses the problem of optical flow estimation using variational methods.

Optical Flow Estimation

Unsupervised Bi-directional Flow-based Video Generation from one Snapshot

no code implementations 3 Mar 2019 Lu Sheng, Junting Pan, Jiaming Guo, Jing Shao, Xiaogang Wang, Chen Change Loy

Imagining multiple consecutive frames given one single snapshot is challenging, since it is difficult to simultaneously predict diverse motions from a single image and faithfully generate novel frames without visual distortions.

Video Generation

Hybrid Task Cascade for Instance Segmentation

5 code implementations CVPR 2019 Kai Chen, Jiangmiao Pang, Jiaqi Wang, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin

In exploring a more effective approach, we find that the key to a successful instance segmentation cascade is to fully leverage the reciprocal relationship between detection and segmentation.

Instance Segmentation object-detection +4

Region Proposal by Guided Anchoring

1 code implementation CVPR 2019 Jiaqi Wang, Kai Chen, Shuo Yang, Chen Change Loy, Dahua Lin

State-of-the-art detectors mostly rely on a dense anchoring scheme, where anchors are sampled uniformly over the spatial domain with a predefined set of scales and aspect ratios.

object-detection Object Detection +1

An Embarrassingly Simple Approach for Knowledge Distillation

1 code implementation 5 Dec 2018 Mengya Gao, Yujun Shen, Quanquan Li, Junjie Yan, Liang Wan, Dahua Lin, Chen Change Loy, Xiaoou Tang

Knowledge Distillation (KD) aims at improving the performance of a low-capacity student model by inheriting knowledge from a high-capacity teacher model.

Face Recognition Knowledge Distillation +3
