Search Results for author: Ziwei Liu

Found 265 papers, 185 papers with code

Deep Learning Face Attributes in the Wild

2 code implementations • ICCV 2015 • Ziwei Liu, Ping Luo, Xiaogang Wang, Xiaoou Tang

LNet is pre-trained by massive general object categories for face localization, while ANet is pre-trained by massive face identities for attribute prediction.

Ranked #6 on Facial Attribute Classification on LFWA

Attribute Facial Attribute Classification

Semantic Image Segmentation via Deep Parsing Network

no code implementations • ICCV 2015 • Ziwei Liu, Xiaoxiao Li, Ping Luo, Chen Change Loy, Xiaoou Tang

This paper addresses semantic image segmentation by incorporating rich information into Markov Random Field (MRF), including high-order relations and mixture of label contexts.

Ranked #89 on Semantic Segmentation on Cityscapes test

Image Segmentation Semantic Segmentation

DeepFashion: Powering Robust Clothes Recognition and Retrieval With Rich Annotations

no code implementations • CVPR 2016 • Ziwei Liu, Ping Luo, Shi Qiu, Xiaogang Wang, Xiaoou Tang

To demonstrate the advantages of DeepFashion, we propose a new deep model, namely FashionNet, which learns clothing features by jointly predicting clothing attributes and landmarks.

Retrieval

Deep Learning Markov Random Field for Semantic Segmentation

no code implementations • 23 Jun 2016 • Ziwei Liu, Xiaoxiao Li, Ping Luo, Chen Change Loy, Xiaoou Tang

Semantic segmentation tasks can be well modeled by Markov Random Field (MRF).

Segmentation Semantic Segmentation +2

Fashion Landmark Detection in the Wild

4 code implementations • 10 Aug 2016 • Ziwei Liu, Sijie Yan, Ping Luo, Xiaogang Wang, Xiaoou Tang

Fashion landmark is also compared to clothing bounding boxes and human joints in two applications, fashion attribute prediction and clothes retrieval, showing that fashion landmark is a more discriminative representation to understand fashion images.

Attribute Pose Estimation +1

348

Semantic Facial Expression Editing using Autoencoded Flow

no code implementations • 30 Nov 2016 • Raymond Yeh, Ziwei Liu, Dan B. Goldman, Aseem Agarwala

High-level manipulation of facial expressions in images, such as changing a smile to a neutral expression, is challenging because facial expression changes are highly non-linear and vary depending on the appearance of the face.

Video Frame Synthesis using Deep Voxel Flow

3 code implementations • ICCV 2017 • Ziwei Liu, Raymond A. Yeh, Xiaoou Tang, Yiming Liu, Aseem Agarwala

We combine the advantages of these two methods by training a deep network that learns to synthesize video frames by flowing pixel values from existing ones, which we call deep voxel flow.

Ranked #2 on Video Prediction on DAVIS 2017

Optical Flow Estimation Video Prediction

213

Not All Pixels Are Equal: Difficulty-aware Semantic Segmentation via Deep Layer Cascade

1 code implementation • CVPR 2017 • Xiaoxiao Li, Ziwei Liu, Ping Luo, Chen Change Loy, Xiaoou Tang

Third, in comparison to MC, LC is an end-to-end trainable framework, allowing joint learning of all sub-models.

Ranked #22 on Semantic Segmentation on PASCAL VOC 2012 test

Semantic Segmentation

108

Video Object Segmentation with Re-identification

3 code implementations • 1 Aug 2017 • Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi, Ping Luo, Xiaoou Tang, Chen Change Loy

Specifically, our Video Object Segmentation with Re-identification (VS-ReID) model includes a mask propagation module and a ReID module.

Object Segmentation +4

289

Unconstrained Fashion Landmark Detection via Hierarchical Recurrent Transformer Networks

2 code implementations • 7 Aug 2017 • Sijie Yan, Ziwei Liu, Ping Luo, Shi Qiu, Xiaogang Wang, Xiaoou Tang

This work addresses unconstrained fashion landmark detection, where clothing bounding boxes are provided in neither training nor testing.

Mix-and-Match Tuning for Self-Supervised Semantic Segmentation

no code implementations • 2 Dec 2017 • Xiaohang Zhan, Ziwei Liu, Ping Luo, Xiaoou Tang, Chen Change Loy

The key to this new form of learning is to design a proxy task (e.g., image colorization), from which a discriminative loss can be formulated on unlabeled data.

Colorization Image Colorization +3

Dynamic Graph CNN for Learning on Point Clouds

18 code implementations • 24 Jan 2018 • Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon

Point clouds provide a flexible geometric representation suitable for countless applications in computer graphics; they also comprise the raw output of most 3D data acquisition devices.

Ranked #6 on Point Cloud Segmentation on PointCloud-C

3D Part Segmentation 3D Semantic Segmentation +3

1,572

Adaptive Affinity Fields for Semantic Segmentation

1 code implementation • ECCV 2018 • Tsung-Wei Ke, Jyh-Jing Hwang, Ziwei Liu, Stella X. Yu

Semantic segmentation has made much progress with increasingly powerful pixel-wise classifiers and incorporating structural priors via Conditional Random Fields (CRF) or Generative Adversarial Networks (GAN).

Ranked #55 on Semantic Segmentation on Cityscapes test

Segmentation Semantic Segmentation

260

Im2Avatar: Colorful 3D Reconstruction from a Single Image

1 code implementation • 17 Apr 2018 • Yongbin Sun, Ziwei Liu, Yue Wang, Sanjay E. Sarma

In this work, we study a new problem, that is, simultaneously recovering 3D shape and surface color from a single image, namely "colorful 3D reconstruction".

3D Reconstruction Hallucination

135

DPatch: An Adversarial Patch Attack on Object Detectors

1 code implementation • 5 Jun 2018 • Xin Liu, Huanrui Yang, Ziwei Liu, Linghao Song, Hai Li, Yiran Chen

Successful realization of DPatch also illustrates the intrinsic vulnerability of the modern detector architectures to such patch-based adversarial attacks.

Object

129

Talking Face Generation by Adversarially Disentangled Audio-Visual Representation

1 code implementation • 20 Jul 2018 • Hang Zhou, Yu Liu, Ziwei Liu, Ping Luo, Xiaogang Wang

Talking face generation aims to synthesize a sequence of face images that correspond to a clip of speech.

Lip Reading Retrieval +2

812

Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition

6 code implementations • ECCV 2018 • Xiaohang Zhan, Ziwei Liu, Junjie Yan, Dahua Lin, Chen Change Loy

Face recognition has witnessed great progress in recent years, mainly attributed to the high-capacity model designed and the abundant labeled data collected.

Face Recognition

453

PointGrow: Autoregressively Learned Point Cloud Generation with Self-Attention

1 code implementation • 12 Oct 2018 • Yongbin Sun, Yue Wang, Ziwei Liu, Joshua E. Siegel, Sanjay E. Sarma

Generating 3D point clouds is challenging yet highly desired.

Generating 3D Point Clouds Point Cloud Generation

Instance-level Facial Attributes Transfer with Geometry-Aware Flow

no code implementations • 30 Nov 2018 • Weidong Yin, Ziwei Liu, Chen Change Loy

Geometry-aware flow is able to warp the source face attribute into the target face context and generate a warp-and-blend result.

Attribute Hallucination

Hybrid Task Cascade for Instance Segmentation

5 code implementations • CVPR 2019 • Kai Chen, Jiangmiao Pang, Jiaqi Wang, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin

In exploring a more effective approach, we find that the key to a successful instance segmentation cascade is to fully leverage the reciprocal relationship between detection and segmentation.

Ranked #32 on Object Detection on COCO-O

Instance Segmentation object-detection +4

27,790

Self-Supervised Learning via Conditional Motion Propagation

1 code implementation • CVPR 2019 • Xiaohang Zhan, Xingang Pan, Ziwei Liu, Dahua Lin, Chen Change Loy

Instead of explicitly modeling the motion probabilities, we design the pretext task as a conditional motion propagation problem.

Human Parsing Instance Segmentation +2

137

Large-Scale Long-Tailed Recognition in an Open World

2 code implementations • CVPR 2019 • Ziwei Liu, Zhongqi Miao, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu

We define Open Long-Tailed Recognition (OLTR) as learning from such naturally distributed data and optimizing the classification accuracy over a balanced test set which includes head, tail, and open classes.

Ranked #5 on Long-tail learning with class descriptors on ImageNet-LT-d

Classification Few-Shot Learning +4

826

CARAFE: Content-Aware ReAssembly of FEatures

3 code implementations • ICCV 2019 • Jiaqi Wang, Kai Chen, Rui Xu, Ziwei Liu, Chen Change Loy, Dahua Lin

CARAFE introduces little computational overhead and can be readily integrated into modern network architectures.

Ranked #3 on Feature Upsampling on ImageNet

Feature Upsampling Instance Segmentation +3

27,790

MMDetection: Open MMLab Detection Toolbox and Benchmark

144 code implementations • 17 Jun 2019 • Kai Chen, Jiaqi Wang, Jiangmiao Pang, Yuhang Cao, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jiarui Xu, Zheng Zhang, Dazhi Cheng, Chenchen Zhu, Tianheng Cheng, Qijie Zhao, Buyu Li, Xin Lu, Rui Zhu, Yue Wu, Jifeng Dai, Jingdong Wang, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin

In this paper, we introduce the various features of this toolbox.

Benchmarking Instance Segmentation +2

27,790

MaskGAN: Towards Diverse and Interactive Facial Image Manipulation

7 code implementations • CVPR 2020 • Cheng-Han Lee, Ziwei Liu, Lingyun Wu, Ping Luo

To overcome these drawbacks, we propose a novel framework termed MaskGAN, enabling diverse and interactive face manipulation.

Attribute Image Manipulation

2,008

One-shot Face Reenactment

2 code implementations • 5 Aug 2019 • Yunxuan Zhang, Siwei Zhang, Yue He, Cheng Li, Chen Change Loy, Ziwei Liu

However, in real-world scenarios, end-users often have only one target face at hand, rendering existing methods inapplicable.

Face Reconstruction Face Reenactment

190

Delving Deep Into Hybrid Annotations for 3D Human Recovery in the Wild

1 code implementation • ICCV 2019 • Yu Rong, Ziwei Liu, Cheng Li, Kaidi Cao, Chen Change Loy

Specifically, we focus on the challenging task of in-the-wild 3D human recovery from single images when paired 3D annotations are not fully available.

Open Compound Domain Adaptation

no code implementations • CVPR 2020 • Ziwei Liu, Zhongqi Miao, Xingang Pan, Xiaohang Zhan, Dahua Lin, Stella X. Yu, Boqing Gong

A typical domain adaptation approach is to adapt models trained on the annotated data in a source domain (e.g., sunny weather) for achieving high performance on the test data in a target domain (e.g., rainy weather).

Domain Adaptation Facial Expression Recognition +2

Vision-Infused Deep Audio Inpainting

no code implementations • ICCV 2019 • Hang Zhou, Ziwei Liu, Xudong Xu, Ping Luo, Xiaogang Wang

Extensive experiments demonstrate that our framework is capable of inpainting realistic and varying audio segments with or without visual contexts.

Audio inpainting Image Inpainting

Learning to Synthesize Fashion Textures

no code implementations • 18 Nov 2019 • Wu Shi, Tak-Wai Hui, Ziwei Liu, Dahua Lin, Chen Change Loy

Another important observation is that fashion textures are multi-modal.

When NAS Meets Robustness: In Search of Robust Architectures against Adversarial Attacks

1 code implementation • CVPR 2020 • Minghao Guo, Yuzhe Yang, Rui Xu, Ziwei Liu, Dahua Lin

Recent advances in adversarial attacks uncover the intrinsic vulnerability of modern deep neural networks.

Neural Architecture Search

123

Learning Diverse Fashion Collocation by Neural Graph Filtering

no code implementations • 11 Mar 2020 • Xin Liu, Yongbin Sun, Ziwei Liu, Dahua Lin

To facilitate a comprehensive study on diverse fashion collocation, we reorganize Amazon Fashion dataset with carefully designed evaluation protocols.

Recommendation Systems

Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images

1 code implementation • CVPR 2020 • Hang Zhou, Jihao Liu, Ziwei Liu, Yu Liu, Xiaogang Wang

Though face rotation has achieved rapid progress in recent years, the lack of high-quality paired training data remains a great hurdle for existing methods.

3D Face Modelling Data Augmentation +1

482

Self-Supervised Scene De-occlusion

2 code implementations • CVPR 2020 • Xiaohang Zhan, Xingang Pan, Bo Dai, Ziwei Liu, Dahua Lin, Chen Change Loy

This is achieved via Partial Completion Network (PCNet)-mask (M) and -content (C), that learn to recover fractions of object masks and contents, respectively, in a self-supervised manner.

Image Manipulation Scene Understanding

770

MMFashion: An Open-Source Toolbox for Visual Fashion Analysis

3 code implementations • 18 May 2020 • Xin Liu, Jiancheng Li, Jiaqi Wang, Ziwei Liu

This toolbox supports a wide spectrum of fashion analysis tasks, including Fashion Attribute Prediction, Fashion Recognition and Retrieval, Fashion Landmark Detection, Fashion Parsing and Segmentation and Fashion Compatibility and Recommendation.

Attribute Retrieval

1,206

Knowledge Distillation Meets Self-Supervision

2 code implementations • ECCV 2020 • Guodong Xu, Ziwei Liu, Xiaoxiao Li, Chen Change Loy

Knowledge distillation, which involves extracting the "dark knowledge" from a teacher network to guide the learning of a student network, has emerged as an important technique for model compression and transfer learning.

Ranked #32 on Knowledge Distillation on ImageNet

Contrastive Learning Knowledge Distillation +2

1,270

Online Deep Clustering for Unsupervised Representation Learning

1 code implementation • CVPR 2020 • Xiaohang Zhan, Jiahao Xie, Ziwei Liu, Yew Soon Ong, Chen Change Loy

In this way, labels and the network evolve shoulder-to-shoulder rather than alternatingly.

Clustering Deep Clustering +1

3,083

Unsupervised Landmark Learning from Unpaired Data

1 code implementation • 29 Jun 2020 • Yinghao Xu, Ceyuan Yang, Ziwei Liu, Bo Dai, Bolei Zhou

Recent attempts for unsupervised landmark learning leverage synthesized image pairs that are similar in appearance but different in poses.

Placepedia: Comprehensive Place Understanding with Multi-Faceted Annotations

no code implementations • ECCV 2020 • Huaiyi Huang, Yuqi Zhang, Qingqiu Huang, Zhengkui Guo, Ziwei Liu, Dahua Lin

Place is an important element in visual understanding.

Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement

1 code implementation • ECCV 2020 • Qiang Nie, Ziwei Liu, Yun-hui Liu

Learning a good 3D human pose representation is important for human pose related tasks, e.g., human 3D pose estimation and action recognition.

3D Pose Estimation Action Recognition +2

Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets

1 code implementation • ECCV 2020 • Tong Wu, Qingqiu Huang, Ziwei Liu, Yu Wang, Dahua Lin

We present a new loss function called Distribution-Balanced Loss for the multi-label recognition problems that exhibit long-tailed class distributions.

Ranked #7 on Long-tail Learning on VOC-MLT

Binary Classification General Classification +2

349

Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation

no code implementations • ECCV 2020 • Hang Zhou, Xudong Xu, Dahua Lin, Xiaogang Wang, Ziwei Liu

Stereophonic audio is an indispensable ingredient to enhance human auditory experience.

Audio Generation

CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations

1 code implementation • ECCV 2020 • Yuanhan Zhang, Zhenfei Yin, Yidong Li, Guojun Yin, Junjie Yan, Jing Shao, Ziwei Liu

The main reason is that current face anti-spoofing datasets are limited in both quantity and diversity.

Attribute Face Anti-Spoofing

512

Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination

2 code implementations • CVPR 2021 • Xudong Wang, Ziwei Liu, Stella X. Yu

Unsupervised feature learning has made great strides with contrastive learning based on instance discrimination and invariant mapping, as benchmarked on curated class-balanced datasets.

Contrastive Learning Semi-Supervised Image Classification +2

Seesaw Loss for Long-Tailed Instance Segmentation

2 code implementations • CVPR 2021 • Jiaqi Wang, Wenwei Zhang, Yuhang Zang, Yuhang Cao, Jiangmiao Pang, Tao Gong, Kai Chen, Ziwei Liu, Chen Change Loy, Dahua Lin

Instances of head classes dominate a long-tailed dataset and they serve as negative samples of tail categories.

Instance Segmentation Semantic Segmentation

27,790

Delving into Inter-Image Invariance for Unsupervised Visual Representations

2 code implementations • 26 Aug 2020 • Jiahao Xie, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy

In this work, we present a comprehensive empirical study to better understand the role of inter-image invariance learning from three main constituting components: pseudo-label maintenance, sampling strategy, and decision boundary design.

Contrastive Learning Pseudo Label +1

3,083

Person-in-Context Synthesis with Compositional Structural Space

no code implementations • 28 Aug 2020 • Weidong Yin, Ziwei Liu, Leonid Sigal

To handle the stark difference in input structures, we propose two separate neural branches to attentively composite the respective (context/person) inputs into a shared "compositional structural space", which encodes shape, location and appearance information for both context and person structures in a disentangled manner.

Long-tailed Recognition by Routing Diverse Distribution-Aware Experts

2 code implementations • ICLR 2021 • Xudong Wang, Long Lian, Zhongqi Miao, Ziwei Liu, Stella X. Yu

We take a dynamic view of the training data and provide a principled model bias and variance analysis as the training data fluctuates: Existing long-tail classifiers invariably increase the model variance and the head-tail model bias gap remains large, due to more and larger confusion with hard negatives for the tail.

Ranked #22 on Long-tail Learning on iNaturalist 2018

Image Classification imbalanced classification +1

250

Do 2D GANs Know 3D Shape? Unsupervised 3D shape reconstruction from 2D Image GANs

1 code implementation • ICLR 2021 • Xingang Pan, Bo Dai, Ziwei Liu, Chen Change Loy, Ping Luo

Through our investigation, we found that such a pre-trained GAN indeed contains rich 3D knowledge and thus can be used to recover 3D shape from a single 2D image in an unsupervised manner.

3D Shape Reconstruction Object

570

LiDAR-based Panoptic Segmentation via Dynamic Shifting Network

1 code implementation • CVPR 2021 • Fangzhou Hong, Hui Zhou, Xinge Zhu, Hongsheng Li, Ziwei Liu

2) Dynamic Shifting for complex point distributions.

Ranked #2 on Panoptic Segmentation on SemanticKITTI

Autonomous Driving Clustering +1

230

CARAFE++: Unified Content-Aware ReAssembly of FEatures

no code implementations • 7 Dec 2020 • Jiaqi Wang, Kai Chen, Rui Xu, Ziwei Liu, Chen Change Loy, Dahua Lin

Feature reassembly, i.e., feature downsampling and upsampling, is a key operation in a number of modern convolutional network architectures, e.g., residual networks and feature pyramids.

Image Inpainting Instance Segmentation +3

Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup

1 code implementation • 17 Dec 2020 • Guodong Xu, Ziwei Liu, Chen Change Loy

Our goal is to achieve a performance comparable to conventional knowledge distillation with a lower computation cost during training.

Informativeness Knowledge Distillation +2

ShineOn: Illuminating Design Choices for Practical Video-based Virtual Clothing Try-on

1 code implementation • 18 Dec 2020 • Gaurav Kuppa, Andrew Jong, Vera Liu, Ziwei Liu, Teng-Sheng Moh

We build a series of scientific experiments to isolate effective design choices in video synthesis for virtual clothing try-on.

Neural Rendering Virtual Try-on

122

Chasing the Tail in Monocular 3D Human Reconstruction with Prototype Memory

no code implementations • 29 Dec 2020 • Yu Rong, Ziwei Liu, Chen Change Loy

The reason is that most of the current models perform regression based on a single human prototype, which is similar to common poses while far from the rare poses.

3D Human Reconstruction regression

BlockPlanner: City Block Generation With Vectorized Graph Representation

no code implementations • ICCV 2021 • Linning Xu, Yuanbo Xiangli, Anyi Rao, Nanxuan Zhao, Bo Dai, Ziwei Liu, Dahua Lin

City modeling is the foundation for computational urban planning, navigation, and entertainment.

valid

Differentiable Dynamic Wirings for Neural Networks

no code implementations • ICCV 2021 • Kun Yuan, Quanquan Li, Shaopeng Guo, Dapeng Chen, Aojun Zhou, Fengwei Yu, Ziwei Liu

A standard practice of deploying deep neural networks is to apply the same architecture to all the input instances.

object-detection Object Detection

DeeperForensics Challenge 2020 on Real-World Face Forgery Detection: Methods and Results

2 code implementations • 18 Feb 2021 • Liming Jiang, Zhengkui Guo, Wayne Wu, Zhaoyang Liu, Ziwei Liu, Chen Change Loy, Shuo Yang, Yuanjun Xiong, Wei Xia, Baoying Chen, Peiyu Zhuang, Sili Li, Shen Chen, Taiping Yao, Shouhong Ding, Jilin Li, Feiyue Huang, Liujuan Cao, Rongrong Ji, Changlei Lu, Ganchao Tan

This paper reports methods and results in the DeeperForensics Challenge 2020 on real-world face forgery detection.

valid

524

CelebA-Spoof Challenge 2020 on Face Anti-Spoofing: Methods and Results

1 code implementation • 25 Feb 2021 • Yuanhan Zhang, Zhenfei Yin, Jing Shao, Ziwei Liu, Shuo Yang, Yuanjun Xiong, Wei Xia, Yan Xu, Man Luo, Jian Liu, Jianshu Li, Zhijun Chen, Mingyu Guo, Hui Li, Junfu Liu, Pengfei Gao, Tianqi Hong, Hao Han, Shijie Liu, Xinhua Chen, Di Qiu, Cheng Zhen, Dashuang Liang, Yufeng Jin, Zhanlong Hao

It is the largest face anti-spoofing dataset in terms of the numbers of the data and the subjects.

Face Anti-Spoofing valid

512

Domain Generalization: A Survey

2 code implementations • 3 Mar 2021 • Kaiyang Zhou, Ziwei Liu, Yu Qiao, Tao Xiang, Chen Change Loy

Generalization to out-of-distribution (OOD) data is a capability natural to humans yet challenging for machines to reproduce.

Action Recognition Data Augmentation +8

1,082

ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

2 code implementations • CVPR 2021 • Yinan He, Bei Gan, Siyu Chen, Yichun Zhou, Guojun Yin, Luchuan Song, Lu Sheng, Jing Shao, Ziwei Liu

To counter this emerging threat, we construct the ForgeryNet dataset, an extremely large face forgery dataset with unified annotations in image- and video-level data across four tasks: 1) Image Forgery Classification, including two-way (real / fake), three-way (real / fake with identity-replaced forgery approaches / fake with identity-remained forgery approaches), and n-way (real and 15 respective forgery approaches) classification.

Benchmarking Classification +2

Incorporating Convolution Designs into Visual Transformers

3 code implementations • ICCV 2021 • Kun Yuan, Shaopeng Guo, Ziwei Liu, Aojun Zhou, Fengwei Yu, Wei Wu

Motivated by the success of Transformers in natural language processing (NLP) tasks, some attempts (e.g., ViT and DeiT) have emerged to apply Transformers to the vision domain.

Ranked #2 on Image Classification on Oxford-IIIT Pets

Image Classification

Adversarial Robustness under Long-Tailed Distribution

1 code implementation • CVPR 2021 • Tong Wu, Ziwei Liu, Qingqiu Huang, Yu Wang, Dahua Lin

We then perform a systematic study on existing long-tailed recognition methods in conjunction with the adversarial training framework.

Adversarial Robustness

Deep Animation Video Interpolation in the Wild

1 code implementation • CVPR 2021 • Li SiYao, Shiyu Zhao, Weijiang Yu, Wenxiu Sun, Dimitris N. Metaxas, Chen Change Loy, Ziwei Liu

In the animation industry, cartoon videos are usually produced at low frame rate since hand drawing of such frames is costly and time-consuming.

Optical Flow Estimation Video Frame Interpolation

385

Visually Informed Binaural Audio Generation without Binaural Audios

no code implementations • CVPR 2021 • Xudong Xu, Hang Zhou, Ziwei Liu, Bo Dai, Xiaogang Wang, Dahua Lin

Moreover, combined with binaural recordings, our method is able to further boost the performance of binaural audio generation under supervised settings.

Audio Generation

Variational Relational Point Completion Network

1 code implementation • CVPR 2021 • Liang Pan, Xinyi Chen, Zhongang Cai, Junzhe Zhang, Haiyu Zhao, Shuai Yi, Ziwei Liu

In particular, we propose a dual-path architecture to enable principled probabilistic modeling across partial and complete clouds.

Ranked #2 on Point Cloud Completion on Completion3D

Point Cloud Completion

153

Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation

1 code implementation • CVPR 2021 • Hang Zhou, Yasheng Sun, Wayne Wu, Chen Change Loy, Xiaogang Wang, Ziwei Liu

While speech content information can be defined by learning the intrinsic synchronization between audio-visual modalities, we identify that a pose code will be complementarily learned in a modulated convolution-based reconstruction framework.

Talking Face Generation

904

Iterative Human and Automated Identification of Wildlife Images

1 code implementation • 5 May 2021 • Zhongqi Miao, Ziwei Liu, Kaitlyn M. Gaynor, Meredith S. Palmer, Stella X. Yu, Wayne M. Getz

Camera trapping is increasingly used to monitor wildlife, but this technology typically requires extensive data annotation.

Semi-Supervised Domain Generalization with Stochastic StyleMatch

2 code implementations • 1 Jun 2021 • Kaiyang Zhou, Chen Change Loy, Ziwei Liu

We find that the DG methods, which by design are unable to handle unlabeled data, perform poorly with limited labels in SSDG; the SSL methods, especially FixMatch, obtain much better results but are still far away from the basic vanilla model trained using full labels.

Domain Generalization Semi-Supervised Domain Generalization

1,082

Robust Reference-based Super-Resolution via C2-Matching

1 code implementation • CVPR 2021 • Yuming Jiang, Kelvin C. K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu

However, performing local transfer is difficult because of two gaps between input and reference images: the transformation gap (e.g., scale and rotation) and the resolution gap (e.g., HR and LR).

Reference-based Super-Resolution

192

Delving Deep into the Generalization of Vision Transformers under Distribution Shifts

1 code implementation • CVPR 2022 • Chongzhi Zhang, Mingyuan Zhang, Shanghang Zhang, Daisheng Jin, Qiang Zhou, Zhongang Cai, Haiyu Zhao, Xianglong Liu, Ziwei Liu

By comprehensively investigating these GE-ViTs and comparing with their corresponding CNN models, we observe: 1) For the enhanced model, larger ViTs still benefit more for the OOD generalization.

Out-of-Distribution Generalization Self-Supervised Learning

Unsupervised Object-Level Representation Learning from Scene Images

1 code implementation • NeurIPS 2021 • Jiahao Xie, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy

Extensive experiments on COCO show that ORL significantly improves the performance of self-supervised learning on scene images, even surpassing supervised ImageNet pre-training on several downstream tasks.

Object Representation Learning +2

Unsupervised Domain Adaptive 3D Detection with Multi-Level Consistency

1 code implementation • ICCV 2021 • Zhipeng Luo, Zhongang Cai, Changqing Zhou, Gongjie Zhang, Haiyu Zhao, Shuai Yi, Shijian Lu, Hongsheng Li, Shanghang Zhang, Ziwei Liu

In addition, existing 3D domain adaptive detection methods often assume prior access to the target domain annotations, which is rarely feasible in the real world.

3D Object Detection Autonomous Driving +1

Energy-Based Open-World Uncertainty Modeling for Confidence Calibration

no code implementations • ICCV 2021 • Yezhen Wang, Bo Li, Tong Che, Kaiyang Zhou, Ziwei Liu, Dongsheng Li

Confidence calibration is of great importance to the reliability of decisions made by machine learning systems.

Semantically Coherent Out-of-Distribution Detection

2 code implementations • ICCV 2021 • Jingkang Yang, Haoqi Wang, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang, Ziwei Liu

The proposed UDG can not only enrich the semantic knowledge of the model by exploiting unlabeled data in an unsupervised manner, but also distinguish ID/OOD samples to enhance ID classification and OOD detection tasks simultaneously.

Out-of-Distribution Detection Out of Distribution (OOD) Detection

Learning to Prompt for Vision-Language Models

13 code implementations • 2 Sep 2021 • Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu

Large pre-trained vision-language models like CLIP have shown great potential in learning representations that are transferable across a wide range of downstream tasks.

Ranked #2 on Few-shot Age Estimation on MORPH Album2

Domain Generalization Few-shot Age Estimation +2

1,481

Talk-to-Edit: Fine-Grained Facial Editing via Dialog

1 code implementation • ICCV 2021 • Yuming Jiang, Ziqi Huang, Xingang Pan, Chen Change Loy, Ziwei Liu

In this work, we propose Talk-to-Edit, an interactive facial editing framework that performs fine-grained attribute manipulation through dialog between the user and the system.

Ranked #1 on Fine-Grained Facial Editing on CelebA-Dialog

Attribute Facial Editing +1

302

Bayesian Imbalanced Regression Debiasing

no code implementations • 29 Sep 2021 • Jiawei Ren, Mingyuan Zhang, Cunjun Yu, Ziwei Liu

Compared to imbalanced and long-tailed classification, imbalanced regression has its unique challenges as the regression label space can be continuous, boundless, and high-dimensional.

Age Estimation imbalanced classification +2

A Comprehensive Overhaul of Distilling Unconditional GANs

no code implementations • 29 Sep 2021 • Guodong Xu, Yuenan Hou, Ziwei Liu, Chen Change Loy

To further enhance the semantic consistency between the teacher and student model, we present another latent-direction-based distillation loss that preserves the semantic relations in latent space.

Knowledge Distillation

TAda! Temporally-Adaptive Convolutions for Video Understanding

2 code implementations • ICLR 2022 • Ziyuan Huang, Shiwei Zhang, Liang Pan, Zhiwu Qing, Mingqian Tang, Ziwei Liu, Marcelo H. Ang Jr

This work presents Temporally-Adaptive Convolutions (TAdaConv) for video understanding, which shows that adaptive weight calibration along the temporal dimension is an efficient way to facilitate modelling complex temporal dynamics in videos.

Ranked #67 on Action Recognition on Something-Something V2 (using extra training data)

Action Classification Action Recognition +2

215

Playing for 3D Human Recovery

no code implementations • 14 Oct 2021 • Zhongang Cai, Mingyuan Zhang, Jiawei Ren, Chen Wei, Daxuan Ren, Zhengyu Lin, Haiyu Zhao, Lei Yang, Chen Change Loy, Ziwei Liu

Specifically, we contribute GTA-Human, a large-scale 3D human dataset generated with the GTA-V game engine, featuring a highly diverse set of subjects, actions, and scenarios.

Paper
Add Code

Generalized Out-of-Distribution Detection: A Survey

3 code implementations • 21 Oct 2021 • Jingkang Yang, Kaiyang Zhou, Yixuan Li, Ziwei Liu

In this survey, we first present a unified framework called generalized OOD detection, which encompasses the five aforementioned problems, i.e., AD, ND, OSR, OOD detection, and OD.

Anomaly Detection Autonomous Driving +5

749

Paper
Code

Monocular 3D Reconstruction of Interacting Hands via Collision-Aware Factorized Refinements

no code implementations • 1 Nov 2021 • Yu Rong, Jingbo Wang, Ziwei Liu, Chen Change Loy

In this paper, we make the first attempt to reconstruct 3D interacting hands from monocular single RGB images.

3D Reconstruction

Paper
Add Code

Few-Shot Object Detection via Association and DIscrimination

1 code implementation • NeurIPS 2021 • Yuhang Cao, Jiaqi Wang, Ying Jin, Tong Wu, Kai Chen, Ziwei Liu, Dahua Lin

1) In the association step, in contrast to implicitly leveraging multiple base classes, we construct a compact novel class feature space via explicitly imitating a specific base class feature space.

Few-Shot Object Detection Object +3

Paper
Code

Lifting 2D Human Pose to 3D with Domain Adapted 3D Body Concept

no code implementations • 23 Nov 2021 • Qiang Nie, Ziwei Liu, Yunhui Liu

Inspired by this, we propose a new framework that leverages the labeled 3D human poses to learn a 3D concept of the human body to reduce the ambiguity.

3D Pose Estimation Domain Adaptation

Paper
Add Code

Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion

1 code implementation • 24 Nov 2021 • Tong Wu, Liang Pan, Junzhe Zhang, Tai Wang, Ziwei Liu, Dahua Lin

We adopt DCD to evaluate the point cloud completion task, where experimental results show that DCD pays attention to both the overall structure and local geometric details and provides a more reliable evaluation even when CD and EMD contradict each other.

Point Cloud Completion

133

Paper
Code

Robust Partial-to-Partial Point Cloud Registration in a Full Range

1 code implementation • 30 Nov 2021 • Liang Pan, Zhongang Cai, Ziwei Liu

3) Based on a synergy of hierarchical graph networks and graphical modeling, we propose the Hierarchical Graphical Modeling (HGM) architecture to encode robust descriptors consisting of i) a unary term learned from RI features; and ii) multiple smoothness terms encoded from neighboring point relations at different scales through our TPT modules.

Graph Matching Point Cloud Registration

Paper
Code

Balanced Chamfer Distance as a Comprehensive Metric for Point Cloud Completion

1 code implementation • NeurIPS 2021 • Tong Wu, Liang Pan, Junzhe Zhang, Tai Wang, Ziwei Liu, Dahua Lin

Point Cloud Completion

133

Paper
Code

Garment4D: Garment Reconstruction from Point Cloud Sequences

1 code implementation • NeurIPS 2021 • Fangzhou Hong, Liang Pan, Zhongang Cai, Ziwei Liu

The main challenges are two-fold: 1) effective 3D feature learning for fine details, and 2) capture of garment dynamics caused by the interaction between garments and the human body, especially for loose garments like skirts.

Garment Reconstruction

128

Paper
Code

ForgeryNet -- Face Forgery Analysis Challenge 2021: Methods and Results

no code implementations • 15 Dec 2021 • Yinan He, Lu Sheng, Jing Shao, Ziwei Liu, Zhaofan Zou, Zhizhi Guo, Shan Jiang, Curitis Sun, Guosheng Zhang, Keyao Wang, Haixiao Yue, Zhibin Hong, Wanguo Wang, Zhenyu Li, Qi Wang, Zhenli Wang, Ronghao Xu, Mingwen Zhang, Zhiheng Wang, Zhenhang Huang, Tianming Zhang, Ningning Zhao

The rapid progress of photorealistic synthesis techniques has reached a critical point where the boundary between real and manipulated images starts to blur.

valid

Paper
Add Code

Multi-View Partial (MVP) Point Cloud Challenge 2021 on Completion and Registration: Methods and Results

2 code implementations • 22 Dec 2021 • Liang Pan, Tong Wu, Zhongang Cai, Ziwei Liu, Xumin Yu, Yongming Rao, Jiwen Lu, Jie Zhou, Mingye Xu, Xiaoyuan Luo, Kexue Fu, Peng Gao, Manning Wang, Yali Wang, Yu Qiao, Junsheng Zhou, Xin Wen, Peng Xiang, Yu-Shen Liu, Zhizhong Han, Yuanjie Yan, Junyi An, Lifa Zhu, Changwei Lin, Dongrui Liu, Xin Li, Francisco Gómez-Fernández, Qinlong Wang, Yang Yang

Based on the MVP dataset, this paper reports methods and results in the Multi-View Partial Point Cloud Challenge 2021 on Completion and Registration.

3D Reconstruction Point Cloud Completion +2

153

Paper
Code

Full-Range Virtual Try-On With Recurrent Tri-Level Transform

no code implementations • CVPR 2022 • Han Yang, Xinrui Yu, Ziwei Liu

Virtual try-on aims to transfer a target clothing image onto a reference person.

Ranked #4 on Virtual Try-on on VITON

Virtual Try-on

Paper
Add Code

Benchmarking and Analyzing Point Cloud Classification under Corruptions

4 code implementations • 7 Feb 2022 • Jiawei Ren, Liang Pan, Ziwei Liu

3D perception, especially point cloud classification, has achieved substantial progress.

Ranked #7 on Point Cloud Classification on PointCloud-C

Benchmarking Classification +1

162

Paper
Code

Visual Sound Localization in the Wild by Cross-Modal Interference Erasing

1 code implementation • 13 Feb 2022 • Xian Liu, Rui Qian, Hang Zhou, Di Hu, Weiyao Lin, Ziwei Liu, Bolei Zhou, Xiaowei Zhou

Specifically, we observe that the previous practice of learning only a single audio representation is insufficient due to the additive nature of audio signals.

Paper
Code

TCTrack: Temporal Contexts for Aerial Tracking

1 code implementation • CVPR 2022 • Ziang Cao, Ziyuan Huang, Liang Pan, Shiwei Zhang, Ziwei Liu, Changhong Fu

Temporal contexts among consecutive frames are far from being fully utilized in existing visual trackers.

154

Paper
Code

Conditional Prompt Learning for Vision-Language Models

9 code implementations • CVPR 2022 • Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu

With the rise of powerful pre-trained vision-language models like CLIP, it becomes essential to investigate ways to adapt these models to downstream datasets.

Ranked #3 on Prompt Engineering on ImageNet V2

Domain Generalization Prompt Engineering

1,481

Paper
Code

BiBERT: Accurate Fully Binarized BERT

1 code implementation • ICLR 2022 • Haotong Qin, Yifu Ding, Mingyuan Zhang, Qinghua Yan, Aishan Liu, Qingqing Dang, Ziwei Liu, Xianglong Liu

The large pre-trained BERT has achieved remarkable performance on Natural Language Processing (NLP) tasks but is also computation and memory expensive.

Binarization

Paper
Code

LiDAR-based 4D Panoptic Segmentation via Dynamic Shifting Network

1 code implementation • 14 Mar 2022 • Fangzhou Hong, Hui Zhou, Xinge Zhu, Hongsheng Li, Ziwei Liu

In this work, we address the task of LiDAR-based panoptic segmentation, which aims to parse both objects and scenes in a unified manner.

4D Panoptic Segmentation Autonomous Driving +3

230

Paper
Code

Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy

2 code implementations • 15 Mar 2022 • Yuanhan Zhang, Qinghong Sun, Yichun Zhou, Zexin He, Zhenfei Yin, Kun Wang, Lu Sheng, Yu Qiao, Jing Shao, Ziwei Liu

This work thus proposes a novel active learning framework for realistic dataset annotation.

Ranked #1 on Image Classification on Food-101 (using extra training data)

Active Learning Classification +3

161

Paper
Code

X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation

no code implementations • 16 Mar 2022 • Yinan He, Gengshi Huang, Siyu Chen, Jianing Teng, Wang Kun, Zhenfei Yin, Lu Sheng, Ziwei Liu, Yu Qiao, Jing Shao

2) Squeeze Stage: X-Learner condenses the model to a reasonable size and learns a universal and generalizable representation for transfer to various tasks.

object-detection Object Detection +3

Paper
Add Code

Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory

1 code implementation • CVPR 2022 • Li SiYao, Weijiang Yu, Tianpei Gu, Chunze Lin, Quan Wang, Chen Qian, Chen Change Loy, Ziwei Liu

With the learned choreographic memory, dance generation is realized on the quantized units that meet high choreography standards, such that the generated dancing sequences are confined within the spatial constraints.

Ranked #1 on Motion Synthesis on AIST++

Motion Synthesis

366

Paper
Code

Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer

1 code implementation • CVPR 2022 • Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy

Recent studies on StyleGAN show high performance on artistic portrait generation by transfer learning with limited data.

Style Transfer Transfer Learning +1

1,573

Paper
Code

SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance

no code implementations • 25 Mar 2022 • Xinchi Zhou, Dongzhan Zhou, Wanli Ouyang, Hang Zhou, Ziwei Liu, Di Hu

Recent years have witnessed the success of deep learning on the visual sound separation task.

Paper
Add Code

Versatile Multi-Modal Pre-Training for Human-Centric Perception

1 code implementation • CVPR 2022 • Fangzhou Hong, Liang Pan, Zhongang Cai, Ziwei Liu

To tackle the challenges, we design the novel Dense Intra-sample Contrastive Learning and Sparse Structure-aware Contrastive Learning targets by hierarchically learning a modal-invariant latent space featured with continuous and ordinal feature distribution and structure-aware semantic consistency.

Contrastive Learning Human Parsing +1

115

Paper
Code

Balanced MSE for Imbalanced Visual Regression

1 code implementation • CVPR 2022 • Jiawei Ren, Mingyuan Zhang, Cunjun Yu, Ziwei Liu

Data imbalance exists ubiquitously in real-world visual regressions, e.g., age estimation and pose estimation, hurting the model's generalizability and fairness.

Age Estimation Fairness +3

351

Paper
Code

Unsupervised Image-to-Image Translation with Generative Prior

1 code implementation • CVPR 2022 • Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy

In this work, we present a novel framework, Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT), to improve the overall quality and applicability of the translation algorithm.

Translation Unsupervised Image-To-Image Translation

181

Paper
Code

Full-Spectrum Out-of-Distribution Detection

1 code implementation • 11 Apr 2022 • Jingkang Yang, Kaiyang Zhou, Ziwei Liu

In this paper, we take into account both shift types and introduce full-spectrum OOD (FS-OOD) detection, a more realistic problem setting that considers both detecting semantic shift and being tolerant to covariate shift, and design three benchmarks.

Out-of-Distribution Detection Out of Distribution (OOD) Detection

749

Paper
Code

Few-shot Forgery Detection via Guided Adversarial Interpolation

no code implementations • 12 Apr 2022 • Haonan Qiu, Siyu Chen, Bei Gan, Kun Wang, Huafeng Shi, Jing Shao, Ziwei Liu

Notably, our method is also validated to be robust to choices of majority and minority forgery approaches.

Paper
Add Code

StyleGAN-Human: A Data-Centric Odyssey of Human Generation

4 code implementations • 25 Apr 2022 • Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Chen Qian, Chen Change Loy, Wayne Wu, Ziwei Liu

In addition, a model zoo and human editing applications are demonstrated to facilitate future research in the community.

Image Generation

1,096

Paper
Code

Robust Face Anti-Spoofing with Dual Probabilistic Modeling

no code implementations • 27 Apr 2022 • Yuanhan Zhang, Yichao Wu, Zhenfei Yin, Jing Shao, Ziwei Liu

In this work, we attempt to fill this gap by automatically addressing the noise problem from both label and data perspectives in a probabilistic manner.

Face Anti-Spoofing

Paper
Add Code

HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling

no code implementations • 28 Apr 2022 • Zhongang Cai, Daxuan Ren, Ailing Zeng, Zhengyu Lin, Tao Yu, Wenjia Wang, Xiangyu Fan, Yang Gao, Yifan Yu, Liang Pan, Fangzhou Hong, Mingyuan Zhang, Chen Change Loy, Lei Yang, Ziwei Liu

4D human sensing and modeling are fundamental tasks in vision and graphics with numerous applications.

Fine-grained Action Recognition Pose Estimation

Paper
Add Code

AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars

1 code implementation • 17 May 2022 • Fangzhou Hong, Mingyuan Zhang, Liang Pan, Zhongang Cai, Lei Yang, Ziwei Liu

Our key insight is to take advantage of the powerful vision-language model CLIP for supervising neural human generation, in terms of 3D geometry, texture and animation.

Language Modelling Motion Synthesis +1

1,039

Paper
Code

Free Lunch for Surgical Video Understanding by Distilling Self-Supervisions

1 code implementation • 19 May 2022 • Xinpeng Ding, Ziwei Liu, Xiaomeng Li

Our key insight is to distill knowledge from publicly available models trained on large generic datasets to facilitate the self-supervised learning of surgical videos.

Contrastive Learning Self-Supervised Learning +2

Paper
Code

Text2Human: Text-Driven Controllable Human Image Generation

2 code implementations • 31 May 2022 • Yuming Jiang, Shuai Yang, Haonan Qiu, Wayne Wu, Chen Change Loy, Ziwei Liu

In this work, we present a text-driven controllable framework, Text2Human, for high-quality and diverse human generation.

Human Parsing Image Generation

804

Paper
Code

Sparse Mixture-of-Experts are Domain Generalizable Learners

1 code implementation • 8 Jun 2022 • Bo Li, Yifei Shen, Jingkang Yang, Yezhen Wang, Jiawei Ren, Tong Che, Jun Zhang, Ziwei Liu

It is motivated by an empirical finding that transformer-based models trained with empirical risk minimization (ERM) outperform CNN-based models employing state-of-the-art (SOTA) DG algorithms on multiple DG datasets.

Ranked #11 on Domain Generalization on DomainNet (using extra training data)

Domain Generalization Object Recognition

279

Paper
Code

Neural Prompt Search

1 code implementation • 9 Jun 2022 • Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu

The size of vision models has grown exponentially over the last few years, especially after the emergence of Vision Transformer.

Ranked #1 on Image Classification on OmniBenchmark (using extra training data)

Few-Shot Learning Image Classification +3

203

Paper
Code

Masked Frequency Modeling for Self-Supervised Visual Pre-Training

3 code implementations • 15 Jun 2022 • Jiahao Xie, Wei Li, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy

We present Masked Frequency Modeling (MFM), a unified frequency-domain-based approach for self-supervised pre-training of visual models.

Image Classification Image Restoration +2

Paper
Code

LaserMix for Semi-Supervised LiDAR Semantic Segmentation

2 code implementations • CVPR 2023 • Lingdong Kong, Jiawei Ren, Liang Pan, Ziwei Liu

Densely annotating LiDAR point clouds is costly, which restrains the scalability of fully-supervised learning methods.

Ranked #1 on Semi-Supervised Semantic Segmentation on ScribbleKITTI

LIDAR Semantic Segmentation Segmentation +1

255

Paper
Code

Detecting and Recovering Sequential DeepFake Manipulation

1 code implementation • 5 Jul 2022 • Rui Shao, Tianxing Wu, Ziwei Liu

Moreover, we build a comprehensive benchmark and set up rigorous evaluation protocols and metrics for this new research problem.

DeepFake Detection Face Swapping +2

118

Paper
Code

Fast-Vid2Vid: Spatial-Temporal Compression for Video-to-Video Synthesis

1 code implementation • 11 Jul 2022 • Long Zhuo, Guangcong Wang, Shikai Li, Wayne Wu, Ziwei Liu

In this paper, we present a spatial-temporal compression framework, Fast-Vid2Vid, which focuses on data aspects of generative models.

Knowledge Distillation Motion Compensation +1

150

Paper
Code

Benchmarking Omni-Vision Representation through the Lens of Visual Realms

1 code implementation • 14 Jul 2022 • Yuanhan Zhang, Zhenfei Yin, Jing Shao, Ziwei Liu

We benchmark ReCo and other advances in omni-vision representation studies that are different in architectures (from CNNs to transformers) and in learning paradigms (from supervised learning to self-supervised learning) on OmniBenchmark.

Benchmarking Contrastive Learning +2

105

Paper
Code

Relighting4D: Neural Relightable Human from Videos

1 code implementation • 14 Jul 2022 • Zhaoxi Chen, Ziwei Liu

Our key insight is that the space-time varying geometry and reflectance of the human body can be decomposed as a set of neural fields of normal, occlusion, diffuse, and specular maps.

254

Paper
Code

UNIF: United Neural Implicit Functions for Clothed Human Reconstruction and Animation

1 code implementation • 20 Jul 2022 • Shenhan Qian, Jiale Xu, Ziwei Liu, Liqian Ma, Shenghua Gao

We propose united implicit functions (UNIF), a part-based method for clothed human reconstruction and animation with raw scans and skeletons as the input.

Position

Paper
Code

Panoptic Scene Graph Generation

1 code implementation • 22 Jul 2022 • Jingkang Yang, Yi Zhe Ang, Zujin Guo, Kaiyang Zhou, Wayne Zhang, Ziwei Liu

Existing research addresses scene graph generation (SGG) -- a critical technology for scene understanding in images -- from a detection perspective, i.e., objects are detected using bounding boxes followed by prediction of their pairwise relationships.

Ranked #5 on Panoptic Scene Graph Generation on PSG Dataset

Benchmarking Panoptic Scene Graph Generation +1

387

Paper
Code

CelebV-HQ: A Large-Scale Video Facial Attributes Dataset

1 code implementation • 25 Jul 2022 • Hao Zhu, Wayne Wu, Wentao Zhu, Liming Jiang, Siwei Tang, Li Zhang, Ziwei Liu, Chen Change Loy

Large-scale datasets have played indispensable roles in the recent success of face generation/editing and significantly facilitated the advances of emerging research fields.

Ranked #1 on Unconditional Video Generation on CelebV-HQ

Attribute Face Generation +1

352

Paper
Code

Multi-Forgery Detection Challenge 2022: Push the Frontier of Unconstrained and Diverse Forgery Detection

no code implementations • 27 Jul 2022 • Jianshu Li, Man Luo, Jian Liu, Tao Chen, Chengjie Wang, Ziwei Liu, Shuo Liu, Kewei Yang, Xuning Shao, Kang Chen, Boyuan Liu, Mingyu Guo, Ying Guo, Yingying Ao, Pengfei Gao

In this paper, we present the solutions from the Top 3 teams, in order to boost the research work in the field of image forgery detection.

Image Forgery Detection Image Generation +1

Paper
Add Code

StyleLight: HDR Panorama Generation for Lighting Estimation and Editing

1 code implementation • 29 Jul 2022 • Guangcong Wang, Yinuo Yang, Chen Change Loy, Ziwei Liu

To tackle this problem, we propose a coupled dual-StyleGAN panorama synthesis network (StyleLight) that integrates LDR and HDR panorama synthesis into a unified framework.

Lighting Estimation

113

Paper
Code

Exploring Point-BEV Fusion for 3D Point Cloud Object Tracking with Transformer

1 code implementation • 10 Aug 2022 • Zhipeng Luo, Changqing Zhou, Liang Pan, Gongjie Zhang, Tianrui Liu, Yueru Luo, Haiyu Zhao, Ziwei Liu, Shijian Lu

In a point cloud sequence, 3D object tracking aims to predict the location and orientation of an object in consecutive frames given an object template.

3D Object Tracking Autonomous Driving +3

116

Paper
Code

StyleFaceV: Face Video Generation via Decomposing and Recomposing Pretrained StyleGAN3

1 code implementation • 16 Aug 2022 • Haonan Qiu, Yuming Jiang, Hang Zhou, Wayne Wu, Ziwei Liu

Notably, StyleFaceV is capable of generating realistic 1024×1024 face videos even without high-resolution training videos.

Image Generation Video Generation

131

Paper
Code

Open Long-Tailed Recognition in a Dynamic World

no code implementations • 17 Aug 2022 • Ziwei Liu, Zhongqi Miao, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu

A practical recognition system must balance between majority (head) and minority (tail) classes, generalize across the distribution, and acknowledge novelty upon the instances of unseen classes (open classes).

Active Learning Classification +4

Paper
Add Code

Mind the Gap in Distilling StyleGANs

1 code implementation • 18 Aug 2022 • Guodong Xu, Yuenan Hou, Ziwei Liu, Chen Change Loy

To further enhance the semantic consistency between the teacher and student model, we present a latent-direction-based distillation loss that preserves the semantic relations in latent space.

Knowledge Distillation

Paper
Code

Voxurf: Voxel-based Efficient and Accurate Neural Surface Reconstruction

1 code implementation • 26 Aug 2022 • Tong Wu, Jiaqi Wang, Xingang Pan, Xudong Xu, Christian Theobalt, Ziwei Liu, Dahua Lin

Previous methods based on neural volume rendering mostly train a fully implicit model with MLPs, which typically require hours of training for a single scene.

Surface Reconstruction

399

Paper
Code

MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model

2 code implementations • 31 Aug 2022 • Mingyuan Zhang, Zhongang Cai, Liang Pan, Fangzhou Hong, Xinying Guo, Lei Yang, Ziwei Liu

Instead of a deterministic language-motion mapping, MotionDiffuse generates motions through a series of denoising steps in which variations are injected.

Ranked #17 on Motion Synthesis on KIT Motion-Language

Denoising Motion Synthesis

777

Paper
Code

On-Device Domain Generalization

2 code implementations • 15 Sep 2022 • Kaiyang Zhou, Yuanhan Zhang, Yuhang Zang, Jingkang Yang, Chen Change Loy, Ziwei Liu

Another interesting observation is that the teacher-student gap on out-of-distribution data is bigger than that on in-distribution data, which highlights the capacity mismatch issue as well as the shortcoming of KD.

Data Augmentation Domain Generalization +2

256

Paper
Code

Text2Light: Zero-Shot Text-Driven HDR Panorama Generation

1 code implementation • 20 Sep 2022 • Zhaoxi Chen, Guangcong Wang, Ziwei Liu

To achieve super-resolution inverse tone mapping, we derive a continuous representation of 360-degree imaging from the LDR panorama as a set of structured latent codes anchored to the sphere.

4k inverse tone mapping +3

541

Paper
Code

Benchmarking and Analyzing 3D Human Pose and Shape Estimation Beyond Algorithms

1 code implementation • 21 Sep 2022 • Hui En Pang, Zhongang Cai, Lei Yang, Tianwei Zhang, Ziwei Liu

Experiments with 10 backbones, ranging from CNNs to transformers, show the knowledge learnt from a proximity task is readily transferable to human mesh recovery.

3D human pose and shape estimation Benchmarking +1

112

Paper
Code

VToonify: Controllable High-Resolution Portrait Video Style Transfer

1 code implementation • 22 Sep 2022 • Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy

Although a series of successful portrait image toonification models built upon the powerful StyleGAN have been proposed, these image-oriented methods have obvious limitations when applied to videos, such as the fixed frame size, the requirement of face alignment, missing non-facial details and temporal inconsistency.

Face Alignment Style Transfer +2

3,463

Paper
Code

StyleSwap: Style-Based Generator Empowers Robust Face Swapping

no code implementations • 27 Sep 2022 • Zhiliang Xu, Hang Zhou, Zhibin Hong, Ziwei Liu, Jiaming Liu, Zhizhi Guo, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang

Our core idea is to leverage a style-based generator to empower high-fidelity and robust face swapping, thus the generator's advantage can be adopted for optimizing identity similarity.

Face Swapping

Paper
Add Code

TripleE: Easy Domain Generalization via Episodic Replay

1 code implementation • 4 Oct 2022 • Xiaomeng Li, Hongyu Ren, Huifeng Yao, Ziwei Liu

In this paper, we propose TripleE, and the main idea is to encourage the network to focus on training on subsets (learning with replay) and enlarge the data space in learning on subsets.

Domain Generalization

Paper
Code

EVA3D: Compositional 3D Human Generation from 2D Image Collections

1 code implementation • 10 Oct 2022 • Fangzhou Hong, Zhaoxi Chen, Yushi Lan, Liang Pan, Ziwei Liu

At the core of EVA3D is a compositional human NeRF representation, which divides the human body into local parts.

567

Paper
Code

OpenOOD: Benchmarking Generalized Out-of-Distribution Detection

3 code implementations • 13 Oct 2022 • Jingkang Yang, Pengyun Wang, Dejian Zou, Zitang Zhou, Kunyuan Ding, Wenxuan Peng, Haoqi Wang, Guangyao Chen, Bo Li, Yiyou Sun, Xuefeng Du, Kaiyang Zhou, Wayne Zhang, Dan Hendrycks, Yixuan Li, Ziwei Liu

Out-of-distribution (OOD) detection is vital to safety-critical machine learning applications and has thus been extensively studied, with a plethora of methods developed in the literature.

Anomaly Detection Benchmarking +3

749

Paper
Code

Joint Communication and Computation Design in Transmissive RMS Transceiver Enabled Multi-Tier Computing Networks

no code implementations • 27 Oct 2022 • Zhendong Li, Wen Chen, Ziwei Liu, Hongying Tang, Jianmin Lu

We formulate a total energy consumption minimization problem by a joint optimization of subcarrier allocation, task input bits, time slot allocation, transmit power allocation and RMS transmissive coefficient while taking into account the constraints of communication resources and computing resources.

Total Energy

Paper
Add Code

AnimeRun: 2D Animation Visual Correspondence from Open Source 3D Movies

1 code implementation • 10 Nov 2022 • Li SiYao, Yuhang Li, Bo Li, Chao Dong, Ziwei Liu, Chen Change Loy

Existing correspondence datasets for two-dimensional (2D) cartoon suffer from simple frame composition and monotonic movements, making them insufficient to simulate real animations.

Optical Flow Estimation

Paper
Code

Audio-Driven Co-Speech Gesture Video Generation

no code implementations • 5 Dec 2022 • Xian Liu, Qianyi Wu, Hang Zhou, Yuanqi Du, Wayne Wu, Dahua Lin, Ziwei Liu

Our key insight is that the co-speech gestures can be decomposed into common motion patterns and subtle rhythmic dynamics.

Video Generation

Paper
Add Code

Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers

no code implementations • 9 Dec 2022 • Yasheng Sun, Hang Zhou, Kaisiyuan Wang, Qianyi Wu, Zhibin Hong, Jingtuo Liu, Errui Ding, Jingdong Wang, Ziwei Liu, Hideki Koike

This requires masking a large percentage of the original image and seamlessly inpainting it with the aid of audio and reference frames.

Paper
Add Code

Reference-based Image and Video Super-Resolution via C2-Matching

1 code implementation • 19 Dec 2022 • Yuming Jiang, Kelvin C. K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu

To tackle these challenges, we propose C2-Matching in this work, which performs explicit robust matching crossing transformation and resolution.

Image Super-Resolution Reference-based Super-Resolution +2

192

Paper
Code

DeformToon3D: Deformable Neural Radiance Fields for 3D Toonification

no code implementations • ICCV 2023 • Junzhe Zhang, Yushi Lan, Shuai Yang, Fangzhou Hong, Quan Wang, Chai Kiat Yeo, Ziwei Liu, Chen Change Loy

In this paper, we address the challenging problem of 3D toonification, which involves transferring the style of an artistic domain onto a target 3D face with stylized geometry and texture.

Paper
Add Code

F2-NeRF: Fast Neural Radiance Field Training With Free Camera Trajectories

no code implementations • CVPR 2023 • Peng Wang, YuAn Liu, Zhaoxi Chen, Lingjie Liu, Ziwei Liu, Taku Komura, Christian Theobalt, Wenping Wang

Existing fast grid-based NeRF training frameworks, like Instant-NGP, Plenoxels, DVGO, or TensoRF, are mainly designed for bounded scenes and rely on space warping to handle unbounded scenes.

Novel View Synthesis

Paper
Add Code

OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation

1 code implementation • CVPR 2023 • Tong Wu, Jiarui Zhang, Xiao Fu, Yuxin Wang, Jiawei Ren, Liang Pan, Wayne Wu, Lei Yang, Jiaqi Wang, Chen Qian, Dahua Lin, Ziwei Liu

Recent advances in modeling 3D objects mostly rely on synthetic datasets due to the lack of large-scale real-scanned 3D databases.

Novel View Synthesis Object +1

420

Paper
Code

BiBench: Benchmarking and Analyzing Network Binarization

1 code implementation • 26 Jan 2023 • Haotong Qin, Mingyuan Zhang, Yifu Ding, Aoyu Li, Zhongang Cai, Ziwei Liu, Fisher Yu, Xianglong Liu

Network binarization emerges as one of the most promising compression approaches offering extraordinary computation and memory savings by minimizing the bit-width.

Benchmarking Binarization

Paper
Code

What Makes Good Examples for Visual In-Context Learning?

1 code implementation • NeurIPS 2023 • Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu

To overcome the problem, we propose a prompt retrieval framework to automate the selection of in-context examples.

In-Context Learning Retrieval

155

Paper
Code

SceneDreamer: Unbounded 3D Scene Generation from 2D Image Collections

1 code implementation • 2 Feb 2023 • Zhaoxi Chen, Guangcong Wang, Ziwei Liu

Our approach begins with an efficient bird's-eye-view (BEV) representation generated from simplex noise, which includes a height field for surface elevation and a semantic field for detailed scene semantics.

Ranked #3 on Scene Generation on GoogleEarth

Scene Generation

568

Paper
Code

Deep Class-Incremental Learning: A Survey

3 code implementations • 7 Feb 2023 • Da-Wei Zhou, Qi-Wei Wang, Zhi-Hong Qi, Han-Jia Ye, De-Chuan Zhan, Ziwei Liu

Deep models, e.g., CNNs and Vision Transformers, have achieved impressive results in many vision tasks in the closed world.

Class Incremental Learning Image Classification +1

690

Paper
Code

Make Your Brief Stroke Real and Stereoscopic: 3D-Aware Simplified Sketch to Portrait Generation

no code implementations • 14 Feb 2023 • Yasheng Sun, Qianyi Wu, Hang Zhou, Kaisiyuan Wang, Tianshu Hu, Chen-Chieh Liao, Shio Miyafuji, Ziwei Liu, Hideki Koike

Creating photo-realistic versions of people's sketched portraits is useful for various entertainment purposes.

Paper
Add Code

A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT

no code implementations • 18 Feb 2023 • Ce Zhou, Qian Li, Chen Li, Jun Yu, Yixin Liu, Guangjing Wang, Kai Zhang, Cheng Ji, Qiben Yan, Lifang He, Hao Peng, JianXin Li, Jia Wu, Ziwei Liu, Pengtao Xie, Caiming Xiong, Jian Pei, Philip S. Yu, Lichao Sun

This study provides a comprehensive review of recent research advancements, challenges, and opportunities for PFMs in text, image, graph, as well as other data modalities.

Graph Learning Language Modelling +1

Paper
Add Code

Rethinking Range View Representation for LiDAR Segmentation

no code implementations • ICCV 2023 • Lingdong Kong, Youquan Liu, Runnan Chen, Yuexin Ma, Xinge Zhu, Yikang Li, Yuenan Hou, Yu Qiao, Ziwei Liu

We show that, for the first time, a range view method is able to surpass the point, voxel, and multi-view fusion counterparts in the competing LiDAR semantic and panoptic segmentation benchmarks, i.e., SemanticKITTI, nuScenes, and ScribbleKITTI.

Ranked #4 on 3D Semantic Segmentation on SemanticKITTI

3D Semantic Segmentation Autonomous Driving +4

Paper
Add Code

StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces

1 code implementation • ICCV 2023 • Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy

Recent advances in face manipulation using StyleGAN have produced impressive results.

Attribute Super-Resolution

468

Paper
Code

Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need

2 code implementations • 13 Mar 2023 • Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan, Ziwei Liu

ADAM is a general framework that can be orthogonally combined with any parameter-efficient tuning method, which holds the advantages of PTM's generalizability and adapted model's adaptivity.

Class Incremental Learning Incremental Learning +1

690

Paper
Code

Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation

1 code implementation • CVPR 2023 • Lingting Zhu, Xian Liu, Xuanyu Liu, Rui Qian, Ziwei Liu, Lequan Yu

In this work, we propose a novel diffusion-based framework, named Diffusion Co-Speech Gesture (DiffGesture), to effectively capture the cross-modal audio-to-gesture associations and preserve temporal coherence for high-fidelity audio-driven co-speech gesture generation.

Gesture Generation

210

Paper
Code

SHERF: Generalizable Human NeRF from a Single Image

1 code implementation • ICCV 2023 • Shoukang Hu, Fangzhou Hong, Liang Pan, Haiyi Mei, Lei Yang, Ziwei Liu

To this end, we propose a bank of 3D-aware hierarchical features, including global, point-level, and pixel-aligned features, to facilitate informative encoding.

3D Human Reconstruction

285

Paper
Code

ReVersion: Diffusion-Based Relation Inversion from Images

2 code implementations • 23 Mar 2023 • Ziqi Huang, Tianxing Wu, Yuming Jiang, Kelvin C. K. Chan, Ziwei Liu

Specifically, we propose a novel relation-steering contrastive learning scheme to impose two critical properties of the relation prompt: 1) The relation prompt should capture the interaction between objects, enforced by the preposition prior.

Contrastive Learning Relation

427

Paper
Code

A Simple and Generic Framework for Feature Distillation via Channel-wise Transformation

no code implementations • 23 Mar 2023 • Ziwei Liu, Yongtao Wang, Xiaojie Chu

Specifically, we propose a learnable nonlinear channel-wise transformation to align the features of the student and the teacher model.

Image Classification Instance Segmentation +5

Paper
Add Code

F$^{2}$-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories

1 code implementation • 28 Mar 2023 • Peng Wang, YuAn Liu, Zhaoxi Chen, Lingjie Liu, Ziwei Liu, Taku Komura, Christian Theobalt, Wenping Wang

Based on our analysis, we further propose a novel space-warping method called perspective warping, which allows us to handle arbitrary trajectories in the grid-based NeRF framework.

Novel View Synthesis

897

Paper
Code

SparseNeRF: Distilling Depth Ranking for Few-shot Novel View Synthesis

no code implementations • ICCV 2023 • Guangcong Wang, Zhaoxi Chen, Chen Change Loy, Ziwei Liu

Since coarse depth maps are not strictly scaled to the ground-truth depth maps, we propose a simple yet effective constraint, a local depth ranking method, on NeRFs such that the expected depth ranking of the NeRF is consistent with that of the coarse depth maps in local patches.

Novel View Synthesis

Paper
Add Code

SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling

1 code implementation • ICCV 2023 • Zhitao Yang, Zhongang Cai, Haiyi Mei, Shuai Liu, Zhaoxi Chen, Weiye Xiao, Yukun Wei, Zhongfei Qing, Chen Wei, Bo Dai, Wayne Wu, Chen Qian, Dahua Lin, Ziwei Liu, Lei Yang

Synthetic data has emerged as a promising source for 3D human research as it offers low-cost access to large-scale human datasets.

Human Mesh Recovery Neural Rendering

170

Paper
Code

Robo3D: Towards Robust and Reliable 3D Perception against Corruptions

1 code implementation • ICCV 2023 • Lingdong Kong, Youquan Liu, Xin Li, Runnan Chen, Wenwei Zhang, Jiawei Ren, Liang Pan, Kai Chen, Ziwei Liu

The robustness of 3D perception systems under natural corruptions from environments and sensors is pivotal for safety-critical applications.

Robust 3D Object Detection Robust 3D Semantic Segmentation

274

Paper
Code

ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model

1 code implementation • ICCV 2023 • Mingyuan Zhang, Xinying Guo, Liang Pan, Zhongang Cai, Fangzhou Hong, Huirong Li, Lei Yang, Ziwei Liu

However, the performance on more diverse motions remains unsatisfactory.

Ranked #1 on Motion Synthesis on KIT Motion-Language

Denoising Motion Synthesis +1

292

Paper
Code

Detecting and Grounding Multi-Modal Media Manipulation

1 code implementation • CVPR 2023 • Rui Shao, Tianxing Wu, Ziwei Liu

In this paper, we highlight a new research problem for multi-modal fake media, namely Detecting and Grounding Multi-Modal Media Manipulation (DGM$^{4}$).

Binary Classification Contrastive Learning +4

277

Paper
Code

DiffMimic: Efficient Motion Mimicking with Differentiable Physics

2 code implementations • 6 Apr 2023 • Jiawei Ren, Cunjun Yu, Siwei Chen, Xiao Ma, Liang Pan, Ziwei Liu

Motion mimicking is a foundational task in physics-based character animation.

reinforcement-learning Reinforcement Learning (RL)

258

Paper
Code

RoboBEV: Towards Robust Bird's Eye View Perception under Corruptions

1 code implementation • 13 Apr 2023 • Shaoyuan Xie, Lingdong Kong, Wenwei Zhang, Jiawei Ren, Liang Pan, Kai Chen, Ziwei Liu

Our experiments further demonstrate that pre-training and depth-free BEV transformation have the potential to enhance out-of-distribution robustness.

Robust Camera Only 3D Object Detection

284

Paper
Code

Text2Performer: Text-Driven Human Video Generation

1 code implementation • ICCV 2023 • Yuming Jiang, Shuai Yang, Tong Liang Koh, Wayne Wu, Chen Change Loy, Ziwei Liu

In this work, we present Text2Performer to generate vivid human videos with articulated motions from texts.

Video Generation

308

Paper
Code

Variational Relational Point Completion Network for Robust 3D Classification

no code implementations • 18 Apr 2023 • Liang Pan, Xinyi Chen, Zhongang Cai, Junzhe Zhang, Haiyu Zhao, Shuai Yi, Ziwei Liu

Existing point cloud completion methods tend to generate global shape skeletons and hence lack fine local details.

3D Classification Classification +1

Paper
Add Code

Transformer-Based Visual Segmentation: A Survey

2 code implementations • 19 Apr 2023 • Xiangtai Li, Henghui Ding, Haobo Yuan, Wenwei Zhang, Jiangmiao Pang, Guangliang Cheng, Kai Chen, Ziwei Liu, Chen Change Loy

Recently, transformers, a type of neural network based on self-attention originally designed for natural language processing, have considerably surpassed previous convolutional or recurrent approaches in various vision processing tasks.

Autonomous Driving Point Cloud Segmentation +1

574

Paper
Code

Collaborative Diffusion for Multi-Modal Face Generation and Editing

1 code implementation • CVPR 2023 • Ziqi Huang, Kelvin C. K. Chan, Yuming Jiang, Ziwei Liu

In this work, we present Collaborative Diffusion, where pre-trained uni-modal diffusion models collaborate to achieve multi-modal face generation and editing without re-training.

Denoising Face Generation

373

Paper
Code

Transmissive Reconfigurable Intelligent Surface Transmitter Empowered Cognitive RSMA Networks

no code implementations • 4 May 2023 • Ziwei Liu, Wen Chen, Zhendong Li, Jinhong Yuan, Qingqing Wu, Kunlun Wang

In this paper, we investigated the downlink transmission problem of a cognitive radio network (CRN) equipped with a novel transmissive reconfigurable intelligent surface (TRIS) transmitter.

Paper
Add Code

Otter: A Multi-Modal Model with In-Context Instruction Tuning

1 code implementation • 5 May 2023 • Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Jingkang Yang, Ziwei Liu

Large language models (LLMs) have demonstrated significant universal capabilities as few/zero-shot learners in various tasks due to their pre-training on vast amounts of text data, as exemplified by GPT-3, which was boosted to InstructGPT and ChatGPT, effectively following natural language instructions to accomplish real-world tasks.

Ranked #8 on Visual Question Answering on BenchLMM

In-Context Learning Instruction Following +2

3,446

Paper
Code

StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator

no code implementations • CVPR 2023 • Jiazhi Guan, Zhanwang Zhang, Hang Zhou, Tianshu Hu, Kaisiyuan Wang, Dongliang He, Haocheng Feng, Jingtuo Liu, Errui Ding, Ziwei Liu, Jingdong Wang

Despite recent advances in syncing lip movements with any audio waves, current methods still struggle to balance generation quality and the model's generalization ability.

Paper
Add Code

ConsistentNeRF: Enhancing Neural Radiance Fields with 3D Consistency for Sparse View Synthesis

1 code implementation • 18 May 2023 • Shoukang Hu, Kaichen Zhou, Kaiyu Li, Longhui Yu, Lanqing Hong, Tianyang Hu, Zhenguo Li, Gim Hee Lee, Ziwei Liu

In this paper, we propose ConsistentNeRF, a method that leverages depth information to regularize both multi-view and single-view 3D consistency among pixels.

3D Reconstruction SSIM

Paper
Code

RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars

1 code implementation • NeurIPS 2023 • Dongwei Pan, Long Zhuo, Jingtan Piao, Huiwen Luo, Wei Cheng, Yuxin Wang, Siming Fan, Shengqi Liu, Lei Yang, Bo Dai, Ziwei Liu, Chen Change Loy, Chen Qian, Wayne Wu, Dahua Lin, Kwan-Yee Lin

It is a large-scale digital library for head avatars with three key attributes: 1) High Fidelity: all subjects are captured by 60 synchronized, high-resolution 2K cameras in 360 degrees.

2k Image Matting +2

217

Paper
Code

SAD: Segment Any RGBD

1 code implementation • 23 May 2023 • Jun Cen, Yizheng Wu, Kewei Wang, Xingyi Li, Jingkang Yang, Yixuan Pei, Lingdong Kong, Ziwei Liu, Qifeng Chen

The Segment Anything Model (SAM) has demonstrated its effectiveness in segmenting any part of 2D RGB images.

Open Vocabulary Semantic Segmentation Panoptic Segmentation +1

723

Paper
Code

Learning without Forgetting for Vision-Language Models

no code implementations • 30 May 2023 • Da-Wei Zhou, Yuanhan Zhang, Jingyi Ning, Han-Jia Ye, De-Chuan Zhan, Ziwei Liu

While traditional CIL methods focus on visual information to grasp core features, recent advances in Vision-Language Models (VLM) have shown promising capabilities in learning generalizable representations with the aid of textual information.

Class Incremental Learning Incremental Learning

Paper
Add Code

DeepFake-Adapter: Dual-Level Adapter for DeepFake Detection

1 code implementation • 1 Jun 2023 • Rui Shao, Tianxing Wu, Liqiang Nie, Ziwei Liu

Unlike existing deepfake detection methods merely focusing on low-level forgery patterns, the forgery detection process of our model can be regularized by generalizable high-level semantics from a pre-trained ViT and adapted by global and local low-level forgeries of deepfake data.

DeepFake Detection Face Swapping

Paper
Code

GP-UNIT: Generative Prior for Versatile Unsupervised Image-to-Image Translation

1 code implementation • 7 Jun 2023 • Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy

In this paper, we introduce a novel versatile framework, Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT), that improves the quality, applicability and controllability of the existing translation models.

Translation Unsupervised Image-To-Image Translation +1

181

Paper
Code

MIMIC-IT: Multi-Modal In-Context Instruction Tuning

2 code implementations • 8 Jun 2023 • Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Fanyi Pu, Jingkang Yang, Chunyuan Li, Ziwei Liu

We release the MIMIC-IT dataset, instruction-response collection pipeline, benchmarks, and the Otter model.

Ranked #83 on Visual Question Answering on MM-Vet

In-Context Learning Visual Question Answering

3,446

Paper
Code

Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation

no code implementations • 13 Jun 2023 • Shuai Yang, Yifan Zhou, Ziwei Liu, Chen Change Loy

The framework includes two parts: key frame translation and full video translation.

Patch Matching Translation

Paper
Add Code

Segment Any Point Cloud Sequences by Distilling Vision Foundation Models

2 code implementations • NeurIPS 2023 • Youquan Liu, Lingdong Kong, Jun Cen, Runnan Chen, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu

Recent advancements in vision foundation models (VFMs) have opened up new possibilities for versatile and efficient visual perception.

Representation Learning Transfer Learning

497

Paper
Code

OpenOOD v1.5: Enhanced Benchmark for Out-of-Distribution Detection

1 code implementation • 15 Jun 2023 • Jingyang Zhang, Jingkang Yang, Pengyun Wang, Haoqi Wang, Yueqian Lin, Haoran Zhang, Yiyou Sun, Xuefeng Du, Kaiyang Zhou, Wayne Zhang, Yixuan Li, Ziwei Liu, Yiran Chen, Hai Li

Out-of-Distribution (OOD) detection is critical for the reliable operation of open-world intelligent systems.

Out-of-Distribution Detection Out of Distribution (OOD) Detection

749

Paper
Code

FunQA: Towards Surprising Video Comprehension

1 code implementation • 26 Jun 2023 • Binzhu Xie, Sicheng Zhang, Zitang Zhou, Bo Li, Yuanhan Zhang, Jack Hessel, Jingkang Yang, Ziwei Liu

Surprising videos, such as funny clips, creative performances, or visual illusions, attract significant attention.

Question Answering Text Generation +3

Paper
Code

MMBench: Is Your Multi-modal Model an All-around Player?

2 code implementations • 12 Jul 2023 • YuAn Liu, Haodong Duan, Yuanhan Zhang, Bo Li, Songyang Zhang, Wangbo Zhao, Yike Yuan, Jiaqi Wang, Conghui He, Ziwei Liu, Kai Chen, Dahua Lin

In response to these challenges, we propose MMBench, a novel multi-modality benchmark.

Ranked #1 on Visual Question Answering on MMBench

Visual Question Answering

2,503

Paper
Code

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation

1 code implementation • 13 Jul 2023 • Yi Wang, Yinan He, Yizhuo Li, Kunchang Li, Jiashuo Yu, Xin Ma, Xinhao Li, Guo Chen, Xinyuan Chen, Yaohui Wang, Conghui He, Ping Luo, Ziwei Liu, Yali Wang, LiMin Wang, Yu Qiao

Specifically, we utilize a multi-scale approach to generate video-related descriptions.

Action Recognition Contrastive Learning +7

921

Paper
Code

Pair then Relation: Pair-Net for Panoptic Scene Graph Generation

1 code implementation • 17 Jul 2023 • Jinghao Wang, Zhengyu Wen, Xiangtai Li, Zujin Guo, Jingkang Yang, Ziwei Liu

Panoptic Scene Graph (PSG) is a challenging task in Scene Graph Generation (SGG) that aims to create a more comprehensive scene graph representation using panoptic segmentation instead of boxes.

Graph Generation Panoptic Scene Graph Generation +2

Paper
Code

Make-A-Volume: Leveraging Latent Diffusion Models for Cross-Modality 3D Brain MRI Synthesis

no code implementations • 19 Jul 2023 • Lingting Zhu, Zeyue Xue, Zhenchao Jin, Xian Liu, Jingzhen He, Ziwei Liu, Lequan Yu

This paradigm extends the 2D image diffusion model to a volumetric version with a slight increase in the number of parameters and computation, offering a principled solution for generic cross-modality 3D medical image synthesis.

Computational Efficiency Image Generation

Paper
Add Code

DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering

1 code implementation • ICCV 2023 • Wei Cheng, Ruixiang Chen, Wanqi Yin, Siming Fan, Keyu Chen, Honglin He, Huiwen Luo, Zhongang Cai, Jingbo Wang, Yang Gao, Zhengming Yu, Zhengyu Lin, Daxuan Ren, Lei Yang, Ziwei Liu, Chen Change Loy, Chen Qian, Wayne Wu, Dahua Lin, Bo Dai, Kwan-Yee Lin

Realistic human-centric rendering plays a key role in both computer vision and computer graphics.

Camera Calibration Novel View Synthesis

199

Paper
Code

Benchmarking and Analyzing Generative Data for Visual Recognition

no code implementations • 25 Jul 2023 • Bo Li, Haotian Liu, Liangyu Chen, Yong Jae Lee, Chunyuan Li, Ziwei Liu

Advancements in large pre-trained generative models have expanded their potential as effective data generators in visual recognition.

Benchmarking Retrieval

Paper
Add Code

Temporally-Adaptive Models for Efficient Video Understanding

1 code implementation • 10 Aug 2023 • Ziyuan Huang, Shiwei Zhang, Liang Pan, Zhiwu Qing, Yingya Zhang, Ziwei Liu, Marcelo H. Ang Jr

Spatial convolutions are extensively used in numerous deep video models.

Ranked #3 on Action Recognition on EPIC-KITCHENS-100 (using extra training data)

Action Classification Action Recognition +1

215

Paper
Code

Hierarchy Flow For High-Fidelity Image-to-Image Translation

1 code implementation • 14 Aug 2023 • Weichen Fan, Jinghuan Chen, Ziwei Liu

In this work, we propose Hierarchy Flow, a novel flow-based model to achieve better content preservation during translation.

Image-to-Image Translation Style Transfer +1

Paper
Code

Link-Context Learning for Multimodal LLMs

1 code implementation • 15 Aug 2023 • Yan Tai, Weichen Fan, Zhao Zhang, Feng Zhu, Rui Zhao, Ziwei Liu

The ability to learn from context with novel concepts and deliver appropriate responses is essential in human conversations.

Few-Shot Learning In-Context Learning +1

Paper
Code

HumanLiff: Layer-wise 3D Human Generation with Diffusion Model

no code implementations • 18 Aug 2023 • Shoukang Hu, Fangzhou Hong, Tao Hu, Liang Pan, Haiyi Mei, Weiye Xiao, Lei Yang, Ziwei Liu

In this work, we propose HumanLiff, the first layer-wise 3D human generative model with a unified diffusion process.

3D Generation Neural Rendering

Paper
Add Code

Towards Real-World Visual Tracking with Temporal Contexts

1 code implementation • 20 Aug 2023 • Ziang Cao, Ziyuan Huang, Liang Pan, Shiwei Zhang, Ziwei Liu, Changhong Fu

To handle those problems, we propose a two-level framework (TCTrack) that can exploit temporal contexts efficiently.

Visual Tracking

154

Paper
Code

PointHPS: Cascaded 3D Human Pose and Shape Estimation from Point Clouds

no code implementations • 28 Aug 2023 • Zhongang Cai, Liang Pan, Chen Wei, Wanqi Yin, Fangzhou Hong, Mingyuan Zhang, Chen Change Loy, Lei Yang, Ziwei Liu

To tackle these challenges, we propose a principled framework, PointHPS, for accurate 3D HPS from point clouds captured in real-world settings, which iteratively refines point features through a cascaded architecture.

3D human pose and shape estimation

Paper
Add Code

CityDreamer: Compositional Generative Model of Unbounded 3D Cities

1 code implementation • 1 Sep 2023 • Haozhe Xie, Zhaoxi Chen, Fangzhou Hong, Ziwei Liu

3D city generation is a desirable yet challenging task, since humans are more sensitive to structural distortions in urban environments.

Ranked #1 on Scene Generation on OSM

Scene Generation

483

Paper
Code
