2 code implementations • ICCV 2015 • Ziwei Liu, Ping Luo, Xiaogang Wang, Xiaoou Tang
LNet is pre-trained by massive general object categories for face localization, while ANet is pre-trained by massive face identities for attribute prediction.
Ranked #6 on Facial Attribute Classification on LFWA
no code implementations • ICCV 2015 • Ziwei Liu, Xiaoxiao Li, Ping Luo, Chen Change Loy, Xiaoou Tang
This paper addresses semantic image segmentation by incorporating rich information into Markov Random Field (MRF), including high-order relations and mixture of label contexts.
Ranked #89 on Semantic Segmentation on Cityscapes test
no code implementations • CVPR 2016 • Ziwei Liu, Ping Luo, Shi Qiu, Xiaogang Wang, Xiaoou Tang
To demonstrate the advantages of DeepFashion, we propose a new deep model, namely FashionNet, which learns clothing features by jointly predicting clothing attributes and landmarks.
no code implementations • 23 Jun 2016 • Ziwei Liu, Xiaoxiao Li, Ping Luo, Chen Change Loy, Xiaoou Tang
Semantic segmentation tasks can be well modeled by Markov Random Field (MRF).
4 code implementations • 10 Aug 2016 • Ziwei Liu, Sijie Yan, Ping Luo, Xiaogang Wang, Xiaoou Tang
Fashion landmark is also compared to clothing bounding boxes and human joints in two applications, fashion attribute prediction and clothes retrieval, showing that fashion landmark is a more discriminative representation to understand fashion images.
no code implementations • 30 Nov 2016 • Raymond Yeh, Ziwei Liu, Dan B. Goldman, Aseem Agarwala
High-level manipulation of facial expressions in images --- such as changing a smile to a neutral expression --- is challenging because facial expression changes are highly non-linear, and vary depending on the appearance of the face.
3 code implementations • ICCV 2017 • Ziwei Liu, Raymond A. Yeh, Xiaoou Tang, Yiming Liu, Aseem Agarwala
We combine the advantages of these two methods by training a deep network that learns to synthesize video frames by flowing pixel values from existing ones, which we call deep voxel flow.
Ranked #2 on Video Prediction on DAVIS 2017
1 code implementation • CVPR 2017 • Xiaoxiao Li, Ziwei Liu, Ping Luo, Chen Change Loy, Xiaoou Tang
Third, in comparison to MC, LC is an end-to-end trainable framework, allowing joint learning of all sub-models.
Ranked #22 on Semantic Segmentation on PASCAL VOC 2012 test
3 code implementations • 1 Aug 2017 • Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi, Ping Luo, Xiaoou Tang, Chen Change Loy
Specifically, our Video Object Segmentation with Re-identification (VS-ReID) model includes a mask propagation module and a ReID module.
2 code implementations • 7 Aug 2017 • Sijie Yan, Ziwei Liu, Ping Luo, Shi Qiu, Xiaogang Wang, Xiaoou Tang
This work addresses unconstrained fashion landmark detection, where clothing bounding boxes are not provided in both training and test.
no code implementations • 2 Dec 2017 • Xiaohang Zhan, Ziwei Liu, Ping Luo, Xiaoou Tang, Chen Change Loy
The key of this new form of learning is to design a proxy task (e. g. image colorization), from which a discriminative loss can be formulated on unlabeled data.
18 code implementations • 24 Jan 2018 • Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, Justin M. Solomon
Point clouds provide a flexible geometric representation suitable for countless applications in computer graphics; they also comprise the raw output of most 3D data acquisition devices.
Ranked #6 on Point Cloud Segmentation on PointCloud-C
1 code implementation • ECCV 2018 • Tsung-Wei Ke, Jyh-Jing Hwang, Ziwei Liu, Stella X. Yu
Semantic segmentation has made much progress with increasingly powerful pixel-wise classifiers and incorporating structural priors via Conditional Random Fields (CRF) or Generative Adversarial Networks (GAN).
Ranked #55 on Semantic Segmentation on Cityscapes test
1 code implementation • 17 Apr 2018 • Yongbin Sun, Ziwei Liu, Yue Wang, Sanjay E. Sarma
In this work, we study a new problem, that is, simultaneously recovering 3D shape and surface color from a single image, namely "colorful 3D reconstruction".
1 code implementation • 5 Jun 2018 • Xin Liu, Huanrui Yang, Ziwei Liu, Linghao Song, Hai Li, Yiran Chen
Successful realization of DPatch also illustrates the intrinsic vulnerability of the modern detector architectures to such patch-based adversarial attacks.
1 code implementation • 20 Jul 2018 • Hang Zhou, Yu Liu, Ziwei Liu, Ping Luo, Xiaogang Wang
Talking face generation aims to synthesize a sequence of face images that correspond to a clip of speech.
6 code implementations • ECCV 2018 • Xiaohang Zhan, Ziwei Liu, Junjie Yan, Dahua Lin, Chen Change Loy
Face recognition has witnessed great progress in recent years, mainly attributed to the high-capacity model designed and the abundant labeled data collected.
1 code implementation • 12 Oct 2018 • Yongbin Sun, Yue Wang, Ziwei Liu, Joshua E. Siegel, Sanjay E. Sarma
Generating 3D point clouds is challenging yet highly desired.
no code implementations • 30 Nov 2018 • Weidong Yin, Ziwei Liu, Chen Change Loy
Geometry-aware flow is able to warp the source face attribute into the target face context and generate a warp-and-blend result.
5 code implementations • CVPR 2019 • Kai Chen, Jiangmiao Pang, Jiaqi Wang, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin
In exploring a more effective approach, we find that the key to a successful instance segmentation cascade is to fully leverage the reciprocal relationship between detection and segmentation.
Ranked #32 on Object Detection on COCO-O
1 code implementation • CVPR 2019 • Xiaohang Zhan, Xingang Pan, Ziwei Liu, Dahua Lin, Chen Change Loy
Instead of explicitly modeling the motion probabilities, we design the pretext task as a conditional motion propagation problem.
2 code implementations • CVPR 2019 • Ziwei Liu, Zhongqi Miao, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu
We define Open Long-Tailed Recognition (OLTR) as learning from such naturally distributed data and optimizing the classification accuracy over a balanced test set which include head, tail, and open classes.
3 code implementations • ICCV 2019 • Jiaqi Wang, Kai Chen, Rui Xu, Ziwei Liu, Chen Change Loy, Dahua Lin
CARAFE introduces little computational overhead and can be readily integrated into modern network architectures.
Ranked #3 on Feature Upsampling on ImageNet
144 code implementations • 17 Jun 2019 • Kai Chen, Jiaqi Wang, Jiangmiao Pang, Yuhang Cao, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jiarui Xu, Zheng Zhang, Dazhi Cheng, Chenchen Zhu, Tianheng Cheng, Qijie Zhao, Buyu Li, Xin Lu, Rui Zhu, Yue Wu, Jifeng Dai, Jingdong Wang, Jianping Shi, Wanli Ouyang, Chen Change Loy, Dahua Lin
In this paper, we introduce the various features of this toolbox.
7 code implementations • CVPR 2020 • Cheng-Han Lee, Ziwei Liu, Lingyun Wu, Ping Luo
To overcome these drawbacks, we propose a novel framework termed MaskGAN, enabling diverse and interactive face manipulation.
2 code implementations • 5 Aug 2019 • Yunxuan Zhang, Siwei Zhang, Yue He, Cheng Li, Chen Change Loy, Ziwei Liu
However, in real-world scenario end-users often only have one target face at hand, rendering existing methods inapplicable.
1 code implementation • ICCV 2019 • Yu Rong, Ziwei Liu, Cheng Li, Kaidi Cao, Chen Change Loy
Specifically, we focus on the challenging task of in-the-wild 3D human recovery from single images when paired 3D annotations are not fully available.
no code implementations • CVPR 2020 • Ziwei Liu, Zhongqi Miao, Xingang Pan, Xiaohang Zhan, Dahua Lin, Stella X. Yu, Boqing Gong
A typical domain adaptation approach is to adapt models trained on the annotated data in a source domain (e. g., sunny weather) for achieving high performance on the test data in a target domain (e. g., rainy weather).
no code implementations • ICCV 2019 • Hang Zhou, Ziwei Liu, Xudong Xu, Ping Luo, Xiaogang Wang
Extensive experiments demonstrate that our framework is capable of inpainting realistic and varying audio segments with or without visual contexts.
no code implementations • 18 Nov 2019 • Wu Shi, Tak-Wai Hui, Ziwei Liu, Dahua Lin, Chen Change Loy
Another important observation is that fashion textures are multi-modal.
1 code implementation • CVPR 2020 • Minghao Guo, Yuzhe Yang, Rui Xu, Ziwei Liu, Dahua Lin
Recent advances in adversarial attacks uncover the intrinsic vulnerability of modern deep neural networks.
no code implementations • 11 Mar 2020 • Xin Liu, Yongbin Sun, Ziwei Liu, Dahua Lin
To facilitate a comprehensive study on diverse fashion collocation, we reorganize Amazon Fashion dataset with carefully designed evaluation protocols.
1 code implementation • CVPR 2020 • Hang Zhou, Jihao Liu, Ziwei Liu, Yu Liu, Xiaogang Wang
Though face rotation has achieved rapid progress in recent years, the lack of high-quality paired training data remains a great hurdle for existing methods.
2 code implementations • CVPR 2020 • Xiaohang Zhan, Xingang Pan, Bo Dai, Ziwei Liu, Dahua Lin, Chen Change Loy
This is achieved via Partial Completion Network (PCNet)-mask (M) and -content (C), that learn to recover fractions of object masks and contents, respectively, in a self-supervised manner.
3 code implementations • 18 May 2020 • Xin Liu, Jiancheng Li, Jiaqi Wang, Ziwei Liu
This toolbox supports a wide spectrum of fashion analysis tasks, including Fashion Attribute Prediction, Fashion Recognition and Retrieval, Fashion Landmark Detection, Fashion Parsing and Segmentation and Fashion Compatibility and Recommendation.
2 code implementations • ECCV 2020 • Guodong Xu, Ziwei Liu, Xiaoxiao Li, Chen Change Loy
Knowledge distillation, which involves extracting the "dark knowledge" from a teacher network to guide the learning of a student network, has emerged as an important technique for model compression and transfer learning.
Ranked #32 on Knowledge Distillation on ImageNet
1 code implementation • CVPR 2020 • Xiaohang Zhan, Jiahao Xie, Ziwei Liu, Yew Soon Ong, Chen Change Loy
In this way, labels and the network evolve shoulder-to-shoulder rather than alternatingly.
1 code implementation • 29 Jun 2020 • Yinghao Xu, Ceyuan Yang, Ziwei Liu, Bo Dai, Bolei Zhou
Recent attempts for unsupervised landmark learning leverage synthesized image pairs that are similar in appearance but different in poses.
no code implementations • ECCV 2020 • Huaiyi Huang, Yuqi Zhang, Qingqiu Huang, Zhengkui Guo, Ziwei Liu, Dahua Lin
Place is an important element in visual understanding.
1 code implementation • ECCV 2020 • Qiang Nie, Ziwei Liu, Yun-hui Liu
Learning a good 3D human pose representation is important for human pose related tasks, e. g. human 3D pose estimation and action recognition.
1 code implementation • ECCV 2020 • Tong Wu, Qingqiu Huang, Ziwei Liu, Yu Wang, Dahua Lin
We present a new loss function called Distribution-Balanced Loss for the multi-label recognition problems that exhibit long-tailed class distributions.
Ranked #7 on Long-tail Learning on VOC-MLT
no code implementations • ECCV 2020 • Hang Zhou, Xudong Xu, Dahua Lin, Xiaogang Wang, Ziwei Liu
Stereophonic audio is an indispensable ingredient to enhance human auditory experience.
1 code implementation • ECCV 2020 • Yuanhan Zhang, Zhenfei Yin, Yidong Li, Guojun Yin, Junjie Yan, Jing Shao, Ziwei Liu
The main reason is that current face anti-spoofing datasets are limited in both quantity and diversity.
2 code implementations • CVPR 2021 • Xudong Wang, Ziwei Liu, Stella X. Yu
Unsupervised feature learning has made great strides with contrastive learning based on instance discrimination and invariant mapping, as benchmarked on curated class-balanced datasets.
Contrastive Learning Semi-Supervised Image Classification +2
2 code implementations • CVPR 2021 • Jiaqi Wang, Wenwei Zhang, Yuhang Zang, Yuhang Cao, Jiangmiao Pang, Tao Gong, Kai Chen, Ziwei Liu, Chen Change Loy, Dahua Lin
Instances of head classes dominate a long-tailed dataset and they serve as negative samples of tail categories.
2 code implementations • 26 Aug 2020 • Jiahao Xie, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy
In this work, we present a comprehensive empirical study to better understand the role of inter-image invariance learning from three main constituting components: pseudo-label maintenance, sampling strategy, and decision boundary design.
no code implementations • 28 Aug 2020 • Weidong Yin, Ziwei Liu, Leonid Sigal
To handle the stark difference in input structures, we proposed two separate neural branches to attentively composite the respective (context/person) inputs into shared ``compositional structural space'', which encodes shape, location and appearance information for both context and person structures in a disentangled manner.
2 code implementations • ICLR 2021 • Xudong Wang, Long Lian, Zhongqi Miao, Ziwei Liu, Stella X. Yu
We take a dynamic view of the training data and provide a principled model bias and variance analysis as the training data fluctuates: Existing long-tail classifiers invariably increase the model variance and the head-tail model bias gap remains large, due to more and larger confusion with hard negatives for the tail.
Ranked #22 on Long-tail Learning on iNaturalist 2018
1 code implementation • ICLR 2021 • Xingang Pan, Bo Dai, Ziwei Liu, Chen Change Loy, Ping Luo
Through our investigation, we found that such a pre-trained GAN indeed contains rich 3D knowledge and thus can be used to recover 3D shape from a single 2D image in an unsupervised manner.
1 code implementation • CVPR 2021 • Fangzhou Hong, Hui Zhou, Xinge Zhu, Hongsheng Li, Ziwei Liu
2) Dynamic Shifting for complex point distributions.
Ranked #2 on Panoptic Segmentation on SemanticKITTI
no code implementations • 7 Dec 2020 • Jiaqi Wang, Kai Chen, Rui Xu, Ziwei Liu, Chen Change Loy, Dahua Lin
Feature reassembly, i. e. feature downsampling and upsampling, is a key operation in a number of modern convolutional network architectures, e. g., residual networks and feature pyramids.
1 code implementation • 17 Dec 2020 • Guodong Xu, Ziwei Liu, Chen Change Loy
Our goal is to achieve a performance comparable to conventional knowledge distillation with a lower computation cost during training.
1 code implementation • 18 Dec 2020 • Gaurav Kuppa, Andrew Jong, Vera Liu, Ziwei Liu, Teng-Sheng Moh
We build a series of scientific experiments to isolate effective design choices in video synthesis for virtual clothing try-on.
no code implementations • 29 Dec 2020 • Yu Rong, Ziwei Liu, Chen Change Loy
The reason is that most of the current models perform regression based on a single human prototype, which is similar to common poses while far from the rare poses.
no code implementations • ICCV 2021 • Linning Xu, Yuanbo Xiangli, Anyi Rao, Nanxuan Zhao, Bo Dai, Ziwei Liu, Dahua Lin
City modeling is the foundation for computational urban planning, navigation, and entertainment.
no code implementations • ICCV 2021 • Kun Yuan, Quanquan Li, Shaopeng Guo, Dapeng Chen, Aojun Zhou, Fengwei Yu, Ziwei Liu
A standard practice of deploying deep neural networks is to apply the same architecture to all the input instances.
2 code implementations • 18 Feb 2021 • Liming Jiang, Zhengkui Guo, Wayne Wu, Zhaoyang Liu, Ziwei Liu, Chen Change Loy, Shuo Yang, Yuanjun Xiong, Wei Xia, Baoying Chen, Peiyu Zhuang, Sili Li, Shen Chen, Taiping Yao, Shouhong Ding, Jilin Li, Feiyue Huang, Liujuan Cao, Rongrong Ji, Changlei Lu, Ganchao Tan
This paper reports methods and results in the DeeperForensics Challenge 2020 on real-world face forgery detection.
1 code implementation • 25 Feb 2021 • Yuanhan Zhang, Zhenfei Yin, Jing Shao, Ziwei Liu, Shuo Yang, Yuanjun Xiong, Wei Xia, Yan Xu, Man Luo, Jian Liu, Jianshu Li, Zhijun Chen, Mingyu Guo, Hui Li, Junfu Liu, Pengfei Gao, Tianqi Hong, Hao Han, Shijie Liu, Xinhua Chen, Di Qiu, Cheng Zhen, Dashuang Liang, Yufeng Jin, Zhanlong Hao
It is the largest face anti-spoofing dataset in terms of the numbers of the data and the subjects.
2 code implementations • 3 Mar 2021 • Kaiyang Zhou, Ziwei Liu, Yu Qiao, Tao Xiang, Chen Change Loy
Generalization to out-of-distribution (OOD) data is a capability natural to humans yet challenging for machines to reproduce.
2 code implementations • CVPR 2021 • Yinan He, Bei Gan, Siyu Chen, Yichun Zhou, Guojun Yin, Luchuan Song, Lu Sheng, Jing Shao, Ziwei Liu
To counter this emerging threat, we construct the ForgeryNet dataset, an extremely large face forgery dataset with unified annotations in image- and video-level data across four tasks: 1) Image Forgery Classification, including two-way (real / fake), three-way (real / fake with identity-replaced forgery approaches / fake with identity-remained forgery approaches), and n-way (real and 15 respective forgery approaches) classification.
3 code implementations • ICCV 2021 • Kun Yuan, Shaopeng Guo, Ziwei Liu, Aojun Zhou, Fengwei Yu, Wei Wu
Motivated by the success of Transformers in natural language processing (NLP) tasks, there emerge some attempts (e. g., ViT and DeiT) to apply Transformers to the vision domain.
Ranked #2 on Image Classification on Oxford-IIIT Pets
1 code implementation • CVPR 2021 • Tong Wu, Ziwei Liu, Qingqiu Huang, Yu Wang, Dahua Lin
We then perform a systematic study on existing long-tailed recognition methods in conjunction with the adversarial training framework.
1 code implementation • CVPR 2021 • Li SiYao, Shiyu Zhao, Weijiang Yu, Wenxiu Sun, Dimitris N. Metaxas, Chen Change Loy, Ziwei Liu
In the animation industry, cartoon videos are usually produced at low frame rate since hand drawing of such frames is costly and time-consuming.
no code implementations • CVPR 2021 • Xudong Xu, Hang Zhou, Ziwei Liu, Bo Dai, Xiaogang Wang, Dahua Lin
Moreover, combined with binaural recordings, our method is able to further boost the performance of binaural audio generation under supervised settings.
1 code implementation • CVPR 2021 • Liang Pan, Xinyi Chen, Zhongang Cai, Junzhe Zhang, Haiyu Zhao, Shuai Yi, Ziwei Liu
In particular, we propose a dual-path architecture to enable principled probabilistic modeling across partial and complete clouds.
Ranked #2 on Point Cloud Completion on Completion3D
1 code implementation • CVPR 2021 • Hang Zhou, Yasheng Sun, Wayne Wu, Chen Change Loy, Xiaogang Wang, Ziwei Liu
While speech content information can be defined by learning the intrinsic synchronization between audio-visual modalities, we identify that a pose code will be complementarily learned in a modulated convolution-based reconstruction framework.
1 code implementation • 5 May 2021 • Zhongqi Miao, Ziwei Liu, Kaitlyn M. Gaynor, Meredith S. Palmer, Stella X. Yu, Wayne M. Getz
Camera trapping is increasingly used to monitor wildlife, but this technology typically requires extensive data annotation.
2 code implementations • 1 Jun 2021 • Kaiyang Zhou, Chen Change Loy, Ziwei Liu
We find that the DG methods, which by design are unable to handle unlabeled data, perform poorly with limited labels in SSDG; the SSL methods, especially FixMatch, obtain much better results but are still far away from the basic vanilla model trained using full labels.
1 code implementation • CVPR 2021 • Yuming Jiang, Kelvin C. K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu
However, performing local transfer is difficult because of two gaps between input and reference images: the transformation gap (e. g. scale and rotation) and the resolution gap (e. g. HR and LR).
1 code implementation • CVPR 2022 • Chongzhi Zhang, Mingyuan Zhang, Shanghang Zhang, Daisheng Jin, Qiang Zhou, Zhongang Cai, Haiyu Zhao, Xianglong Liu, Ziwei Liu
By comprehensively investigating these GE-ViTs and comparing with their corresponding CNN models, we observe: 1) For the enhanced model, larger ViTs still benefit more for the OOD generalization.
1 code implementation • NeurIPS 2021 • Jiahao Xie, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy
Extensive experiments on COCO show that ORL significantly improves the performance of self-supervised learning on scene images, even surpassing supervised ImageNet pre-training on several downstream tasks.
1 code implementation • ICCV 2021 • Zhipeng Luo, Zhongang Cai, Changqing Zhou, Gongjie Zhang, Haiyu Zhao, Shuai Yi, Shijian Lu, Hongsheng Li, Shanghang Zhang, Ziwei Liu
In addition, existing 3D domain adaptive detection methods often assume prior access to the target domain annotations, which is rarely feasible in the real world.
no code implementations • ICCV 2021 • Yezhen Wang, Bo Li, Tong Che, Kaiyang Zhou, Ziwei Liu, Dongsheng Li
Confidence calibration is of great importance to the reliability of decisions made by machine learning systems.
2 code implementations • ICCV 2021 • Jingkang Yang, Haoqi Wang, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang, Ziwei Liu
The proposed UDG can not only enrich the semantic knowledge of the model by exploiting unlabeled data in an unsupervised manner, but also distinguish ID/OOD samples to enhance ID classification and OOD detection tasks simultaneously.
Out-of-Distribution Detection Out of Distribution (OOD) Detection
13 code implementations • 2 Sep 2021 • Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
Large pre-trained vision-language models like CLIP have shown great potential in learning representations that are transferable across a wide range of downstream tasks.
Ranked #2 on Few-shot Age Estimation on MORPH Album2
1 code implementation • ICCV 2021 • Yuming Jiang, Ziqi Huang, Xingang Pan, Chen Change Loy, Ziwei Liu
In this work, we propose Talk-to-Edit, an interactive facial editing framework that performs fine-grained attribute manipulation through dialog between the user and the system.
Ranked #1 on Fine-Grained Facial Editing on CelebA-Dialog
no code implementations • 29 Sep 2021 • Jiawei Ren, Mingyuan Zhang, Cunjun Yu, Ziwei Liu
Compared to imbalanced and long-tailed classification, imbalanced regression has its unique challenges as the regression label space can be continuous, boundless, and high-dimensional.
no code implementations • 29 Sep 2021 • Guodong Xu, Yuenan Hou, Ziwei Liu, Chen Change Loy
To further enhance the semantic consistency between the teacher and student model, we present another latent-direction-based distillation loss that preserves the semantic relations in latent space.
2 code implementations • ICLR 2022 • Ziyuan Huang, Shiwei Zhang, Liang Pan, Zhiwu Qing, Mingqian Tang, Ziwei Liu, Marcelo H. Ang Jr
This work presents Temporally-Adaptive Convolutions (TAdaConv) for video understanding, which shows that adaptive weight calibration along the temporal dimension is an efficient way to facilitate modelling complex temporal dynamics in videos.
Ranked #67 on Action Recognition on Something-Something V2 (using extra training data)
no code implementations • 14 Oct 2021 • Zhongang Cai, Mingyuan Zhang, Jiawei Ren, Chen Wei, Daxuan Ren, Zhengyu Lin, Haiyu Zhao, Lei Yang, Chen Change Loy, Ziwei Liu
Specifically, we contribute GTA-Human, a large-scale 3D human dataset generated with the GTA-V game engine, featuring a highly diverse set of subjects, actions, and scenarios.
3 code implementations • 21 Oct 2021 • Jingkang Yang, Kaiyang Zhou, Yixuan Li, Ziwei Liu
In this survey, we first present a unified framework called generalized OOD detection, which encompasses the five aforementioned problems, i. e., AD, ND, OSR, OOD detection, and OD.
no code implementations • 1 Nov 2021 • Yu Rong, Jingbo Wang, Ziwei Liu, Chen Change Loy
In this paper, we make the first attempt to reconstruct 3D interacting hands from monocular single RGB images.
1 code implementation • NeurIPS 2021 • Yuhang Cao, Jiaqi Wang, Ying Jin, Tong Wu, Kai Chen, Ziwei Liu, Dahua Lin
1) In the association step, in contrast to implicitly leveraging multiple base classes, we construct a compact novel class feature space via explicitly imitating a specific base class feature space.
no code implementations • 23 Nov 2021 • Qiang Nie, Ziwei Liu, Yunhui Liu
Inspired by this, we propose a new framework that leverages the labeled 3D human poses to learn a 3D concept of the human body to reduce the ambiguity.
1 code implementation • 24 Nov 2021 • Tong Wu, Liang Pan, Junzhe Zhang, Tai Wang, Ziwei Liu, Dahua Lin
We adopt DCD to evaluate the point cloud completion task, where experimental results show that DCD pays attention to both the overall structure and local geometric details and provides a more reliable evaluation even when CD and EMD contradict each other.
1 code implementation • 30 Nov 2021 • Liang Pan, Zhongang Cai, Ziwei Liu
\textbf{3)} Based on a synergy of hierarchical graph networks and graphical modeling, we propose the {H}ierarchical {G}raphical {M}odeling (\textbf{HGM}) architecture to encode robust descriptors consisting of i) a unary term learned from {\textit{RI}} features; and ii) multiple smoothness terms encoded from neighboring point relations at different scales through our TPT modules.
1 code implementation • NeurIPS 2021 • Tong Wu, Liang Pan, Junzhe Zhang, Tai Wang, Ziwei Liu, Dahua Lin
We adopt DCD to evaluate the point cloud completion task, where experimental results show that DCD pays attention to both the overall structure and local geometric details and provides a more reliable evaluation even when CD and EMD contradict each other.
1 code implementation • NeurIPS 2021 • Fangzhou Hong, Liang Pan, Zhongang Cai, Ziwei Liu
The main challenges are two-fold: 1) effective 3D feature learning for fine details, and 2) capture of garment dynamics caused by the interaction between garments and the human body, especially for loose garments like skirts.
no code implementations • 15 Dec 2021 • Yinan He, Lu Sheng, Jing Shao, Ziwei Liu, Zhaofan Zou, Zhizhi Guo, Shan Jiang, Curitis Sun, Guosheng Zhang, Keyao Wang, Haixiao Yue, Zhibin Hong, Wanguo Wang, Zhenyu Li, Qi Wang, Zhenli Wang, Ronghao Xu, Mingwen Zhang, Zhiheng Wang, Zhenhang Huang, Tianming Zhang, Ningning Zhao
The rapid progress of photorealistic synthesis techniques has reached a critical point where the boundary between real and manipulated images starts to blur.
2 code implementations • 22 Dec 2021 • Liang Pan, Tong Wu, Zhongang Cai, Ziwei Liu, Xumin Yu, Yongming Rao, Jiwen Lu, Jie zhou, Mingye Xu, Xiaoyuan Luo, Kexue Fu, Peng Gao, Manning Wang, Yali Wang, Yu Qiao, Junsheng Zhou, Xin Wen, Peng Xiang, Yu-Shen Liu, Zhizhong Han, Yuanjie Yan, Junyi An, Lifa Zhu, Changwei Lin, Dongrui Liu, Xin Li, Francisco Gómez-Fernández, Qinlong Wang, Yang Yang
Based on the MVP dataset, this paper reports methods and results in the Multi-View Partial Point Cloud Challenge 2021 on Completion and Registration.
no code implementations • CVPR 2022 • Han Yang, Xinrui Yu, Ziwei Liu
Virtual try-on aims to transfer a target clothing image onto a reference person.
Ranked #4 on Virtual Try-on on VITON
4 code implementations • 7 Feb 2022 • Jiawei Ren, Liang Pan, Ziwei Liu
3D perception, especially point cloud classification, has achieved substantial progress.
Ranked #7 on Point Cloud Classification on PointCloud-C
1 code implementation • 13 Feb 2022 • Xian Liu, Rui Qian, Hang Zhou, Di Hu, Weiyao Lin, Ziwei Liu, Bolei Zhou, Xiaowei Zhou
Specifically, we observe that the previous practice of learning only a single audio representation is insufficient due to the additive nature of audio signals.
1 code implementation • CVPR 2022 • Ziang Cao, Ziyuan Huang, Liang Pan, Shiwei Zhang, Ziwei Liu, Changhong Fu
Temporal contexts among consecutive frames are far from being fully utilized in existing visual trackers.
9 code implementations • CVPR 2022 • Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
With the rise of powerful pre-trained vision-language models like CLIP, it becomes essential to investigate ways to adapt these models to downstream datasets.
Ranked #3 on Prompt Engineering on ImageNet V2
1 code implementation • ICLR 2022 • Haotong Qin, Yifu Ding, Mingyuan Zhang, Qinghua Yan, Aishan Liu, Qingqing Dang, Ziwei Liu, Xianglong Liu
The large pre-trained BERT has achieved remarkable performance on Natural Language Processing (NLP) tasks but is also computation and memory expensive.
1 code implementation • 14 Mar 2022 • Fangzhou Hong, Hui Zhou, Xinge Zhu, Hongsheng Li, Ziwei Liu
In this work, we address the task of LiDAR-based panoptic segmentation, which aims to parse both objects and scenes in a unified manner.
2 code implementations • 15 Mar 2022 • Yuanhan Zhang, Qinghong Sun, Yichun Zhou, Zexin He, Zhenfei Yin, Kun Wang, Lu Sheng, Yu Qiao, Jing Shao, Ziwei Liu
This work thus proposes a novel active learning framework for realistic dataset annotation.
Ranked #1 on Image Classification on Food-101 (using extra training data)
no code implementations • 16 Mar 2022 • Yinan He, Gengshi Huang, Siyu Chen, Jianing Teng, Wang Kun, Zhenfei Yin, Lu Sheng, Ziwei Liu, Yu Qiao, Jing Shao
2) Squeeze Stage: X-Learner condenses the model to a reasonable size and learns the universal and generalizable representation for various tasks transferring.
1 code implementation • CVPR 2022 • Li SiYao, Weijiang Yu, Tianpei Gu, Chunze Lin, Quan Wang, Chen Qian, Chen Change Loy, Ziwei Liu
With the learned choreographic memory, dance generation is realized on the quantized units that meet high choreography standards, such that the generated dancing sequences are confined within the spatial constraints.
Ranked #1 on Motion Synthesis on AIST++
1 code implementation • CVPR 2022 • Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy
Recent studies on StyleGAN show high performance on artistic portrait generation by transfer learning with limited data.
no code implementations • 25 Mar 2022 • Xinchi Zhou, Dongzhan Zhou, Wanli Ouyang, Hang Zhou, Ziwei Liu, Di Hu
Recent years have witnessed the success of deep learning on the visual sound separation task.
1 code implementation • CVPR 2022 • Fangzhou Hong, Liang Pan, Zhongang Cai, Ziwei Liu
To tackle the challenges, we design the novel Dense Intra-sample Contrastive Learning and Sparse Structure-aware Contrastive Learning targets by hierarchically learning a modal-invariant latent space featured with continuous and ordinal feature distribution and structure-aware semantic consistency.
1 code implementation • CVPR 2022 • Jiawei Ren, Mingyuan Zhang, Cunjun Yu, Ziwei Liu
Data imbalance exists ubiquitously in real-world visual regressions, e. g., age estimation and pose estimation, hurting the model's generalizability and fairness.
1 code implementation • CVPR 2022 • Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy
In this work, we present a novel framework, Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT), to improve the overall quality and applicability of the translation algorithm.
1 code implementation • 11 Apr 2022 • Jingkang Yang, Kaiyang Zhou, Ziwei Liu
In this paper, we take into account both shift types and introduce full-spectrum OOD (FS-OOD) detection, a more realistic problem setting that considers both detecting semantic shift and being tolerant to covariate shift; and designs three benchmarks.
Out-of-Distribution Detection Out of Distribution (OOD) Detection
no code implementations • 12 Apr 2022 • Haonan Qiu, Siyu Chen, Bei Gan, Kun Wang, Huafeng Shi, Jing Shao, Ziwei Liu
Notably, our method is also validated to be robust to choices of majority and minority forgery approaches.
4 code implementations • 25 Apr 2022 • Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Chen Qian, Chen Change Loy, Wayne Wu, Ziwei Liu
In addition, a model zoo and human editing applications are demonstrated to facilitate future research in the community.
no code implementations • 27 Apr 2022 • Yuanhan Zhang, Yichao Wu, Zhenfei Yin, Jing Shao, Ziwei Liu
In this work, we attempt to fill this gap by automatically addressing the noise problem from both label and data perspectives in a probabilistic manner.
no code implementations • 28 Apr 2022 • Zhongang Cai, Daxuan Ren, Ailing Zeng, Zhengyu Lin, Tao Yu, Wenjia Wang, Xiangyu Fan, Yang Gao, Yifan Yu, Liang Pan, Fangzhou Hong, Mingyuan Zhang, Chen Change Loy, Lei Yang, Ziwei Liu
4D human sensing and modeling are fundamental tasks in vision and graphics with numerous applications.
1 code implementation • 17 May 2022 • Fangzhou Hong, Mingyuan Zhang, Liang Pan, Zhongang Cai, Lei Yang, Ziwei Liu
Our key insight is to take advantage of the powerful vision-language model CLIP for supervising neural human generation, in terms of 3D geometry, texture and animation.
1 code implementation • 19 May 2022 • Xinpeng Ding, Ziwei Liu, Xiaomeng Li
Our key insight is to distill knowledge from publicly available models trained on large generic datasets4 to facilitate the self-supervised learning of surgical videos.
2 code implementations • 31 May 2022 • Yuming Jiang, Shuai Yang, Haonan Qiu, Wayne Wu, Chen Change Loy, Ziwei Liu
In this work, we present a text-driven controllable framework, Text2Human, for a high-quality and diverse human generation.
1 code implementation • 8 Jun 2022 • Bo Li, Yifei Shen, Jingkang Yang, Yezhen Wang, Jiawei Ren, Tong Che, Jun Zhang, Ziwei Liu
It is motivated by an empirical finding that transformer-based models trained with empirical risk minimization (ERM) outperform CNN-based models employing state-of-the-art (SOTA) DG algorithms on multiple DG datasets.
Ranked #11 on Domain Generalization on DomainNet (using extra training data)
1 code implementation • 9 Jun 2022 • Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu
The size of vision models has grown exponentially over the last few years, especially after the emergence of Vision Transformer.
Ranked #1 on Image Classification on OmniBenchmark (using extra training data)
3 code implementations • 15 Jun 2022 • Jiahao Xie, Wei Li, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy
We present Masked Frequency Modeling (MFM), a unified frequency-domain-based approach for self-supervised pre-training of visual models.
2 code implementations • CVPR 2023 • Lingdong Kong, Jiawei Ren, Liang Pan, Ziwei Liu
Densely annotating LiDAR point clouds is costly, which restrains the scalability of fully-supervised learning methods.
1 code implementation • 5 Jul 2022 • Rui Shao, Tianxing Wu, Ziwei Liu
Moreover, we build a comprehensive benchmark and set up rigorous evaluation protocols and metrics for this new research problem.
1 code implementation • 11 Jul 2022 • Long Zhuo, Guangcong Wang, Shikai Li, Wayne Wu, Ziwei Liu
In this paper, we present a spatial-temporal compression framework, \textbf{Fast-Vid2Vid}, which focuses on data aspects of generative models.
1 code implementation • 14 Jul 2022 • Yuanhan Zhang, Zhenfei Yin, Jing Shao, Ziwei Liu
We benchmark ReCo and other advances in omni-vision representation studies that are different in architectures (from CNNs to transformers) and in learning paradigms (from supervised learning to self-supervised learning) on OmniBenchmark.
1 code implementation • 14 Jul 2022 • Zhaoxi Chen, Ziwei Liu
Our key insight is that the space-time varying geometry and reflectance of the human body can be decomposed as a set of neural fields of normal, occlusion, diffuse, and specular maps.
1 code implementation • 20 Jul 2022 • Shenhan Qian, Jiale Xu, Ziwei Liu, Liqian Ma, Shenghua Gao
We propose united implicit functions (UNIF), a part-based method for clothed human reconstruction and animation with raw scans and skeletons as the input.
1 code implementation • 22 Jul 2022 • Jingkang Yang, Yi Zhe Ang, Zujin Guo, Kaiyang Zhou, Wayne Zhang, Ziwei Liu
Existing research addresses scene graph generation (SGG) -- a critical technology for scene understanding in images -- from a detection perspective, i. e., objects are detected using bounding boxes followed by prediction of their pairwise relationships.
Ranked #5 on Panoptic Scene Graph Generation on PSG Dataset
1 code implementation • 25 Jul 2022 • Hao Zhu, Wayne Wu, Wentao Zhu, Liming Jiang, Siwei Tang, Li Zhang, Ziwei Liu, Chen Change Loy
Large-scale datasets have played indispensable roles in the recent success of face generation/editing and significantly facilitated the advances of emerging research fields.
Ranked #1 on Unconditional Video Generation on CelebV-HQ
no code implementations • 27 Jul 2022 • Jianshu Li, Man Luo, Jian Liu, Tao Chen, Chengjie Wang, Ziwei Liu, Shuo Liu, Kewei Yang, Xuning Shao, Kang Chen, Boyuan Liu, Mingyu Guo, Ying Guo, Yingying Ao, Pengfei Gao
In this paper, we present the solutions from the Top 3 teams, in order to boost the research work in the field of image forgery detection.
1 code implementation • 29 Jul 2022 • Guangcong Wang, Yinuo Yang, Chen Change Loy, Ziwei Liu
To tackle this problem, we propose a coupled dual-StyleGAN panorama synthesis network (StyleLight) that integrates LDR and HDR panorama synthesis into a unified framework.
1 code implementation • 10 Aug 2022 • Zhipeng Luo, Changqing Zhou, Liang Pan, Gongjie Zhang, Tianrui Liu, Yueru Luo, Haiyu Zhao, Ziwei Liu, Shijian Lu
In a point cloud sequence, 3D object tracking aims to predict the location and orientation of an object in consecutive frames given an object template.
1 code implementation • 16 Aug 2022 • Haonan Qiu, Yuming Jiang, Hang Zhou, Wayne Wu, Ziwei Liu
Notably, StyleFaceV is capable of generating realistic $1024\times1024$ face videos even without high-resolution training videos.
no code implementations • 17 Aug 2022 • Ziwei Liu, Zhongqi Miao, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu
A practical recognition system must balance between majority (head) and minority (tail) classes, generalize across the distribution, and acknowledge novelty upon the instances of unseen classes (open classes).
1 code implementation • 18 Aug 2022 • Guodong Xu, Yuenan Hou, Ziwei Liu, Chen Change Loy
To further enhance the semantic consistency between the teacher and student model, we present a latent-direction-based distillation loss that preserves the semantic relations in latent space.
1 code implementation • 26 Aug 2022 • Tong Wu, Jiaqi Wang, Xingang Pan, Xudong Xu, Christian Theobalt, Ziwei Liu, Dahua Lin
Previous methods based on neural volume rendering mostly train a fully implicit model with MLPs, which typically require hours of training for a single scene.
2 code implementations • 31 Aug 2022 • Mingyuan Zhang, Zhongang Cai, Liang Pan, Fangzhou Hong, Xinying Guo, Lei Yang, Ziwei Liu
Instead of a deterministic language-motion mapping, MotionDiffuse generates motions through a series of denoising steps in which variations are injected.
Ranked #17 on Motion Synthesis on KIT Motion-Language
2 code implementations • 15 Sep 2022 • Kaiyang Zhou, Yuanhan Zhang, Yuhang Zang, Jingkang Yang, Chen Change Loy, Ziwei Liu
Another interesting observation is that the teacher-student gap on out-of-distribution data is bigger than that on in-distribution data, which highlights the capacity mismatch issue as well as the shortcoming of KD.
1 code implementation • 20 Sep 2022 • Zhaoxi Chen, Guangcong Wang, Ziwei Liu
To achieve super-resolution inverse tone mapping, we derive a continuous representation of 360-degree imaging from the LDR panorama as a set of structured latent codes anchored to the sphere.
1 code implementation • 21 Sep 2022 • Hui En Pang, Zhongang Cai, Lei Yang, Tianwei Zhang, Ziwei Liu
Experiments with 10 backbones, ranging from CNNs to transformers, show the knowledge learnt from a proximity task is readily transferable to human mesh recovery.
1 code implementation • 22 Sep 2022 • Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy
Although a series of successful portrait image toonification models built upon the powerful StyleGAN have been proposed, these image-oriented methods have obvious limitations when applied to videos, such as the fixed frame size, the requirement of face alignment, missing non-facial details and temporal inconsistency.
no code implementations • 27 Sep 2022 • Zhiliang Xu, Hang Zhou, Zhibin Hong, Ziwei Liu, Jiaming Liu, Zhizhi Guo, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang
Our core idea is to leverage a style-based generator to empower high-fidelity and robust face swapping, thus the generator's advantage can be adopted for optimizing identity similarity.
1 code implementation • 4 Oct 2022 • Xiaomeng Li, Hongyu Ren, Huifeng Yao, Ziwei Liu
In this paper, we propose TripleE, and the main idea is to encourage the network to focus on training on subsets (learning with replay) and enlarge the data space in learning on subsets.
1 code implementation • 10 Oct 2022 • Fangzhou Hong, Zhaoxi Chen, Yushi Lan, Liang Pan, Ziwei Liu
At the core of EVA3D is a compositional human NeRF representation, which divides the human body into local parts.
3 code implementations • 13 Oct 2022 • Jingkang Yang, Pengyun Wang, Dejian Zou, Zitang Zhou, Kunyuan Ding, Wenxuan Peng, Haoqi Wang, Guangyao Chen, Bo Li, Yiyou Sun, Xuefeng Du, Kaiyang Zhou, Wayne Zhang, Dan Hendrycks, Yixuan Li, Ziwei Liu
Out-of-distribution (OOD) detection is vital to safety-critical machine learning applications and has thus been extensively studied, with a plethora of methods developed in the literature.
no code implementations • 27 Oct 2022 • Zhendong Li, Wen Chen, Ziwei Liu, Hongying Tang, Jianmin Lu
We formulate a total energy consumption minimization problem by a joint optimization of subcarrier allocation, task input bits, time slot allocation, transmit power allocation and RMS transmissive coefficient while taking into account the constraints of communication resources and computing resources.
1 code implementation • 10 Nov 2022 • Li SiYao, Yuhang Li, Bo Li, Chao Dong, Ziwei Liu, Chen Change Loy
Existing correspondence datasets for two-dimensional (2D) cartoon suffer from simple frame composition and monotonic movements, making them insufficient to simulate real animations.
no code implementations • 5 Dec 2022 • Xian Liu, Qianyi Wu, Hang Zhou, Yuanqi Du, Wayne Wu, Dahua Lin, Ziwei Liu
Our key insight is that the co-speech gestures can be decomposed into common motion patterns and subtle rhythmic dynamics.
no code implementations • 9 Dec 2022 • Yasheng Sun, Hang Zhou, Kaisiyuan Wang, Qianyi Wu, Zhibin Hong, Jingtuo Liu, Errui Ding, Jingdong Wang, Ziwei Liu, Hideki Koike
This requires masking a large percentage of the original image and seamlessly inpainting it with the aid of audio and reference frames.
1 code implementation • 19 Dec 2022 • Yuming Jiang, Kelvin C. K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu
To tackle these challenges, we propose C2-Matching in this work, which performs explicit robust matching crossing transformation and resolution.
no code implementations • ICCV 2023 • Junzhe Zhang, Yushi Lan, Shuai Yang, Fangzhou Hong, Quan Wang, Chai Kiat Yeo, Ziwei Liu, Chen Change Loy
In this paper, we address the challenging problem of 3D toonification, which involves transferring the style of an artistic domain onto a target 3D face with stylized geometry and texture.
no code implementations • CVPR 2023 • Peng Wang, YuAn Liu, Zhaoxi Chen, Lingjie Liu, Ziwei Liu, Taku Komura, Christian Theobalt, Wenping Wang
Existing fast grid-based NeRF training frameworks, like Instant-NGP, Plenoxels, DVGO, or TensoRF, are mainly designed for bounded scenes and rely on space warping to handle unbounded scenes.
1 code implementation • CVPR 2023 • Tong Wu, Jiarui Zhang, Xiao Fu, Yuxin Wang, Jiawei Ren, Liang Pan, Wayne Wu, Lei Yang, Jiaqi Wang, Chen Qian, Dahua Lin, Ziwei Liu
Recent advances in modeling 3D objects mostly rely on synthetic datasets due to the lack of large-scale realscanned 3D databases.
1 code implementation • 26 Jan 2023 • Haotong Qin, Mingyuan Zhang, Yifu Ding, Aoyu Li, Zhongang Cai, Ziwei Liu, Fisher Yu, Xianglong Liu
Network binarization emerges as one of the most promising compression approaches offering extraordinary computation and memory savings by minimizing the bit-width.
1 code implementation • NeurIPS 2023 • Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu
To overcome the problem, we propose a prompt retrieval framework to automate the selection of in-context examples.
1 code implementation • 2 Feb 2023 • Zhaoxi Chen, Guangcong Wang, Ziwei Liu
Our approach begins with an efficient bird's-eye-view (BEV) representation generated from simplex noise, which includes a height field for surface elevation and a semantic field for detailed scene semantics.
Ranked #3 on Scene Generation on GoogleEarth
3 code implementations • 7 Feb 2023 • Da-Wei Zhou, Qi-Wei Wang, Zhi-Hong Qi, Han-Jia Ye, De-Chuan Zhan, Ziwei Liu
Deep models, e. g., CNNs and Vision Transformers, have achieved impressive achievements in many vision tasks in the closed world.
no code implementations • 14 Feb 2023 • Yasheng Sun, Qianyi Wu, Hang Zhou, Kaisiyuan Wang, Tianshu Hu, Chen-Chieh Liao, Shio Miyafuji, Ziwei Liu, Hideki Koike
Creating the photo-realistic version of people sketched portraits is useful to various entertainment purposes.
no code implementations • 18 Feb 2023 • Ce Zhou, Qian Li, Chen Li, Jun Yu, Yixin Liu, Guangjing Wang, Kai Zhang, Cheng Ji, Qiben Yan, Lifang He, Hao Peng, JianXin Li, Jia Wu, Ziwei Liu, Pengtao Xie, Caiming Xiong, Jian Pei, Philip S. Yu, Lichao Sun
This study provides a comprehensive review of recent research advancements, challenges, and opportunities for PFMs in text, image, graph, as well as other data modalities.
no code implementations • ICCV 2023 • Lingdong Kong, Youquan Liu, Runnan Chen, Yuexin Ma, Xinge Zhu, Yikang Li, Yuenan Hou, Yu Qiao, Ziwei Liu
We show that, for the first time, a range view method is able to surpass the point, voxel, and multi-view fusion counterparts in the competing LiDAR semantic and panoptic segmentation benchmarks, i. e., SemanticKITTI, nuScenes, and ScribbleKITTI.
Ranked #4 on 3D Semantic Segmentation on SemanticKITTI
1 code implementation • ICCV 2023 • Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy
Recent advances in face manipulation using StyleGAN have produced impressive results.
2 code implementations • 13 Mar 2023 • Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan, Ziwei Liu
ADAM is a general framework that can be orthogonally combined with any parameter-efficient tuning method, which holds the advantages of PTM's generalizability and adapted model's adaptivity.
1 code implementation • CVPR 2023 • Lingting Zhu, Xian Liu, Xuanyu Liu, Rui Qian, Ziwei Liu, Lequan Yu
In this work, we propose a novel diffusion-based framework, named Diffusion Co-Speech Gesture (DiffGesture), to effectively capture the cross-modal audio-to-gesture associations and preserve temporal coherence for high-fidelity audio-driven co-speech gesture generation.
1 code implementation • ICCV 2023 • Shoukang Hu, Fangzhou Hong, Liang Pan, Haiyi Mei, Lei Yang, Ziwei Liu
To this end, we propose a bank of 3D-aware hierarchical features, including global, point-level, and pixel-aligned features, to facilitate informative encoding.
2 code implementations • 23 Mar 2023 • Ziqi Huang, Tianxing Wu, Yuming Jiang, Kelvin C. K. Chan, Ziwei Liu
Specifically, we propose a novel relation-steering contrastive learning scheme to impose two critical properties of the relation prompt: 1) The relation prompt should capture the interaction between objects, enforced by the preposition prior.
no code implementations • 23 Mar 2023 • Ziwei Liu, Yongtao Wang, Xiaojie Chu
Specifically, we propose a learnable nonlinear channel-wise transformation to align the features of the student and the teacher model.
1 code implementation • 28 Mar 2023 • Peng Wang, YuAn Liu, Zhaoxi Chen, Lingjie Liu, Ziwei Liu, Taku Komura, Christian Theobalt, Wenping Wang
Based on our analysis, we further propose a novel space-warping method called perspective warping, which allows us to handle arbitrary trajectories in the grid-based NeRF framework.
no code implementations • ICCV 2023 • Guangcong Wang, Zhaoxi Chen, Chen Change Loy, Ziwei Liu
Since coarse depth maps are not strictly scaled to the ground-truth depth maps, we propose a simple yet effective constraint, a local depth ranking method, on NeRFs such that the expected depth ranking of the NeRF is consistent with that of the coarse depth maps in local patches.
1 code implementation • ICCV 2023 • Zhitao Yang, Zhongang Cai, Haiyi Mei, Shuai Liu, Zhaoxi Chen, Weiye Xiao, Yukun Wei, Zhongfei Qing, Chen Wei, Bo Dai, Wayne Wu, Chen Qian, Dahua Lin, Ziwei Liu, Lei Yang
Synthetic data has emerged as a promising source for 3D human research as it offers low-cost access to large-scale human datasets.
1 code implementation • ICCV 2023 • Lingdong Kong, Youquan Liu, Xin Li, Runnan Chen, Wenwei Zhang, Jiawei Ren, Liang Pan, Kai Chen, Ziwei Liu
The robustness of 3D perception systems under natural corruptions from environments and sensors is pivotal for safety-critical applications.
1 code implementation • ICCV 2023 • Mingyuan Zhang, Xinying Guo, Liang Pan, Zhongang Cai, Fangzhou Hong, Huirong Li, Lei Yang, Ziwei Liu
However, the performance on more diverse motions remains unsatisfactory.
Ranked #1 on Motion Synthesis on KIT Motion-Language
1 code implementation • CVPR 2023 • Rui Shao, Tianxing Wu, Ziwei Liu
In this paper, we highlight a new research problem for multi-modal fake media, namely Detecting and Grounding Multi-Modal Media Manipulation (DGM^4).
2 code implementations • 6 Apr 2023 • Jiawei Ren, Cunjun Yu, Siwei Chen, Xiao Ma, Liang Pan, Ziwei Liu
Motion mimicking is a foundational task in physics-based character animation.
1 code implementation • 13 Apr 2023 • Shaoyuan Xie, Lingdong Kong, Wenwei Zhang, Jiawei Ren, Liang Pan, Kai Chen, Ziwei Liu
Our experiments further demonstrate that pre-training and depth-free BEV transformation has the potential to enhance out-of-distribution robustness.
1 code implementation • ICCV 2023 • Yuming Jiang, Shuai Yang, Tong Liang Koh, Wayne Wu, Chen Change Loy, Ziwei Liu
In this work, we present Text2Performer to generate vivid human videos with articulated motions from texts.
no code implementations • 18 Apr 2023 • Liang Pan, Xinyi Chen, Zhongang Cai, Junzhe Zhang, Haiyu Zhao, Shuai Yi, Ziwei Liu
Existing point cloud completion methods tend to generate global shape skeletons and hence lack fine local details.
2 code implementations • 19 Apr 2023 • Xiangtai Li, Henghui Ding, Haobo Yuan, Wenwei Zhang, Jiangmiao Pang, Guangliang Cheng, Kai Chen, Ziwei Liu, Chen Change Loy
Recently, transformers, a type of neural network based on self-attention originally designed for natural language processing, have considerably surpassed previous convolutional or recurrent approaches in various vision processing tasks.
1 code implementation • CVPR 2023 • Ziqi Huang, Kelvin C. K. Chan, Yuming Jiang, Ziwei Liu
In this work, we present Collaborative Diffusion, where pre-trained uni-modal diffusion models collaborate to achieve multi-modal face generation and editing without re-training.
no code implementations • 4 May 2023 • Ziwei Liu, Wen Chen, Zhendong Li, Jinhong Yuan, Qingqing Wu, Kunlun Wang
In this paper, we investigated the downlink transmission problem of a cognitive radio network (CRN) equipped with a novel transmissive reconfigurable intelligent surface (TRIS) transmitter.
1 code implementation • 5 May 2023 • Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Jingkang Yang, Ziwei Liu
Large language models (LLMs) have demonstrated significant universal capabilities as few/zero-shot learners in various tasks due to their pre-training on vast amounts of text data, as exemplified by GPT-3, which boosted to InstrctGPT and ChatGPT, effectively following natural language instructions to accomplish real-world tasks.
Ranked #8 on Visual Question Answering on BenchLMM
no code implementations • CVPR 2023 • Jiazhi Guan, Zhanwang Zhang, Hang Zhou, Tianshu Hu, Kaisiyuan Wang, Dongliang He, Haocheng Feng, Jingtuo Liu, Errui Ding, Ziwei Liu, Jingdong Wang
Despite recent advances in syncing lip movements with any audio waves, current methods still struggle to balance generation quality and the model's generalization ability.
1 code implementation • 18 May 2023 • Shoukang Hu, Kaichen Zhou, Kaiyu Li, Longhui Yu, Lanqing Hong, Tianyang Hu, Zhenguo Li, Gim Hee Lee, Ziwei Liu
In this paper, we propose ConsistentNeRF, a method that leverages depth information to regularize both multi-view and single-view 3D consistency among pixels.
1 code implementation • NeurIPS 2023 • Dongwei Pan, Long Zhuo, Jingtan Piao, Huiwen Luo, Wei Cheng, Yuxin Wang, Siming Fan, Shengqi Liu, Lei Yang, Bo Dai, Ziwei Liu, Chen Change Loy, Chen Qian, Wayne Wu, Dahua Lin, Kwan-Yee Lin
It is a large-scale digital library for head avatars with three key attributes: 1) High Fidelity: all subjects are captured by 60 synchronized, high-resolution 2K cameras in 360 degrees.
1 code implementation • 23 May 2023 • Jun Cen, Yizheng Wu, Kewei Wang, Xingyi Li, Jingkang Yang, Yixuan Pei, Lingdong Kong, Ziwei Liu, Qifeng Chen
The Segment Anything Model (SAM) has demonstrated its effectiveness in segmenting any part of 2D RGB images.
Open Vocabulary Semantic Segmentation Panoptic Segmentation +1
no code implementations • 30 May 2023 • Da-Wei Zhou, Yuanhan Zhang, Jingyi Ning, Han-Jia Ye, De-Chuan Zhan, Ziwei Liu
While traditional CIL methods focus on visual information to grasp core features, recent advances in Vision-Language Models (VLM) have shown promising capabilities in learning generalizable representations with the aid of textual information.
1 code implementation • 1 Jun 2023 • Rui Shao, Tianxing Wu, Liqiang Nie, Ziwei Liu
Unlike existing deepfake detection methods merely focusing on low-level forgery patterns, the forgery detection process of our model can be regularized by generalizable high-level semantics from a pre-trained ViT and adapted by global and local low-level forgeries of deepfake data.
1 code implementation • 7 Jun 2023 • Shuai Yang, Liming Jiang, Ziwei Liu, Chen Change Loy
In this paper, we introduce a novel versatile framework, Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT), that improves the quality, applicability and controllability of the existing translation models.
2 code implementations • 8 Jun 2023 • Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Fanyi Pu, Jingkang Yang, Chunyuan Li, Ziwei Liu
We release the MIMIC-IT dataset, instruction-response collection pipeline, benchmarks, and the Otter model.
Ranked #83 on Visual Question Answering on MM-Vet
no code implementations • 13 Jun 2023 • Shuai Yang, Yifan Zhou, Ziwei Liu, Chen Change Loy
The framework includes two parts: key frame translation and full video translation.
2 code implementations • NeurIPS 2023 • Youquan Liu, Lingdong Kong, Jun Cen, Runnan Chen, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu
Recent advancements in vision foundation models (VFMs) have opened up new possibilities for versatile and efficient visual perception.
1 code implementation • 15 Jun 2023 • Jingyang Zhang, Jingkang Yang, Pengyun Wang, Haoqi Wang, Yueqian Lin, Haoran Zhang, Yiyou Sun, Xuefeng Du, Kaiyang Zhou, Wayne Zhang, Yixuan Li, Ziwei Liu, Yiran Chen, Hai Li
Out-of-Distribution (OOD) detection is critical for the reliable operation of open-world intelligent systems.
Out-of-Distribution Detection Out of Distribution (OOD) Detection
1 code implementation • 26 Jun 2023 • Binzhu Xie, Sicheng Zhang, Zitang Zhou, Bo Li, Yuanhan Zhang, Jack Hessel, Jingkang Yang, Ziwei Liu
Surprising videos, such as funny clips, creative performances, or visual illusions, attract significant attention.
2 code implementations • 12 Jul 2023 • YuAn Liu, Haodong Duan, Yuanhan Zhang, Bo Li, Songyang Zhang, Wangbo Zhao, Yike Yuan, Jiaqi Wang, Conghui He, Ziwei Liu, Kai Chen, Dahua Lin
In response to these challenges, we propose MMBench, a novel multi-modality benchmark.
Ranked #1 on Visual Question Answering on MMBench
1 code implementation • 13 Jul 2023 • Yi Wang, Yinan He, Yizhuo Li, Kunchang Li, Jiashuo Yu, Xin Ma, Xinhao Li, Guo Chen, Xinyuan Chen, Yaohui Wang, Conghui He, Ping Luo, Ziwei Liu, Yali Wang, LiMin Wang, Yu Qiao
Specifically, we utilize a multi-scale approach to generate video-related descriptions.
1 code implementation • 17 Jul 2023 • Jinghao Wang, Zhengyu Wen, Xiangtai Li, Zujin Guo, Jingkang Yang, Ziwei Liu
Panoptic Scene Graph (PSG) is a challenging task in Scene Graph Generation (SGG) that aims to create a more comprehensive scene graph representation using panoptic segmentation instead of boxes.
no code implementations • 19 Jul 2023 • Lingting Zhu, Zeyue Xue, Zhenchao Jin, Xian Liu, Jingzhen He, Ziwei Liu, Lequan Yu
This paradigm extends the 2D image diffusion model to a volumetric version with a slightly increasing number of parameters and computation, offering a principled solution for generic cross-modality 3D medical image synthesis.
1 code implementation • ICCV 2023 • Wei Cheng, Ruixiang Chen, Wanqi Yin, Siming Fan, Keyu Chen, Honglin He, Huiwen Luo, Zhongang Cai, Jingbo Wang, Yang Gao, Zhengming Yu, Zhengyu Lin, Daxuan Ren, Lei Yang, Ziwei Liu, Chen Change Loy, Chen Qian, Wayne Wu, Dahua Lin, Bo Dai, Kwan-Yee Lin
Realistic human-centric rendering plays a key role in both computer vision and computer graphics.
no code implementations • 25 Jul 2023 • Bo Li, Haotian Liu, Liangyu Chen, Yong Jae Lee, Chunyuan Li, Ziwei Liu
Advancements in large pre-trained generative models have expanded their potential as effective data generators in visual recognition.
1 code implementation • 10 Aug 2023 • Ziyuan Huang, Shiwei Zhang, Liang Pan, Zhiwu Qing, Yingya Zhang, Ziwei Liu, Marcelo H. Ang Jr
Spatial convolutions are extensively used in numerous deep video models.
Ranked #3 on Action Recognition on EPIC-KITCHENS-100 (using extra training data)
1 code implementation • 14 Aug 2023 • Weichen Fan, Jinghuan Chen, Ziwei Liu
In this work, we propose Hierarchy Flow, a novel flow-based model to achieve better content preservation during translation.
1 code implementation • 15 Aug 2023 • Yan Tai, Weichen Fan, Zhao Zhang, Feng Zhu, Rui Zhao, Ziwei Liu
The ability to learn from context with novel concepts, and deliver appropriate responses are essential in human conversations.
no code implementations • 18 Aug 2023 • Shoukang Hu, Fangzhou Hong, Tao Hu, Liang Pan, Haiyi Mei, Weiye Xiao, Lei Yang, Ziwei Liu
In this work, we propose HumanLiff, the first layer-wise 3D human generative model with a unified diffusion process.
1 code implementation • 20 Aug 2023 • Ziang Cao, Ziyuan Huang, Liang Pan, Shiwei Zhang, Ziwei Liu, Changhong Fu
To handle those problems, we propose a two-level framework (TCTrack) that can exploit temporal contexts efficiently.
no code implementations • 28 Aug 2023 • Zhongang Cai, Liang Pan, Chen Wei, Wanqi Yin, Fangzhou Hong, Mingyuan Zhang, Chen Change Loy, Lei Yang, Ziwei Liu
To tackle these challenges, we propose a principled framework, PointHPS, for accurate 3D HPS from point clouds captured in real-world settings, which iteratively refines point features through a cascaded architecture.
1 code implementation • 1 Sep 2023 • Haozhe Xie, Zhaoxi Chen, Fangzhou Hong, Ziwei Liu
3D city generation is a desirable yet challenging task, since humans are more sensitive to structural distortions in urban environments.
Ranked #1 on Scene Generation on OSM