AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars

1 code implementation 17 May 2022 Fangzhou Hong, Mingyuan Zhang, Liang Pan, Zhongang Cai, Lei Yang, Ziwei Liu

Our key insight is to take advantage of the powerful vision-language model CLIP for supervising neural human generation, in terms of 3D geometry, texture and animation.

Language Modelling, motion synthesis

Versatile Multi-Modal Pre-Training for Human-Centric Perception

1 code implementation 25 Mar 2022 Fangzhou Hong, Liang Pan, Zhongang Cai, Ziwei Liu

To tackle these challenges, we design the novel Dense Intra-sample Contrastive Learning and Sparse Structure-aware Contrastive Learning targets by hierarchically learning a modal-invariant latent space featuring a continuous and ordinal feature distribution and structure-aware semantic consistency.

Contrastive Learning, Human Parsing

Garment4D: Garment Reconstruction from Point Cloud Sequences

1 code implementation NeurIPS 2021 Fangzhou Hong, Liang Pan, Zhongang Cai, Ziwei Liu

The main challenges are two-fold: 1) effective 3D feature learning for fine details, and 2) capture of garment dynamics caused by the interaction between garments and the human body, especially for loose garments like skirts.

PTTR: Relational 3D Point Cloud Object Tracking with Transformer

1 code implementation 6 Dec 2021 Changqing Zhou, Zhipeng Luo, Yueru Luo, Tianrui Liu, Liang Pan, Zhongang Cai, Haiyu Zhao, Shijian Lu

In a point cloud sequence, 3D object tracking aims to predict the location and orientation of an object in the current search point cloud given a template point cloud.

3D Object Tracking, Object Tracking

Robust Partial-to-Partial Point Cloud Registration in a Full Range

1 code implementation 30 Nov 2021 Liang Pan, Zhongang Cai, Ziwei Liu

3) Based on a synergy of hierarchical graph networks and graphical modeling, we propose the Hierarchical Graphical Modeling (HGM) architecture to encode robust descriptors consisting of i) a unary term learned from RI features; and ii) multiple smoothness terms encoded from neighboring point relations at different scales through our TPT modules.

Graph Matching, Point Cloud Registration

Playing for 3D Human Recovery

no code implementations 14 Oct 2021 Zhongang Cai, Mingyuan Zhang, Jiawei Ren, Chen Wei, Daxuan Ren, Jiatong Li, Zhengyu Lin, Haiyu Zhao, Shuai Yi, Lei Yang, Chen Change Loy, Ziwei Liu

Image- and video-based 3D human recovery (i.e., pose and shape estimation) has achieved substantial progress.

MeshInversion: 3D textured mesh reconstruction with generative prior

no code implementations 29 Sep 2021 Junzhe Zhang, Daxuan Ren, Zhongang Cai, Chai Kiat Yeo, Bo Dai, Chen Change Loy

Reconstruction is achieved by searching for a latent space in the 3D GAN that best resembles the target mesh in accordance with the single view observation.

Unsupervised Domain Adaptive 3D Detection with Multi-Level Consistency

1 code implementation ICCV 2021 Zhipeng Luo, Zhongang Cai, Changqing Zhou, Gongjie Zhang, Haiyu Zhao, Shuai Yi, Shijian Lu, Hongsheng Li, Shanghang Zhang, Ziwei Liu

In addition, existing 3D domain adaptive detection methods often assume prior access to the target domain annotations, which is rarely feasible in the real world.

3D Object Detection, Autonomous Driving

Delving Deep into the Generalization of Vision Transformers under Distribution Shifts

1 code implementation 14 Jun 2021 Chongzhi Zhang, Mingyuan Zhang, Shanghang Zhang, Daisheng Jin, Qiang Zhou, Zhongang Cai, Haiyu Zhao, Xianglong Liu, Ziwei Liu

By comprehensively investigating these GE-ViTs and comparing them with their corresponding CNN models, we observe: 1) for the enhanced models, larger ViTs still benefit more in OOD generalization.

Out-of-Distribution Generalization, Self-Supervised Learning

Unsupervised 3D Shape Completion through GAN Inversion

no code implementations CVPR 2021 Junzhe Zhang, Xinyi Chen, Zhongang Cai, Liang Pan, Haiyu Zhao, Shuai Yi, Chai Kiat Yeo, Bo Dai, Chen Change Loy

In contrast to previous fully supervised approaches, in this paper we present ShapeInversion, which introduces Generative Adversarial Network (GAN) inversion to shape completion for the first time.

Variational Relational Point Completion Network

1 code implementation CVPR 2021 Liang Pan, Xinyi Chen, Zhongang Cai, Junzhe Zhang, Haiyu Zhao, Shuai Yi, Ziwei Liu

In particular, we propose a dual-path architecture to enable principled probabilistic modeling across partial and complete clouds.

Point Cloud Completion

REFINE: Prediction Fusion Network for Panoptic Segmentation

no code implementations 15 Dec 2020 Jiawei Ren, Cunjun Yu, Zhongang Cai, Mingyuan Zhang, Chongsong Chen, Haiyu Zhao, Shuai Yi, Hongsheng Li

Panoptic segmentation aims to generate class and instance predictions for each pixel in the input image, a challenging task that is far more complicated than naively fusing semantic and instance segmentation results.

Instance Segmentation, Panoptic Segmentation

BiPointNet: Binary Neural Network for Point Clouds

1 code implementation ICLR 2021 Haotong Qin, Zhongang Cai, Mingyuan Zhang, Yifu Ding, Haiyu Zhao, Shuai Yi, Xianglong Liu, Hao Su

To alleviate the resource constraint for real-time point cloud applications that run on edge devices, in this paper we present BiPointNet, the first model binarization approach for efficient deep learning on point clouds.


Leveraging Localization for Multi-camera Association

no code implementations 7 Aug 2020 Zhongang Cai, Cunjun Yu, Junzhe Zhang, Jiawei Ren, Haiyu Zhao

We present McAssoc, a deep learning approach to the association of detection bounding boxes in different views of a multi-camera system.

MessyTable: Instance Association in Multiple Camera Views

no code implementations ECCV 2020 Zhongang Cai, Junzhe Zhang, Daxuan Ren, Cunjun Yu, Haiyu Zhao, Shuai Yi, Chai Kiat Yeo, Chen Change Loy

We present an interesting and challenging dataset that features a large number of scenes with messy tables captured from multiple camera views.

Leveraging Temporal Information for 3D Detection and Domain Adaptation

1 code implementation 30 Jun 2020 Cunjun Yu, Zhongang Cai, Daxuan Ren, Haiyu Zhao

Since LiDAR came into prevalent use in autonomous driving, tremendous improvements have been made to learning on point clouds.

Autonomous Driving, Domain Adaptation

3D Convolution on RGB-D Point Clouds for Accurate Model-free Object Pose Estimation

no code implementations 29 Dec 2018 Zhongang Cai, Cunjun Yu, Quang-Cuong Pham

Conventional pose estimation of a 3D object usually requires knowledge of the object's 3D model.

Pose Estimation, Robotic Grasping

