no code implementations • 27 Nov 2024 • Wentao Wang, Hang Ye, Fangzhou Hong, Xue Yang, Jianfu Zhang, Yizhou Wang, Ziwei Liu, Liang Pan
2) With the help of the pretrained human prior models, the Geometry Initialization-&-Sculpting pipeline is leveraged to recover high-quality 3D human geometry given a single image.
no code implementations • 12 Nov 2024 • Yushi Lan, Shangchen Zhou, Zhaoyang Lyu, Fangzhou Hong, Shuai Yang, Bo Dai, Xingang Pan, Chen Change Loy
While 3D content generation has advanced significantly, existing methods still face challenges with input formats, latent space design, and output representations.
no code implementations • 26 Sep 2024 • Fangzhou Hong, Vladimir Guzov, Hyo Jin Kim, Yuting Ye, Richard Newcombe, Ziwei Liu, Lingni Ma
Multi-modal sensor inputs are encoded and projected to the joint latent space of language models, and used to prompt motion generation or text generation for egomotion tracking or understanding, respectively.
no code implementations • 20 Sep 2024 • Vladimir Guzov, Yifeng Jiang, Fangzhou Hong, Gerard Pons-Moll, Richard Newcombe, C. Karen Liu, Yuting Ye, Lingni Ma
This paper investigates the online generation of realistic full-body human motion using a single head-mounted device with an outward-facing color camera and the ability to perform visual SLAM.
1 code implementation • 19 Sep 2024 • Zhaoxi Chen, Jiaxiang Tang, Yuhao Dong, Ziang Cao, Fangzhou Hong, Yushi Lan, Tengfei Wang, Haozhe Xie, Tong Wu, Shunsuke Saito, Liang Pan, Dahua Lin, Ziwei Liu
The increasing demand for high-quality 3D assets across various industries necessitates efficient and automated 3D content creation.
no code implementations • 14 Jun 2024 • Lingni Ma, Yuting Ye, Fangzhou Hong, Vladimir Guzov, Yifeng Jiang, Rowan Postyeni, Luis Pesqueira, Alexander Gamino, Vijay Baiyya, Hyo Jin Kim, Kevin Bailey, David Soriano Fosas, C. Karen Liu, Ziwei Liu, Jakob Engel, Renzo De Nardi, Richard Newcombe
To the best of our knowledge, the Nymeria dataset is the world's largest collection of human motion in the wild; the first of its kind to provide synchronized and localized multi-device multimodal egocentric data; and the world's largest motion-language dataset.
1 code implementation • 10 Jun 2024 • Haozhe Xie, Zhaoxi Chen, Fangzhou Hong, Ziwei Liu
Recently, 3D Gaussian Splatting (3D-GS) has emerged as a highly efficient alternative for object-level 3D generation.
Ranked #1 on Scene Generation on KITTI
3 code implementations • NeurIPS 2023 • Jingkang Yang, Jun Cen, Wenxuan Peng, Shuai Liu, Fangzhou Hong, Xiangtai Li, Kaiyang Zhou, Qifeng Chen, Ziwei Liu
To facilitate research in this new area, we build a richly annotated PSG-4D dataset consisting of 3K RGB-D videos with a total of 1M frames, each of which is labeled with 4D panoptic segmentation masks as well as fine-grained, dynamic scene graphs.
1 code implementation • 13 May 2024 • Ziang Cao, Fangzhou Hong, Tong Wu, Liang Pan, Ziwei Liu
Therefore, we introduce a diffusion-based feed-forward framework to address these challenges with a single model.
no code implementations • 2 Apr 2024 • Tao Hu, Fangzhou Hong, Zhaoxi Chen, Ziwei Liu
FashionEngine automates 3D human production with three key components: 1) A pre-trained 3D human diffusion model that learns to model 3D humans in a semantic UV latent space from 2D image training data, which provides strong priors for diverse generation and editing tasks.
no code implementations • 1 Apr 2024 • Tao Hu, Fangzhou Hong, Ziwei Liu
2) A structured 3D-aware auto-decoder that factorizes the global latent space into several semantic body parts parameterized by a set of conditional structured local NeRFs anchored to the body template, which embeds the properties learned from the 2D training data and can be decoded to render view-consistent humans under different poses and clothing styles.
no code implementations • CVPR 2024 • Tao Hu, Fangzhou Hong, Ziwei Liu
2) Physical motion decoding that is designed to encourage physical motion learning by decoding the motion triplane features at timestep t to predict both spatial derivatives and temporal derivatives at the next timestep t+1 in the training stage.
no code implementations • 1 Apr 2024 • Mingyuan Zhang, Daisheng Jin, Chenyang Gu, Fangzhou Hong, Zhongang Cai, Jingfang Huang, Chongzhi Zhang, Xinying Guo, Lei Yang, Ying He, Ziwei Liu
In this work, we present Large Motion Model (LMM), a motion-centric, multi-modal framework that unifies mainstream motion generation tasks into a generalist model.
1 code implementation • 18 Mar 2024 • Yushi Lan, Fangzhou Hong, Shuai Yang, Shangchen Zhou, Xuyi Meng, Bo Dai, Xingang Pan, Chen Change Loy
The latent is decoded by a transformer-based decoder into a high-capacity 3D neural field.
1 code implementation • 4 Mar 2024 • Fangzhou Hong, Jiaxiang Tang, Ziang Cao, Min Shi, Tong Wu, Zhaoxi Chen, Shuai Yang, Tengfei Wang, Liang Pan, Dahua Lin, Ziwei Liu
Specifically, it is powered by a text-conditioned tri-plane latent diffusion model, which quickly generates coarse 3D samples for fast prototyping.
1 code implementation • NeurIPS 2023 • Zhaoxi Chen, Fangzhou Hong, Haiyi Mei, Guangcong Wang, Lei Yang, Ziwei Liu
We present PrimDiffusion, the first diffusion-based framework for 3D human generation.
1 code implementation • 14 Sep 2023 • Ziang Cao, Fangzhou Hong, Tong Wu, Liang Pan, Ziwei Liu
To this end, we propose a novel triplane-based 3D-aware Diffusion model with TransFormer, DiffTF, which addresses these challenges in three aspects.
1 code implementation • 8 Sep 2023 • Junzhe Zhang, Yushi Lan, Shuai Yang, Fangzhou Hong, Quan Wang, Chai Kiat Yeo, Ziwei Liu, Chen Change Loy
In this paper, we address the challenging problem of 3D toonification, which involves transferring the style of an artistic domain onto a target 3D face with stylized geometry and texture.
1 code implementation • CVPR 2024 • Haozhe Xie, Zhaoxi Chen, Fangzhou Hong, Ziwei Liu
3D city generation is a desirable yet challenging task, since humans are more sensitive to structural distortions in urban environments.
Ranked #2 on Scene Generation on GoogleEarth (KID metric)
no code implementations • 28 Aug 2023 • Zhongang Cai, Liang Pan, Chen Wei, Wanqi Yin, Fangzhou Hong, Mingyuan Zhang, Chen Change Loy, Lei Yang, Ziwei Liu
To tackle these challenges, we propose a principled framework, PointHPS, for accurate 3D HPS from point clouds captured in real-world settings, which iteratively refines point features through a cascaded architecture.
no code implementations • 18 Aug 2023 • Shoukang Hu, Fangzhou Hong, Tao Hu, Liang Pan, Haiyi Mei, Weiye Xiao, Lei Yang, Ziwei Liu
In this work, we propose HumanLiff, the first layer-wise 3D human generative model with a unified diffusion process.
1 code implementation • ICCV 2023 • Mingyuan Zhang, Xinying Guo, Liang Pan, Zhongang Cai, Fangzhou Hong, Huirong Li, Lei Yang, Ziwei Liu
However, the performance on more diverse motions remains unsatisfactory.
Ranked #1 on Motion Synthesis on KIT Motion-Language
1 code implementation • ICCV 2023 • Shoukang Hu, Fangzhou Hong, Liang Pan, Haiyi Mei, Lei Yang, Ziwei Liu
To this end, we propose a bank of 3D-aware hierarchical features, including global, point-level, and pixel-aligned features, to facilitate informative encoding.
no code implementations • ICCV 2023 • Junzhe Zhang, Yushi Lan, Shuai Yang, Fangzhou Hong, Quan Wang, Chai Kiat Yeo, Ziwei Liu, Chen Change Loy
In this paper, we address the challenging problem of 3D toonification, which involves transferring the style of an artistic domain onto a target 3D face with stylized geometry and texture.
1 code implementation • 10 Oct 2022 • Fangzhou Hong, Zhaoxi Chen, Yushi Lan, Liang Pan, Ziwei Liu
At the core of EVA3D is a compositional human NeRF representation, which divides the human body into local parts.
2 code implementations • 31 Aug 2022 • Mingyuan Zhang, Zhongang Cai, Liang Pan, Fangzhou Hong, Xinying Guo, Lei Yang, Ziwei Liu
Instead of a deterministic language-motion mapping, MotionDiffuse generates motions through a series of denoising steps in which variations are injected.
Ranked #21 on Motion Synthesis on KIT Motion-Language
1 code implementation • 17 May 2022 • Fangzhou Hong, Mingyuan Zhang, Liang Pan, Zhongang Cai, Lei Yang, Ziwei Liu
Our key insight is to take advantage of the powerful vision-language model CLIP for supervising neural human generation, in terms of 3D geometry, texture and animation.
no code implementations • 28 Apr 2022 • Zhongang Cai, Daxuan Ren, Ailing Zeng, Zhengyu Lin, Tao Yu, Wenjia Wang, Xiangyu Fan, Yang Gao, Yifan Yu, Liang Pan, Fangzhou Hong, Mingyuan Zhang, Chen Change Loy, Lei Yang, Ziwei Liu
4D human sensing and modeling are fundamental tasks in vision and graphics with numerous applications.
1 code implementation • CVPR 2022 • Fangzhou Hong, Liang Pan, Zhongang Cai, Ziwei Liu
To tackle these challenges, we design novel Dense Intra-sample Contrastive Learning and Sparse Structure-aware Contrastive Learning targets by hierarchically learning a modal-invariant latent space featuring a continuous and ordinal feature distribution and structure-aware semantic consistency.
1 code implementation • 14 Mar 2022 • Fangzhou Hong, Hui Zhou, Xinge Zhu, Hongsheng Li, Ziwei Liu
In this work, we address the task of LiDAR-based panoptic segmentation, which aims to parse both objects and scenes in a unified manner.
Ranked #6 on 4D Panoptic Segmentation on SemanticKITTI
1 code implementation • NeurIPS 2021 • Fangzhou Hong, Liang Pan, Zhongang Cai, Ziwei Liu
The main challenges are two-fold: 1) effective 3D feature learning for fine details, and 2) capture of garment dynamics caused by the interaction between garments and the human body, especially for loose garments like skirts.
1 code implementation • 12 Sep 2021 • Xinge Zhu, Hui Zhou, Tai Wang, Fangzhou Hong, Wei Li, Yuexin Ma, Hongsheng Li, Ruigang Yang, Dahua Lin
In this paper, we benchmark our model on these three tasks.
1 code implementation • CVPR 2021 • Fangzhou Hong, Hui Zhou, Xinge Zhu, Hongsheng Li, Ziwei Liu
2) Dynamic Shifting for complex point distributions.
Ranked #2 on Panoptic Segmentation on SemanticKITTI
2 code implementations • CVPR 2021 • Xinge Zhu, Hui Zhou, Tai Wang, Fangzhou Hong, Yuexin Ma, Wei Li, Hongsheng Li, Dahua Lin
However, we found that in outdoor point clouds, the improvement obtained in this way is quite limited.
Ranked #3 on 3D Semantic Segmentation on ScribbleKITTI
no code implementations • 18 Mar 2020 • Xinhai Liu, Zhizhong Han, Fangzhou Hong, Yu-Shen Liu, Matthias Zwicker
However, due to the irregularity and sparsity of sampled point clouds, it is hard to encode the fine-grained geometry of local regions and their spatial relationships when using only fixed-size filters and individual local feature integration, which limits the ability to learn discriminative features.