no code implementations • 25 Feb 2014 • Pengfei Wan, Gene Cheung, Philip A. Chou, Dinei Florencio, Cha Zhang, Oscar C. Au
In texture-plus-depth representation of a 3D scene, depth maps from different camera viewpoints are typically lossily compressed via the classical transform coding / coefficient quantization paradigm.
no code implementations • CVPR 2022 • Zhaoqing Wang, Qiang Li, Guoxin Zhang, Pengfei Wan, Wen Zheng, Nannan Wang, Mingming Gong, Tongliang Liu
By considering the spatial correspondence, dense self-supervised representation learning has achieved superior performance on various dense prediction tasks.
no code implementations • 8 Dec 2021 • Jiayi Guo, Chaoqun Du, Jiangshan Wang, Huijuan Huang, Pengfei Wan, Gao Huang
For Reference-guided Image Synthesis (RIS) tasks, i. e., rendering a source image in the style of another reference image, where assessing the quality of a single generated image is crucial, these metrics are not applicable.
no code implementations • CVPR 2022 • Linfeng Zhang, Xin Chen, Xiaobing Tu, Pengfei Wan, Ning Xu, Kaisheng Ma
Instead of directly distilling the generated images of teachers, wavelet knowledge distillation first decomposes the images into different frequency bands with discrete wavelet transformation and then only distills the high frequency bands.
no code implementations • 30 Mar 2022 • Wanfeng Zheng, Qiang Li, Guoxin Zhang, Pengfei Wan, Zhongyuan Wang
Unpaired image-to-image translation is to translate an image from a source domain to a target domain without paired training data.
no code implementations • 10 Oct 2022 • Wanfeng Zheng, Qiang Li, Xiaoyan Guo, Pengfei Wan, Zhongyuan Wang
More specifically, our efforts consist of three parts: 1) a data-free training strategy to train latent mappers to bridge the latent space of CLIP and StyleGAN; 2) for more precise mapping, temporal relative consistency is proposed to address the knowledge distribution bias problem among different latent spaces; 3) to refine the mapped latent in s space, adaptive style mixing is also proposed.
no code implementations • 29 May 2023 • Mengtian Li, Yi Dong, Minxuan Lin, Haibin Huang, Pengfei Wan, Chongyang Ma
We also introduce a two-stage training strategy, where we train the encoder in the first stage to align the feature maps with StyleGAN and enable a faithful reconstruction of input faces.
no code implementations • 8 Sep 2023 • Haotian Yang, Mingwu Zheng, Wanquan Feng, Haibin Huang, Yu-Kun Lai, Pengfei Wan, Zhongyuan Wang, Chongyang Ma
Specifically, TRAvatar is trained with dynamic image sequences captured in a Light Stage under varying lighting conditions, enabling realistic relighting and real-time animation for avatars in diverse scenes.
no code implementations • 16 Nov 2023 • Ming Chen, Yan Zhou, Weihua Jian, Pengfei Wan, Zhongyuan Wang
Though significant progress in human pose and shape recovery from monocular RGB images has been made in recent years, obtaining 3D human motion with high accuracy and temporal consistency from videos remains challenging.
no code implementations • 24 Dec 2023 • Jinchao Zhu, Yuxuan Wang, Xiaobing Tu, Siyuan Pan, Pengfei Wan, Gao Huang
The Stable Diffusion Model (SDM) is a popular and efficient text-to-image (t2i) generation and image-to-image (i2i) generation model.
no code implementations • 27 Dec 2023 • Xun Guo, Mingwu Zheng, Liang Hou, Yuan Gao, Yufan Deng, Pengfei Wan, Di Zhang, Yufan Liu, Weiming Hu, ZhengJun Zha, Haibin Huang, Chongyang Ma
I2V-Adapter adeptly propagates the unnoised input image to subsequent noised frames through a cross-frame attention mechanism, maintaining the identity of the input image without any changes to the pretrained T2V model.
no code implementations • 5 Feb 2024 • Shiyuan Yang, Liang Hou, Haibin Huang, Chongyang Ma, Pengfei Wan, Di Zhang, Xiaodong Chen, Jing Liao
In practice, users often desire the ability to control object motion and camera movement independently for customized video creation.
no code implementations • 6 Feb 2024 • Haotian Yang, Mingwu Zheng, Chongyang Ma, Yu-Kun Lai, Pengfei Wan, Haibin Huang
In this paper, we introduce the Volumetric Relightable Morphable Model (VRMM), a novel volumetric and parametric facial prior for 3D face modeling.
no code implementations • 29 Mar 2024 • Luozhou Wang, Guibao Shen, Yixun Liang, Xin Tao, Pengfei Wan, Di Zhang, Yijun Li, Yingcong Chen
In this research, we present a novel approach to motion customization in video generation, addressing the widespread gap in the thorough exploration of motion representation within video generative models.
no code implementations • 15 Apr 2024 • Zhaokun Zhou, Qiulin Wang, Bin Lin, Yiwei Su, Rui Chen, Xin Tao, Amin Zheng, Li Yuan, Pengfei Wan, Di Zhang
To further evaluate the IAA capability of MLLMs, we construct the UNIAA-Bench, which consists of three aesthetic levels: Perception, Description, and Assessment.
1 code implementation • NeurIPS 2023 • Liang Hou, Qi Cao, Yige Yuan, Songtao Zhao, Chongyang Ma, Siyuan Pan, Pengfei Wan, Zhongyuan Wang, HuaWei Shen, Xueqi Cheng
Training generative adversarial networks (GANs) with limited data is challenging because the discriminator is prone to overfitting.
1 code implementation • CVPR 2021 • Xin Wen, Zhizhong Han, Yan-Pei Cao, Pengfei Wan, Wen Zheng, Yu-Shen Liu
We provide a comprehensive evaluation in experiments, which shows that our model with the learned bidirectional geometry correspondence outperforms state-of-the-art unpaired completion methods.
1 code implementation • CVPR 2021 • Xin Wen, Peng Xiang, Zhizhong Han, Yan-Pei Cao, Pengfei Wan, Wen Zheng, Yu-Shen Liu
As a result, the network learns a strict and unique correspondence on point-level, which can capture the detailed topology and structure relationships between the incomplete shape and the complete target, and thus improves the quality of the predicted complete shape.
1 code implementation • 19 Feb 2022 • Xin Wen, Peng Xiang, Zhizhong Han, Yan-Pei Cao, Pengfei Wan, Wen Zheng, Yu-Shen Liu
It moves each point of incomplete input to obtain a complete point cloud, where total distance of point moving paths (PMPs) should be the shortest.
Ranked #1 on Point Cloud Completion on Completion3D
1 code implementation • 15 Feb 2022 • Baixu Chen, Junguang Jiang, Ximei Wang, Pengfei Wan, Jianmin Wang, Mingsheng Long
Yet these datasets are time-consuming and labor-exhaustive to obtain on realistic tasks.
1 code implementation • 27 Nov 2023 • Qi Fan, Xin Tao, Lei Ke, Mingqiao Ye, Yuan Zhang, Pengfei Wan, Zhongyuan Wang, Yu-Wing Tai, Chi-Keung Tang
Thus, our solution, termed Stable-SAM, offers several advantages: 1) improved SAM's segmentation stability across a wide range of prompt qualities, while 2) retaining SAM's powerful promptable segmentation efficiency and generality, with 3) minimal learnable parameters (0. 08 M) and fast adaptation (by 1 training epoch).
1 code implementation • 20 Dec 2023 • Tao Zhang, Xingye Tian, Yikang Zhou, Shunping Ji, Xuebo Wang, Xin Tao, Yuan Zhang, Pengfei Wan, Zhongyuan Wang, Yu Wu
We present the \textbf{D}ecoupled \textbf{VI}deo \textbf{S}egmentation (DVIS) framework, a novel approach for the challenging task of universal video segmentation, including video instance segmentation (VIS), video semantic segmentation (VSS), and video panoptic segmentation (VPS).
Ranked #1 on Video Semantic Segmentation on VSPW
1 code implementation • ICCV 2023 • Tao Zhang, Xingye Tian, Yu Wu, Shunping Ji, Xuebo Wang, Yuan Zhang, Pengfei Wan
The efficacy of the decoupling strategy relies on two crucial elements: 1) attaining precise long-term alignment outcomes via frame-by-frame association during tracking, and 2) the effective utilization of temporal information predicated on the aforementioned accurate alignment outcomes during refinement.
Ranked #3 on Video Panoptic Segmentation on VIPSeg
1 code implementation • 7 Jun 2023 • Tao Zhang, Xingye Tian, Haoran Wei, Yu Wu, Shunping Ji, Xuebo Wang, Xin Tao, Yuan Zhang, Pengfei Wan
In this report, we successfully validated the effectiveness of the decoupling strategy in video panoptic segmentation.
1 code implementation • 28 Aug 2023 • Tao Zhang, Xingye Tian, Yikang Zhou, Yu Wu, Shunping Ji, Cilin Yan, Xuebo Wang, Xin Tao, Yuan Zhang, Pengfei Wan
Video instance segmentation is a challenging task that serves as the cornerstone of numerous downstream applications, including video editing and autonomous driving.
1 code implementation • 18 Feb 2022 • Peng Xiang, Xin Wen, Yu-Shen Liu, Yan-Pei Cao, Pengfei Wan, Wen Zheng, Zhizhong Han
Our insight into the detailed geometry is to introduce a skip-transformer in the SPD to learn the point splitting patterns that can best fit the local regions.
Ranked #5 on Point Cloud Completion on ShapeNet
1 code implementation • CVPR 2021 • Xingyu Chen, Yufeng Liu, Chongyang Ma, Jianlong Chang, Huayan Wang, Tian Chen, Xiaoyan Guo, Pengfei Wan, Wen Zheng
In the root-relative mesh recovery task, we exploit semantic relations among joints to generate a 3D mesh from the extracted 2D cues.
3 code implementations • NeurIPS 2021 • Mingcong Liu, Qiang Li, Zekui Qin, Guoxin Zhang, Pengfei Wan, Wen Zheng
Specifically, we first train a self-supervised style encoder on the generic artistic dataset to extract the representations of arbitrary styles.
2 code implementations • ICCV 2021 • Peng Xiang, Xin Wen, Yu-Shen Liu, Yan-Pei Cao, Pengfei Wan, Wen Zheng, Zhizhong Han
However, previous methods usually suffered from discrete nature of point cloud and unstructured prediction of points in local regions, which makes it hard to reveal fine local geometric details on the complete shape.