1 code implementation • 6 Jul 2024 • Ye Li, Chen Tang, Yuan Meng, Jiajun Fan, Zenghao Chai, Xinzhu Ma, Zhi Wang, Wenwu Zhu
We introduce PRANCE, a Vision Transformer compression framework that jointly optimizes the activated channels and reduces tokens based on the characteristics of the input.
1 code implementation • 7 Jun 2024 • Zenghao Chai, Chen Tang, Yongkang Wong, Mohan Kankanhalli
The creation of 4D avatars (i.e., animated 3D avatars) from text descriptions typically uses text-to-image (T2I) diffusion models to synthesize 3D avatars in the canonical space and subsequently applies animation with target motions.
no code implementations • 5 May 2023 • Zhengzhuo Xu, Zenghao Chai, Chengyin Xu, Chun Yuan, Haiqin Yang
In this paper, we observe that knowledge transfer between experts is imbalanced in terms of class distribution, which results in limited performance improvement for the minority classes.
no code implementations • ICCV 2023 • Zenghao Chai, Tianke Zhang, Tianyu He, Xu Tan, Tadas Baltrušaitis, HsiangTao Wu, Runnan Li, Sheng Zhao, Chun Yuan, Jiang Bian
3D Morphable Models (3DMMs) demonstrate great potential for reconstructing faithful and animatable 3D facial surfaces from a single image.
Ranked #1 on 3D Face Reconstruction on REALY (side-view)
no code implementations • 14 Feb 2023 • Chen Tang, Kai Ouyang, Zenghao Chai, Yunpeng Bai, Yuan Meng, Zhi Wang, Wenwu Zhu
This general and dataset-independent property allows us to search for the MPQ policy over a rather small-scale proxy dataset, after which the policy can be directly applied to quantize models trained on large-scale datasets.
no code implementations • 30 Jan 2023 • Shengmeng Li, Luping Liu, Zenghao Chai, Runnan Li, Xu Tan
Unlike traditional predictors based on explicit Adams methods, we leverage a Lagrange interpolation function as the predictor, further enhanced with an error-robust strategy that adaptively selects the Lagrange bases with lower error in the estimated noise.
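To make the Lagrange-predictor idea concrete, here is a minimal sketch: given noise estimates stored at earlier timesteps, the predictor evaluates their Lagrange interpolant at the next timestep. The function name and interface are illustrative assumptions, and the paper's error-robust base selection is only noted in a comment, not implemented.

```python
import numpy as np

def lagrange_predict(ts, eps_history, t_next):
    """Predict the noise at t_next by Lagrange interpolation over
    noise estimates from earlier timesteps (hypothetical sketch).
    The paper additionally selects the subset of bases with the
    lowest estimation error; that selection is omitted here."""
    ts = np.asarray(ts, dtype=float)            # timesteps of stored estimates
    eps = np.asarray(eps_history, dtype=float)  # noise estimates, shape (k, ...)
    pred = np.zeros_like(eps[0])
    for i in range(len(ts)):
        # Lagrange basis polynomial l_i evaluated at t_next
        li = np.prod([(t_next - ts[j]) / (ts[i] - ts[j])
                      for j in range(len(ts)) if j != i])
        pred += li * eps[i]
    return pred
```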
1 code implementation • CVPR 2023 • Zhengzhuo Xu, Ruikang Liu, Shuo Yang, Zenghao Chai, Chun Yuan
In this paper, we systematically investigate ViTs' performance in long-tailed recognition (LTR) and propose LiVT to train ViTs from scratch with LT data only.
Ranked #7 on Long-tail Learning on CIFAR-10-LT (ρ=10)
1 code implementation • 14 Aug 2022 • Chengyin Xu, Zenghao Chai, Zhengzhuo Xu, Chun Yuan, Yanbo Fan, Jue Wang
Image retrieval has become an increasingly appealing technique with broad multimedia applications, where deep hashing serves as the dominant approach for low-storage, efficient retrieval.
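The low-storage, efficient-retrieval appeal of hashing comes from bit-packed binary codes compared by Hamming distance. A generic retrieval sketch (not any specific paper's pipeline; the function name is an illustrative assumption):

```python
import numpy as np

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to a query binary code.
    Codes are bit-packed uint8 arrays, so storage is 1 bit per dimension
    and the distance reduces to a popcount over XOR-ed bytes."""
    xor = np.bitwise_xor(db_codes, query_code)      # differing bits
    dists = np.unpackbits(xor, axis=1).sum(axis=1)  # popcount per item
    return np.argsort(dists)                        # nearest first

# e.g. 48-bit codes packed into 6 bytes: db_codes shape (N, 6), query (6,)
```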
1 code implementation • 18 Mar 2022 • Zenghao Chai, Haoxian Zhang, Jing Ren, Di Kang, Zhengzhuo Xu, Xuefei Zhe, Chun Yuan, Linchao Bao
The evaluation of 3D face reconstruction results typically relies on a rigid shape alignment between the estimated 3D model and the ground-truth scan.
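As background on what such a rigid alignment involves, below is the classic least-squares rotation-plus-translation solution for corresponded point sets via SVD (Kabsch/Umeyama). This is a generic sketch, not the benchmark's actual alignment code, and the function name is an assumption.

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rigid (rotation + translation) alignment of two
    corresponded point sets, both of shape (N, 3). Returns R, t such
    that dst ≈ src @ R.T + t (generic Kabsch/Umeyama sketch)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    P, Q = src - mu_s, dst - mu_d                  # centered point sets
    H = P.T @ Q                                    # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))         # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t
```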
1 code implementation • 4 Dec 2021 • Chengyin Xu, Zenghao Chai, Zhengzhuo Xu, Hongjia Li, Qiruyi Zuo, Lingyu Yang, Chun Yuan
Deep hashing has shown promising performance in large-scale image retrieval.
1 code implementation • 2 Dec 2021 • Yunpeng Bai, Chao Dong, Zenghao Chai, Andong Wang, Zhengzhuo Xu, Chun Yuan
To address these two problems, we propose the Semantic-Sparse Colorization Network (SSCN) to transfer both the global image style and detailed semantic-related colors to the grayscale image in a coarse-to-fine manner.
1 code implementation • NeurIPS 2021 • Zhengzhuo Xu, Zenghao Chai, Chun Yuan
Real-world data universally confronts severe class imbalance and exhibits a long-tailed distribution, i.e., most labels are associated with only a few instances (a sketch of this construction appears below).
Ranked #24 on Long-tail Learning on CIFAR-100-LT (ρ=10)
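For context on the ρ in the leaderboard entries: ρ is the imbalance ratio n_max/n_min, and CIFAR-LT splits are commonly built with an exponential class-size profile. A minimal sketch, assuming the standard exponential construction (not taken from the paper itself):

```python
import numpy as np

def long_tail_counts(n_max=5000, num_classes=100, rho=10):
    """Per-class sample counts for a CIFAR-style long-tailed split with
    imbalance ratio rho = n_max / n_min, using the common exponential
    decay over class indices (hypothetical helper)."""
    idx = np.arange(num_classes)
    return np.round(n_max * (1.0 / rho) ** (idx / (num_classes - 1))).astype(int)

counts = long_tail_counts()
print(counts[0], counts[-1], counts[0] / counts[-1])  # -> 5000 500 10.0
```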
1 code implementation • 25 Oct 2021 • Zenghao Chai, Zhengzhuo Xu, Chun Yuan
We carefully design a Detail Context Block (DCB) to extract fine-grained details and strengthen the otherwise isolated correlation between the upper context state and the current input state.
1 code implementation • 6 Feb 2021 • Zenghao Chai, Zhengzhuo Xu, Yunpeng Bai, Zhihui Lin, Chun Yuan
To tackle the increasing ambiguity during forecasting, we design CMS-LSTM to focus on context correlations and multi-scale spatiotemporal flow with fine-grained local details, containing two elaborately designed blocks: the Context Embedding (CE) and Spatiotemporal Expression (SE) blocks.