no code implementations • 20 Nov 2024 • Jingyu Zhuang, Di Kang, Linchao Bao, Liang Lin, Guanbin Li
Text-driven avatar generation has gained significant attention owing to its convenience.
1 code implementation • 4 Sep 2024 • Jiaxin Guo, Jiangliu Wang, Ruofeng Wei, Di Kang, Qi Dou, Yun-hui Liu
In neural rendering, we design a base-adaptive NeRF network to exploit the uncertainty estimation for explicitly handling the photometric inconsistencies.
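The abstract does not spell out the loss, but a common way to exploit per-ray uncertainty for handling photometric inconsistencies is a heteroscedastic (NeRF-W-style) objective that down-weights unreliable pixels. The sketch below illustrates that idea only; all names and shapes are our own assumptions:

```python
import torch

def uncertainty_weighted_photometric_loss(pred_rgb, gt_rgb, log_var):
    """Down-weights pixels with high predicted uncertainty (e.g. specular
    highlights or fluids in surgical scenes) so they do not dominate training.
    `log_var` is a per-ray log-variance predicted alongside the color."""
    # Gaussian negative log-likelihood: residual scaled by predicted variance,
    # plus a regularizer that stops the network from inflating uncertainty.
    sq_err = (pred_rgb - gt_rgb) ** 2
    loss = sq_err / (2.0 * torch.exp(log_var)) + 0.5 * log_var
    return loss.mean()

# usage on a dummy batch of 1024 rays
pred = torch.rand(1024, 3, requires_grad=True)
gt = torch.rand(1024, 3)
log_var = torch.zeros(1024, 1, requires_grad=True)
print(uncertainty_weighted_photometric_loss(pred, gt, log_var))
```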
no code implementations • 26 Aug 2024 • Xu He, Xiaoyu Li, Di Kang, Jiangnan Ye, Chaopeng Zhang, Liyang Chen, Xiangjun Gao, Han Zhang, Zhiyong Wu, Haolin Zhuang
Existing works in single-image human reconstruction suffer from weak generalizability, due either to insufficient training data or to 3D inconsistencies arising from a lack of comprehensive multi-view knowledge.

no code implementations • 14 Jul 2024 • Jing Li, Di Kang, Zhenyu He
Deep learning-based multi-view facial capture methods have shown impressive accuracy while being several orders of magnitude faster than a traditional mesh registration pipeline.
1 code implementation • 3 Jul 2024 • Jiaxin Guo, Jiangliu Wang, Di Kang, Wenzhen Dong, Wenting Wang, Yun-hui Liu
To tackle this problem, in this paper, we propose the first SfM-free 3DGS-based method for surgical scene reconstruction by jointly optimizing the camera poses and scene representation.
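As a rough illustration of what "jointly optimizing the camera poses and scene representation" means in practice, here is a minimal sketch in which gradients from a photometric loss update both a toy scene and per-frame 6-DoF pose parameters. The toy renderer stands in for a real differentiable 3DGS rasterizer; every name here is hypothetical:

```python
import torch
import torch.nn.functional as F

class ToyScene(torch.nn.Module):
    """Stand-in for a 3D Gaussian scene; `features` plays the role of the
    optimizable Gaussian parameters (positions, opacities, colors, ...)."""
    def __init__(self):
        super().__init__()
        self.features = torch.nn.Parameter(torch.randn(3, 32, 32))

def render(scene, pose6d):
    # Toy differentiable "renderer" so the sketch runs end-to-end; a real
    # 3DGS rasterizer would splat Gaussians under the given camera pose.
    return torch.sigmoid(scene.features + pose6d.sum())

scene = ToyScene()
# One 6-DoF pose (axis-angle + translation) per frame, optimized jointly
# with the scene, i.e. no SfM initialization is required.
poses = torch.nn.Parameter(torch.zeros(4, 6))
optimizer = torch.optim.Adam([*scene.parameters(), poses], lr=1e-2)

gt_frames = torch.rand(4, 3, 32, 32)
for step in range(100):
    i = step % 4
    optimizer.zero_grad()
    loss = F.l1_loss(render(scene, poses[i]), gt_frames[i])
    loss.backward()   # gradients reach both the scene and the camera poses
    optimizer.step()
```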
no code implementations • 29 Apr 2024 • Cong Wang, Di Kang, He-Yi Sun, Shen-Han Qian, Zi-Xuan Wang, Linchao Bao, Song-Hai Zhang
In this paper, we propose a Hybrid Mesh-Gaussian Head Avatar (MeGA) that models different head components with more suitable representations.
no code implementations • 31 Jan 2024 • Xiaoyu Li, Qi Zhang, Di Kang, Weihao Cheng, Yiming Gao, Jingbo Zhang, Zhihao Liang, Jing Liao, Yan-Pei Cao, Ying Shan
In this survey, we aim to introduce the fundamental methodologies of 3D generation methods and establish a structured roadmap, encompassing 3D representation, generation methods, datasets, and corresponding applications.
no code implementations • 26 Jan 2024 • Jingyu Zhuang, Di Kang, Yan-Pei Cao, Guanbin Li, Liang Lin, Ying Shan
To this end, we propose a 3D scene editing framework, TIPEditor, that accepts both text and image prompts and a 3D bounding box to specify the editing region.
no code implementations • ICCV 2023 • Zhisheng Huang, Yujin Chen, Di Kang, Jinlu Zhang, Zhigang Tu
We propose PHRIT, a novel approach for parametric hand mesh modeling with an implicit template that combines the advantages of both parametric meshes and implicit representations.
no code implementations • 11 Jul 2023 • Cong Wang, Di Kang, Yan-Pei Cao, Linchao Bao, Ying Shan, Song-Hai Zhang
Rendering photorealistic and dynamically moving human heads is crucial for ensuring a pleasant and immersive experience in AR/VR and video conferencing applications.
1 code implementation • CVPR 2023 • Jiaxu Zhang, Junwu Weng, Di Kang, Fang Zhao, Shaoli Huang, Xuefei Zhe, Linchao Bao, Ying Shan, Jue Wang, Zhigang Tu
Driven by our distance-based losses, which explicitly model motion semantics and geometry, these two modules learn residual modifications to the source motion, generating plausible retargeted motion in a single inference pass without post-processing.
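The paper's exact losses are not reproduced here; the following sketch shows one plausible "distance-based" formulation that penalizes discrepancies between the pairwise joint-distance matrices of the source and retargeted motions. Function names and the normalization are our assumptions:

```python
import torch

def pairwise_joint_distances(joints):
    """joints: (T, J, 3) joint positions over T frames. Returns (T, J, J)
    pairwise-distance matrices, a simple distance-based motion descriptor."""
    diff = joints[:, :, None, :] - joints[:, None, :, :]
    return diff.norm(dim=-1)

def distance_preservation_loss(src_joints, tgt_joints):
    """Encourages the retargeted motion to preserve the source motion's
    joint-to-joint distance structure, normalized per skeleton so characters
    of different sizes remain comparable."""
    d_src = pairwise_joint_distances(src_joints)
    d_tgt = pairwise_joint_distances(tgt_joints)
    d_src = d_src / (d_src.mean() + 1e-8)
    d_tgt = d_tgt / (d_tgt.mean() + 1e-8)
    return torch.nn.functional.l1_loss(d_tgt, d_src)

print(distance_preservation_loss(torch.rand(8, 24, 3), torch.rand(8, 24, 3)))
```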
no code implementations • 15 Mar 2023 • Ye Huang, Di Kang, Shenghua Gao, Wen Li, Lixin Duan
One crucial design in the HFG is protecting the high-level features from contamination via proper stop-gradient operations, so that the backbone is not updated by the noisy gradient from the upsampler.
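A minimal sketch of the stop-gradient idea, assuming a PyTorch setting: detaching the backbone features before the upsampler lets the upsampler train normally while blocking its noisy gradients from reaching the backbone:

```python
import torch

backbone_feats = torch.randn(1, 256, 32, 32, requires_grad=True)
upsampler = torch.nn.ConvTranspose2d(256, 256, kernel_size=2, stride=2)

# detach() severs the graph between the backbone and the upsampler head, so
# the head still trains but the high-level features stay "protected".
out = upsampler(backbone_feats.detach())
out.mean().backward()
print(backbone_feats.grad)   # None: no gradient reached the backbone features
```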
no code implementations • ICCV 2023 • Zhangyang Xiong, Di Kang, Derong Jin, Weikai Chen, Linchao Bao, Shuguang Cui, Xiaoguang Han
Specifically, we bridge the latent space of Get3DHuman with that of StyleGAN-Human via a specially-designed prior network, where the input latent code is mapped to the shape and texture feature volumes spanned by the pixel-aligned 3D reconstructor.
no code implementations • 17 Jan 2023 • Jing Li, Di Kang, Wenjie Pei, Xuefei Zhe, Ying Zhang, Linchao Bao, Zhenyu He
Finally, we demonstrate that our method can be readily used to generate motion sequences with user-specified motion clips on the timeline.
no code implementations • 15 Jan 2023 • Linchao Bao, Haoxian Zhang, Yue Qian, Tangli Xue, Changhai Chen, Xuefei Zhe, Di Kang
We show that the predicted viseme curves can be applied to different viseme-rigged characters to yield various personalized animations with realistic and natural facial motions.
1 code implementation • 11 Jan 2023 • Ye Huang, Di Kang, Liang Chen, Wenjing Jia, Xiangjian He, Lixin Duan, Xuefei Zhe, Linchao Bao
Extensive experiments and ablation studies conducted on multiple benchmark datasets demonstrate that the proposed CAR can boost the accuracy of all baseline models by up to 2.23% mIoU with superior generalization ability.
1 code implementation • CVPR 2023 • Haoran Bai, Di Kang, Haoxian Zhang, Jinshan Pan, Linchao Bao
Our pipeline utilizes the recent advances in StyleGAN-based facial image editing approaches to generate multi-view normalized face images from single-image inputs.
Ranked #3 on 3D Face Reconstruction on REALY
no code implementations • 27 Sep 2022 • Weiqiang Wang, Xuefei Zhe, Qiuhong Ke, Di Kang, Tingguang Li, Ruizhi Chen, Linchao Bao
Along with the novel system, we also present a new dataset dedicated to the multi-action motion synthesis task, which contains both action tags and their contextual information.
1 code implementation • 25 Aug 2022 • Yicheng Luo, Jing Ren, Xuefei Zhe, Di Kang, Yajing Xu, Peter Wonka, Linchao Bao
The network takes a line cloud as input, i.e., a nonstructural and unordered set of 3D line segments extracted from multi-view images, and outputs a 3D wireframe of the underlying building, which consists of a sparse set of 3D junctions connected by line segments.
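For concreteness, the input and output data structures might look like the following; the array shapes are illustrative, not the paper's actual interface:

```python
import numpy as np

# A line cloud as described: an unordered set of 3D segments, one
# (2, 3) endpoint pair per line, with no connectivity or ordering.
line_cloud = np.random.rand(512, 2, 3).astype(np.float32)   # (N, 2 endpoints, xyz)

# The target wireframe: sparse 3D junctions plus an edge list over
# junction indices describing which junctions are connected.
junctions = np.random.rand(32, 3).astype(np.float32)        # (J, 3)
edges = np.array([[0, 1], [1, 2], [2, 0]], dtype=np.int64)  # (E, 2) index pairs
```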
no code implementations • 23 Aug 2022 • Zhangyang Xiong, Dong Du, Yushuang Wu, Jingqi Dong, Di Kang, Linchao Bao, Xiaoguang Han
On synthetic data, our method achieves an Intersection-over-Union (IoU) of 93.5%, 18% higher than PIFuHD.
no code implementations • 14 Jun 2022 • Runsong Zhu, Di Kang, Ka-Hei Hui, Yue Qian, Xuefei Zhe, Zhen Dong, Linchao Bao, Pheng-Ann Heng, Chi-Wing Fu
To guide the network to quickly fit the coarse shape, we propose to utilize signed supervision in regions that are obviously outside the object and can be easily determined, resulting in our semi-signed supervision.
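A hedged sketch of what such semi-signed supervision could look like: a hinge penalty applied only where the SDF sign is known to be positive (the easily determined outside region), with the unsigned terms for ambiguous regions omitted:

```python
import torch

def semi_signed_loss(pred_sdf, outside_mask):
    """At sample points that are provably outside the object (e.g. beyond its
    bounding volume) the SDF sign is known to be positive, so non-positive
    predictions are penalized there; points with ambiguous sign would be
    handled by separate unsigned/self-supervised terms (not shown)."""
    return torch.relu(-pred_sdf[outside_mask]).mean()

pred_sdf = torch.randn(4096, requires_grad=True)
outside_mask = torch.rand(4096) > 0.5   # placeholder for a real outside test
print(semi_signed_loss(pred_sdf, outside_mask))
```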
1 code implementation • 18 Mar 2022 • Zenghao Chai, Haoxian Zhang, Jing Ren, Di Kang, Zhengzhuo Xu, Xuefei Zhe, Chun Yuan, Linchao Bao
The evaluation of 3D face reconstruction results typically relies on a rigid shape alignment between the estimated 3D model and the ground-truth scan.
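The snippet below shows the standard rigid (Kabsch/Procrustes) alignment such evaluations start from; note that the paper's actual contribution (REALY) is a finer, region-aware protocol built on top of this step:

```python
import numpy as np

def rigid_align(src, dst):
    """Kabsch/Procrustes: find rotation R and translation t (no scaling here)
    minimizing ||(src @ R.T + t) - dst||, the usual rigid pre-alignment when
    comparing an estimated face mesh against a ground-truth scan."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    H = src_c.T @ dst_c
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))           # avoid reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - src.mean(axis=0) @ R.T
    return src @ R.T + t

# sanity check: aligning to a rigidly moved copy recovers it exactly
pts = np.random.rand(100, 3)
theta = 0.3
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1]])
moved = pts @ Rz.T + np.array([1.0, -2.0, 0.5])
print(np.allclose(rigid_align(pts, moved), moved))   # True
```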
1 code implementation • arXiv:2203.07160 2022 • Ye Huang, Di Kang, Liang Chen, Xuefei Zhe, Wenjing Jia, Xiangjian He, Linchao Bao
Recent segmentation methods, such as OCR and CPNet, utilizing "class level" information in addition to pixel features, have achieved notable success for boosting the accuracy of existing network modules.
Ranked #8 on Semantic Segmentation on PASCAL Context
no code implementations • 24 Jan 2022 • Zhigang Tu, Zhisheng Huang, Yujin Chen, Di Kang, Linchao Bao, Bisheng Yang, Junsong Yuan
We present a method for reconstructing accurate and consistent 3D hands from a monocular video.
no code implementations • CVPR 2022 • Yuan-Chen Guo, Di Kang, Linchao Bao, Yu He, Song-Hai Zhang
Specifically, we propose to split a scene into transmitted and reflected components, and model the two components with separate neural radiance fields.
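A minimal sketch of the two-component idea, with tiny linear layers standing in for the two radiance fields and the volume renderer (all stand-ins are ours, not the paper's architecture):

```python
import torch

class TwoComponentField(torch.nn.Module):
    """Two separate fields, one for the transmitted scene and one for
    reflections, composed per ray with a learned reflection strength."""
    def __init__(self):
        super().__init__()
        self.transmitted = torch.nn.Linear(3, 3)   # stand-in for a NeRF MLP
        self.reflected = torch.nn.Linear(3, 3)     # second field for reflections
        self.beta = torch.nn.Linear(3, 1)          # per-ray reflection strength

    def forward(self, ray_dirs):
        c_t = torch.sigmoid(self.transmitted(ray_dirs))
        c_r = torch.sigmoid(self.reflected(ray_dirs))
        w = torch.sigmoid(self.beta(ray_dirs))
        return c_t + w * c_r   # composite color seen through the glass

print(TwoComponentField()(torch.rand(1024, 3)).shape)
```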
no code implementations • ICCV 2021 • Jing Li, Di Kang, Wenjie Pei, Xuefei Zhe, Ying Zhang, Zhenyu He, Linchao Bao
In order to overcome this problem, we propose a novel conditional variational autoencoder (VAE) that explicitly models one-to-many audio-to-motion mapping by splitting the cross-modal latent code into shared code and motion-specific code.
Ranked #3 on Gesture Generation on BEAT
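A hedged sketch of such a split-latent conditional VAE (see the entry above), with illustrative dimensions: resampling only the motion-specific code yields diverse motions for the same audio, which is the point of the one-to-many mapping:

```python
import torch
import torch.nn as nn

class AudioToMotionCVAE(nn.Module):
    """Latent code split into an audio-shared part and a motion-specific part.
    Layer sizes are illustrative, not the paper's."""
    def __init__(self, audio_dim=128, motion_dim=96, z_shared=16, z_motion=16):
        super().__init__()
        self.enc = nn.Linear(audio_dim + motion_dim, 2 * (z_shared + z_motion))
        self.dec = nn.Linear(z_shared + z_motion + audio_dim, motion_dim)
        self.z_motion = z_motion

    def forward(self, audio, motion):
        mu, logvar = self.enc(torch.cat([audio, motion], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
        return self.dec(torch.cat([z, audio], -1)), mu, logvar

    def sample(self, audio, z_shared):
        # Resample only the motion-specific code to get diverse gestures
        z_motion = torch.randn(audio.shape[0], self.z_motion)
        return self.dec(torch.cat([z_shared, z_motion, audio], -1))

model = AudioToMotionCVAE()
recon, mu, logvar = model(torch.rand(2, 128), torch.rand(2, 96))
```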
1 code implementation • 25 Jun 2021 • Jianchuan Chen, Ying Zhang, Di Kang, Xuefei Zhe, Linchao Bao, Xu Jia, Huchuan Lu
We present animatable neural radiance fields (animatable NeRF) for detailed human avatar creation from monocular videos.
1 code implementation • CVPR 2021 • Yujin Chen, Zhigang Tu, Di Kang, Linchao Bao, Ying Zhang, Xuefei Zhe, Ruizhi Chen, Junsong Yuan
For the first time, we demonstrate the feasibility of training an accurate 3D hand reconstruction network without relying on manual annotations.
Ranked #8 on 3D Hand Pose Estimation on HO-3D v3
1 code implementation • 19 Jan 2021 • Ye Huang, Di Kang, Wenjing Jia, Xiangjian He, Liu Liu
Spatial and channel attentions, modelling the semantic interdependencies in spatial and channel dimensions respectively, have recently been widely used for semantic segmentation.
Ranked #6 on Semantic Segmentation on COCO-Stuff test
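For readers unfamiliar with the two mechanisms in the entry above, here are generic channel and spatial attention modules (squeeze-and-excitation style); they illustrate the general idea, not this paper's specific design:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Models channel interdependencies via a global descriptor that
    re-weights each channel."""
    def __init__(self, c, r=8):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(c, c // r), nn.ReLU(),
                                nn.Linear(c // r, c), nn.Sigmoid())

    def forward(self, x):                        # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))          # (B, C) channel weights
        return x * w[:, :, None, None]

class SpatialAttention(nn.Module):
    """Per-pixel gate computed from pooled channel statistics."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, 7, padding=3)

    def forward(self, x):
        stats = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], 1)
        return x * torch.sigmoid(self.conv(stats))

x = torch.rand(2, 64, 16, 16)
print(SpatialAttention()(ChannelAttention(64)(x)).shape)
```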
1 code implementation • 12 Oct 2020 • Linchao Bao, Xiangkai Lin, Yajing Chen, Haoxian Zhang, Sheng Wang, Xuefei Zhe, Di Kang, HaoZhi Huang, Xinwei Jiang, Jue Wang, Dong Yu, Zhengyou Zhang
We present a fully automatic system that can produce high-fidelity, photo-realistic 3D digital human heads with a consumer RGB-D selfie camera.
no code implementations • 28 Jun 2020 • Yujin Chen, Zhigang Tu, Di Kang, Ruizhi Chen, Linchao Bao, Zhengyou Zhang, Junsong Yuan
In this work, we propose to consider hand and object jointly in feature space and explore the reciprocity of the two branches.
no code implementations • CVPR 2018 • Weihong Ren, Di Kang, Yandong Tang, Antoni B. Chan
While people tracking has been greatly improved over the recent years, crowd scenes remain particularly challenging for people tracking due to heavy occlusions, high crowd density, and significant appearance variation.
no code implementations • 16 May 2018 • Di Kang, Antoni Chan
In this paper, in contrast to using filters of different sizes, we utilize an image pyramid to deal with scale variations.
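A minimal sketch of the pyramid idea, assuming a PyTorch counting CNN: the same network runs on several downscaled copies of the image and the resulting density maps are fused. The averaging fusion here is our simplification, not necessarily the paper's:

```python
import torch
import torch.nn.functional as F

def image_pyramid(img, num_levels=3, scale=0.5):
    """Builds a multi-scale pyramid so one fixed-size filter bank sees
    people at several apparent scales."""
    levels = [img]
    for _ in range(num_levels - 1):
        levels.append(F.interpolate(levels[-1], scale_factor=scale,
                                    mode='bilinear', align_corners=False))
    return levels

def count_with_pyramid(cnn, img):
    # Run the same CNN on every level, upsample the density maps back to
    # full resolution, and average them as a simple fusion.
    maps = [F.interpolate(cnn(lvl), size=img.shape[-2:], mode='bilinear',
                          align_corners=False)
            for lvl in image_pyramid(img)]
    return torch.stack(maps).mean(0)

toy_cnn = torch.nn.Conv2d(3, 1, 3, padding=1)   # stand-in for a counting CNN
print(count_with_pyramid(toy_cnn, torch.rand(1, 3, 64, 64)).shape)
```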
no code implementations • NeurIPS 2017 • Di Kang, Debarun Dhar, Antoni Chan
For example, for crowd counting, the camera perspective (e.g., camera angle and height) gives a clue about the appearance and scale of people in the scene.
1 code implementation • 29 May 2017 • Di Kang, Zheng Ma, Antoni B. Chan
The goal of this paper is to evaluate density maps generated by density estimation methods on a variety of crowd analysis tasks, including counting, detection, and tracking.
no code implementations • 21 Nov 2016 • Di Kang, Debarun Dhar, Antoni B. Chan
In order to incorporate the available side information, we propose an adaptive convolutional neural network (ACNN), where the convolutional filter weights adapt to the current scene context via the side information.
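A hedged sketch of the adaptive-convolution idea: an auxiliary network maps the side information to the convolution's filter weights, so the filters change with the scene context. Layer sizes and the two-value side vector are illustrative, not the ACNN's actual configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveConv(nn.Module):
    """Convolution whose filter weights are generated from side information
    (e.g. camera angle and height) by a small auxiliary network."""
    def __init__(self, side_dim, in_ch, out_ch, k=3):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, k
        self.weight_gen = nn.Linear(side_dim, out_ch * in_ch * k * k)

    def forward(self, x, side_info):             # x: (1, C, H, W); one scene
        w = self.weight_gen(side_info).view(self.out_ch, self.in_ch,
                                            self.k, self.k)
        return F.conv2d(x, w, padding=self.k // 2)

layer = AdaptiveConv(side_dim=2, in_ch=3, out_ch=8)
side = torch.tensor([0.5, 4.2])                  # e.g. [camera angle, height]
print(layer(torch.rand(1, 3, 32, 32), side).shape)
```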