no code implementations • 27 Oct 2024 • Mingjiang Liang, Yongkang Cheng, Hualin Liang, Shaoli Huang, Wei Liu
This is achieved by exploiting the relationships with visible anatomical structures, enhancing the accuracy of local pose estimations.
no code implementations • 27 Oct 2024 • Yongkang Cheng, Mingjiang Liang, Shaoli Huang, Gaoge Han, Jifeng Ning, Wei Liu
This is because the denoising process of DDPM in the latter relies on the assumption that the noise added at each step is sampled from a unimodal distribution, and the noise values are small.
no code implementations • 12 Oct 2024 • Yongkang Cheng, Mingjiang Liang, Shaoli Huang, Jifeng Ning, Wei Liu
Existing gesture generation methods primarily focus on upper body gestures based on audio features, neglecting speech content, emotion, and locomotion.
no code implementations • 9 Oct 2024 • Gaoge Han, Mingjiang Liang, Jinglei Tang, Yongkang Cheng, Wei Liu, Shaoli Huang
In this paper, we present ReinDiffuse, which combines reinforcement learning with a motion diffusion model to generate physically credible human motions that align with textual descriptions.
no code implementations • 15 Aug 2024 • Ce Chen, Shaoli Huang, Xuelin Chen, Guangyi Chen, Xiaoguang Han, Kun Zhang, Mingming Gong
The primary challenges of our mesh-based framework involve stably generating a mesh with details that align with the text prompt while directly driving it and maintaining surface continuity.
1 code implementation • 4 Jun 2024 • Yicheng Xiao, Lin Song, Shaoli Huang, Jiangshan Wang, Siyu Song, Yixiao Ge, Xiu Li, Ying Shan
The state space models, employing recursively propagated features, demonstrate strong representation capabilities comparable to Transformer models and superior efficiency.
no code implementations • CVPR 2024 • Hanchao Liu, Xiaohang Zhan, Shaoli Huang, Tai-Jiang Mu, Ying Shan
This problem is characterized by an open and fully customizable set of motion control tasks.
1 code implementation • 23 Apr 2024 • Rui Chen, Mingyi Shi, Shaoli Huang, Ping Tan, Taku Komura, Xuelin Chen
We present a novel character control framework that effectively utilizes motion diffusion probabilistic models to generate high-quality and diverse character animations, responding in real-time to a variety of dynamic user-supplied control signals.
no code implementations • 7 Jan 2024 • Sicheng Yang, Zunnan Xu, Haiwei Xue, Yongkang Cheng, Shaoli Huang, Mingming Gong, Zhiyong Wu
To tackle these issues, we introduce FreeTalker, which, to the best of our knowledge, is the first framework for the generation of both spontaneous (e.g., co-speech gesture) and non-spontaneous (e.g., moving around the podium) speaker motions.
no code implementations • 19 Dec 2023 • Gaoge Han, Shaoli Huang, Mingming Gong, Jinglei Tang
We introduce HuTuMotion, an innovative approach for generating natural human motions that navigates latent motion diffusion models by leveraging few-shot human feedback.
no code implementations • 18 Dec 2023 • Zeping Ren, Shaoli Huang, Xiu Li
Our method integrates 3D and 2D information using a shared transformer network within the training of the diffusion model, unifying motion noise into a single feature space.
no code implementations • 31 Oct 2023 • Zhengdi Yu, Shaoli Huang, Yongkang Cheng, Tolga Birdal
We present SignAvatars, the first large-scale, multi-prompt 3D sign language (SL) motion dataset designed to bridge the communication gap for Deaf and hard-of-hearing individuals.
no code implementations • 31 Oct 2023 • Xin He, Shaoli Huang, Xiaohang Zhan, Chao Weng, Ying Shan
Our framework comprises a Semantic Enhancement module and a Context-Attuned Motion Denoiser (CAMD).
no code implementations • 19 Oct 2023 • Jiaxu Zhang, Shaoli Huang, Zhigang Tu, Xin Chen, Xiaohang Zhan, Gang Yu, Ying Shan
In this work, we present TapMo, a Text-driven Animation Pipeline for synthesizing Motion in a broad spectrum of skeleton-free 3D characters.
1 code implementation • ICCV 2023 • YiHao Zhi, Xiaodong Cun, Xuelin Chen, Xi Shen, Wen Guo, Shaoli Huang, Shenghua Gao
While previous methods are able to generate speech rhythm-synchronized gestures, the semantic context of the speech is generally lacking in the gesticulations.
no code implementations • CVPR 2023 • Hao Tang, Songhua Liu, Tianwei Lin, Shaoli Huang, Fu Li, Dongliang He, Xinchao Wang
On the other hand, different from the vanilla version, we adopt a learnable scaling operation on content features before content-style feature interaction, which better preserves the original similarity between a pair of content features while ensuring the stylization quality.
no code implementations • CVPR 2023 • Fang Zhao, Zekun Li, Shaoli Huang, Junwu Weng, Tianfei Zhou, Guo-Sen Xie, Jue Wang, Ying Shan
Once the anchor transformations are found, per-vertex nonlinear displacements of the garment template can be regressed in a canonical space, which reduces the complexity of deformation space learning.
1 code implementation • 21 Mar 2023 • Yongkang Cheng, Shaoli Huang, Jifeng Ning, Ying Shan
This paper presents a novel approach for estimating human body shape and pose from monocular images that effectively addresses the challenges of occlusions and depth ambiguity.
Ranked #22 on 3D Human Pose Estimation on 3DPW
no code implementations • 20 Mar 2023 • Haoyu Wang, Shaoli Huang, Fang Zhao, Chun Yuan, Ying Shan
We present a simple yet effective method for skeleton-free motion retargeting.
1 code implementation • CVPR 2023 • Jiaxu Zhang, Junwu Weng, Di Kang, Fang Zhao, Shaoli Huang, Xuefei Zhe, Linchao Bao, Ying Shan, Jue Wang, Zhigang Tu
Driven by our explored distance-based losses that explicitly model the motion semantics and geometry, these two modules can learn residual motion modifications on the source motion to generate plausible retargeted motion in a single inference without post-processing.
1 code implementation • CVPR 2023 • Zhengdi Yu, Shaoli Huang, Chen Fang, Toby P. Breckon, Jue Wang
Our method significantly outperforms the best interacting-hand approaches on the InterHand2.6M dataset while yielding comparable performance with the state-of-the-art single-hand methods on the FreiHand dataset.
Ranked #2 on 3D Interacting Hand Pose Estimation on InterHand2.6M
1 code implementation • 15 Jan 2023 • Jianrong Zhang, Yangsong Zhang, Xiaodong Cun, Shaoli Huang, Yong Zhang, Hongwei Zhao, Hongtao Lu, Xi Shen
Additionally, we conduct analyses on HumanML3D and observe that the dataset size is a limitation of our approach.
Ranked #2 on Motion Synthesis on Motion-X
1 code implementation • CVPR 2023 • Zhifeng Lin, Changxing Ding, Huan Yao, Zengsheng Kuang, Shaoli Huang
Notably, the performance of our model on hand pose estimation even surpasses that of existing works that only perform the single-hand pose estimation task.
Ranked #2 on hand-object pose on DexYCB
no code implementations • ICCV 2023 • Dan Liu, Jin Hou, Shaoli Huang, Jing Liu, Yuxin He, Bochuan Zheng, Jifeng Ning, Jingdong Zhang
To break the deadlock, we present LoTE-Animal, a large-scale endangered animal dataset collected over 12 years, to foster the application of deep learning in rare species conservation.
1 code implementation • 12 Jul 2022 • Xubin Zhong, Changxing Ding, Zijian Li, Shaoli Huang
Specifically, we shift the GT bounding boxes of each labeled human-object pair so that the shifted boxes cover only a certain portion of the GT ones.
no code implementations • 13 Mar 2022 • Ke Zhang, Jin Fan, Shaoli Huang, Yongliang Qiao, Xiaofeng Yu, Feiwei Qin
We innovatively propose a cross distillation module to provide additional supervision to alleviate the noise problem, and propose a collaborative ensemble module to overcome the target conflict problem.
1 code implementation • CVPR 2022 • Peng Du, Jifeng Ning, Jiguang Cui, Shaoli Huang, Xinchao Wang, Jiaxin Wang
Further, an optimized GES energy term is presented to reasonably determine the weights of the sampling points on the geometric structure, and the term is added into the Global Similarity Prior (GSP) stitching model called GES-GSP to achieve a smooth transition between local alignment and geometric structure preservation.
no code implementations • 21 Dec 2021 • Jue Wang, Shaoli Huang, Xinchao Wang, DaCheng Tao
Conventional 3D human pose estimation relies on first detecting 2D body keypoints and then solving the 2D to 3D correspondence problem. Despite the promising results, this learning paradigm is highly dependent on the quality of the 2D keypoint detector, which is inevitably fragile to occlusions and out-of-image absences. In this paper, we propose a novel Pose Orientation Net (PONet) that is able to robustly estimate 3D pose by learning orientations only, hence bypassing the error-prone keypoint detector in the absence of image evidence.
Ranked #84 on 3D Human Pose Estimation on MPI-INF-3DHP
no code implementations • 16 Aug 2021 • Lianbo Zhang, Shaoli Huang, Xinchao Wang, Wei Liu, DaCheng Tao
In this paper, we introduce a novel structure-aware feature generation scheme, termed SA-GAN, to explicitly account for the topological structure in learning both the latent space and the generative networks.
1 code implementation • ICCV 2021 • Shaoli Huang, Xinchao Wang, DaCheng Tao
Learning mid-level representation for fine-grained recognition is easily dominated by a limited number of highly discriminative patterns, degrading its robustness and generalization capability.
2 code implementations • 9 Dec 2020 • Shaoli Huang, Xinchao Wang, DaCheng Tao
As the main discriminative information of a fine-grained image usually resides in subtle regions, methods along this line are prone to heavy label noise in fine-grained recognition.
Ranked #36 on Fine-Grained Image Classification on CUB-200-2011
no code implementations • 28 Nov 2019 • Mingjiang Liang, Shaoli Huang, Shirui Pan, Mingming Gong, Wei Liu
Few-shot learning is currently enjoying a considerable resurgence of interest, aided by the recent advance of deep learning.
no code implementations • 28 Nov 2019 • Sanjeev Sharma, Shaoli Huang, DaCheng Tao
This work addresses the challenging problem of unconstrained 3D hand pose estimation using monocular RGB images.
no code implementations • 20 May 2019 • Jue Wang, Shaoli Huang, Xinchao Wang, DaCheng Tao
We model parts with higher DOFs like the elbows, as dependent components of the corresponding parts with lower DOFs like the torso, of which the 3D locations can be more reliably estimated.
no code implementations • ECCV 2018 • Zhe Chen, Shaoli Huang, DaCheng Tao
Current two-stage object detectors, which consist of a region proposal stage and a refinement stage, may produce unreliable results due to ill-localized proposed regions.
no code implementations • ICCV 2017 • Shaoli Huang, Mingming Gong, DaCheng Tao
To target this problem, we develop a keypoint localization network composed of several coarse detector branches, each of which is built on top of a feature layer in a CNN, and a fine detector branch built on top of multiple feature layers.
no code implementations • 4 Oct 2016 • Shaoli Huang, DaCheng Tao
The proposed architecture consists of a part localization network, a two-stream classification network that simultaneously encodes object-level and part-level cues, and a feature vectors fusion component.
no code implementations • CVPR 2016 • Shaoli Huang, Zhe Xu, DaCheng Tao, Ya Zhang
In the context of fine-grained visual categorization, the ability to interpret models as human-understandable visual manuals is sometimes as important as achieving high classification accuracy.
Ranked #70 on Fine-Grained Image Classification on CUB-200-2011
no code implementations • ICCV 2015 • Zhe Xu, Shaoli Huang, Ya Zhang, DaCheng Tao
We propose a new method for fine-grained object recognition that employs part-level annotations and deep convolutional neural networks (CNNs) in a unified framework.