Authentic Hand Avatar from a Phone Scan via Universal Hand Model

Gyeongsik Moon, Weipeng Xu, Rohan Joshi, Chenglei Wu, Takaaki Shiratori

In this paper, we present a universal hand model (UHM), which 1) can universally represent high-fidelity 3D hand meshes of arbitrary identities (IDs) and 2) can be adapted to each person with a short phone scan for the authentic hand avatar.

URHand: Universal Relightable Hands

Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo, Chen Cao, Stanislav Pidhorskyi, Tomas Simon, Rohan Joshi, Yuan Dong, Yichen Xu, Bernardo Pires, He Wen, Lucas Evans, Bo Peng, Julia Buffalini, Autumn Trimble, Kevyn McPhail, Melissa Schoeller, Shoou-I Yu, Javier Romero, Michael Zollhöfer, Yaser Sheikh, Ziwei Liu, Shunsuke Saito

To simplify the personalization process while retaining photorealism, we build a powerful universal relightable prior based on neural relighting from multi-view images of hands captured in a light stage with hundreds of identities.

Three Recipes for Better 3D Pseudo-GTs of 3D Human Mesh Estimation in the Wild

10 Apr 2023 Gyeongsik Moon, Hongsuk Choi, Sanghyuk Chun, Jiyoung Lee, Sangdoo Yun

Recovering 3D human meshes in the wild is highly challenging because in-the-wild (ITW) datasets provide only 2D pose ground truths (GTs).

Recovering 3D Hand Mesh Sequence from a Single Blurry Image: A New Dataset and Temporal Unfolding

CVPR 2023 Yeonguk Oh, JoonKyu Park, Jaeha Kim, Gyeongsik Moon, Kyoung Mu Lee

In addition to the new dataset, we propose BlurHandNet, a baseline network for accurate 3D hand mesh recovery from a blurry hand image.

Bringing Inputs to Shared Domains for 3D Interacting Hands Recovery in the Wild

CVPR 2023 Gyeongsik Moon

Hence, interacting hands of MoCap datasets are brought to the 2D scale space of single hands of ITW datasets.

Rethinking Self-Supervised Visual Representation Learning in Pre-training for 3D Human Pose and Shape Estimation

9 Mar 2023 Hongsuk Choi, Hyeongjin Nam, Taeryung Lee, Gyeongsik Moon, Kyoung Mu Lee

Recently, a few self-supervised representation learning (SSL) methods have outperformed the ImageNet classification pre-training for vision tasks such as object detection.

MultiAct: Long-Term 3D Human Motion Generation from Multiple Action Labels

12 Dec 2022 Taeryung Lee, Gyeongsik Moon, Kyoung Mu Lee

The action-conditioned methods generate a sequence of motion from a single action.

MonoNHR: Monocular Neural Human Renderer

2 Oct 2022 Hongsuk Choi, Gyeongsik Moon, Matthieu Armando, Vincent Leroy, Kyoung Mu Lee, Gregory Rogez

Existing neural human rendering methods struggle with a single image input due to the lack of information in invisible areas and the depth ambiguity of pixels in visible areas.

3D Clothed Human Reconstruction in the Wild

20 Jul 2022 Gyeongsik Moon, Hyeongjin Nam, Takaaki Shiratori, Kyoung Mu Lee

Although much progress has been made in 3D clothed human reconstruction, most of the existing methods fail to produce robust results from in-the-wild images, which contain diverse human poses and appearances.

HandOccNet: Occlusion-Robust 3D Hand Mesh Estimation Network

CVPR 2022 JoonKyu Park, Yeonguk Oh, Gyeongsik Moon, Hongsuk Choi, Kyoung Mu Lee

However, we argue that occluded regions have strong correlations with hands so that they can provide highly beneficial information for complete 3D hand mesh estimation.

Accurate 3D Hand Pose Estimation for Whole-Body 3D Human Mesh Estimation

23 Nov 2020 Gyeongsik Moon, Hongsuk Choi, Kyoung Mu Lee

Using Pose2Pose, Hand4Whole utilizes hand MCP joint features to predict 3D wrists as MCP joints largely contribute to 3D wrist rotations in the human kinematic chain.

NeuralAnnot: Neural Annotator for 3D Human Mesh Training Sets

23 Nov 2020 Gyeongsik Moon, Hongsuk Choi, Kyoung Mu Lee

Assuming no 3D pseudo-GTs are available, NeuralAnnot is weakly supervised with GT 2D/3D joint coordinates of training sets.

Pose2Mesh: Graph Convolutional Network for 3D Human Pose and Mesh Recovery from a 2D Human Pose

ECCV 2020 Hongsuk Choi, Gyeongsik Moon, Kyoung Mu Lee

Most of the recent deep learning-based 3D human pose and mesh estimation methods regress the pose and shape parameters of human mesh models, such as SMPL and MANO, from an input image.

DeepHandMesh: A Weakly-supervised Deep Encoder-Decoder Framework for High-fidelity Hand Mesh Modeling

ECCV 2020 Gyeongsik Moon, Takaaki Shiratori, Kyoung Mu Lee

We design our system to be trained in an end-to-end and weakly-supervised manner; therefore, it does not require ground-truth meshes.

IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos

13 Jul 2020 Gyeongsik Moon, Heeseung Kwon, Kyoung Mu Lee, Minsu Cho

Most current action recognition methods heavily rely on appearance information by taking an RGB sequence of entire image regions as input.

PoseLifter: Absolute 3D human pose lifting network from a single noisy 2D human pose

26 Oct 2019 Ju Yong Chang, Gyeongsik Moon, Kyoung Mu Lee

This study presents a new network (i.e., PoseLifter) that can lift a 2D human pose to an absolute 3D pose in a camera coordinate system.
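
To make "absolute 3D pose in a camera coordinate system" concrete, the sketch below back-projects 2D joints with the standard pinhole camera model. It is not PoseLifter itself, which learns the lifting from data. The function names, the intrinsics, and the per-joint depths are hypothetical values chosen only for illustration.

```python
def back_project(u, v, z, fx, fy, cx, cy):
    """Map pixel (u, v) at depth z to a camera-space point (x, y, z)
    using the pinhole camera model."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)


def lift_pose(pose_2d, depths, intrinsics):
    """Lift a list of 2D joints to camera-space 3D joints.

    The depths are assumed to be given here; in a learned lifting
    network they would be predicted rather than supplied.
    """
    fx, fy, cx, cy = intrinsics
    return [
        back_project(u, v, z, fx, fy, cx, cy)
        for (u, v), z in zip(pose_2d, depths)
    ]


# Hypothetical camera: focal lengths and principal point in pixels.
intrinsics = (1000.0, 1000.0, 320.0, 240.0)
pose_2d = [(320.0, 240.0), (420.0, 240.0)]  # two joints, in pixels
depths = [2.0, 2.0]                         # metres from the camera
pose_3d = lift_pose(pose_2d, depths, intrinsics)
```

A joint at the principal point maps onto the optical axis, and a joint 100 pixels to its right at the same depth lands 0.2 m to the side, which shows how the absolute depth sets the metric scale of the pose.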

Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image

ICCV 2019 Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee

Although significant improvement has been achieved recently in 3D human pose estimation, most of the previous methods only treat a single-person case.

Multi-scale Aggregation R-CNN for 2D Multi-person Pose Estimation

10 May 2019 Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee

Multi-person pose estimation from a 2D image is challenging because it requires not only keypoint localization but also human detection.

PoseFix: Model-agnostic General Human Pose Refinement Network

CVPR 2019 Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee

In this paper, we propose a human pose refinement network that estimates a refined pose from a tuple of an input image and input pose.

V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map

CVPR 2018 Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee

To overcome these weaknesses, we first cast the 3D hand and human pose estimation problem from a single depth map into a voxel-to-voxel prediction that uses a 3D voxelized grid and estimates the per-voxel likelihood for each keypoint.
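
The two ends of a voxel-based pipeline like this can be sketched in a few lines: turning 3D points into an occupancy grid, and reading a keypoint location out of a per-voxel likelihood volume. This is only an illustration under stated assumptions, not the V2V-PoseNet implementation. The function names, grid size, and likelihood values are hypothetical, and the likelihood dictionary stands in for what a trained network would output.

```python
def voxelize(points, origin, voxel_size, grid_dim):
    """Return the set of occupied (i, j, k) voxel indices for 3D points.
    Points falling outside the cubic grid are dropped."""
    occupied = set()
    for p in points:
        idx = tuple(int((c - o) // voxel_size) for c, o in zip(p, origin))
        if all(0 <= i < grid_dim for i in idx):
            occupied.add(idx)
    return occupied


def argmax_voxel(likelihood):
    """Return the voxel index with the highest likelihood.
    likelihood maps (i, j, k) -> score for one keypoint."""
    return max(likelihood, key=likelihood.get)


def voxel_center(idx, origin, voxel_size):
    """Convert a voxel index back to the 3D coordinate of its centre."""
    return tuple(o + (i + 0.5) * voxel_size for i, o in zip(idx, origin))


# Hypothetical input: three points, one of them outside the grid.
points = [(0.05, 0.05, 0.05), (0.25, 0.05, 0.05), (5.0, 5.0, 5.0)]
origin = (0.0, 0.0, 0.0)
occupied = voxelize(points, origin, voxel_size=0.1, grid_dim=4)

# Stand-in for a network's per-voxel likelihood for one keypoint.
likelihood = {(0, 0, 0): 0.1, (2, 0, 0): 0.9}
best = argmax_voxel(likelihood)
keypoint = voxel_center(best, origin, voxel_size=0.1)
```

Taking the arg-max over voxels keeps the prediction in the same 3D grid as the input, so the estimated keypoint is recovered by mapping the winning voxel index back to metric coordinates.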

