Learning to Estimate Robust 3D Human Mesh from In-the-Wild Crowded Scenes

We consider the problem of recovering a single person's 3D human mesh from in-the-wild crowded scenes. While much progress has been made in 3D human mesh estimation, existing methods struggle when the test input contains crowded scenes. The first reason for the failure is the domain gap between training and testing data: motion capture datasets, which provide accurate 3D labels for training, lack crowd data and keep a network from learning image features of a target person that are robust to crowded scenes. The second reason is a feature processing step that spatially averages the feature map of a localized bounding box containing multiple people; averaging the whole feature map makes the target person's features indistinguishable from those of nearby people. We present 3DCrowdNet, the first method that explicitly targets in-the-wild crowded scenes and estimates a robust 3D human mesh by addressing the above issues. First, we leverage 2D human pose estimation, which does not require a motion capture dataset with 3D labels for training and thus does not suffer from the domain gap. Second, we propose a joint-based regressor that distinguishes the target person's features from those of others. The joint-based regressor preserves the spatial activation of the target by sampling features at the target's joint locations and then regresses human model parameters. As a result, 3DCrowdNet learns target-focused features and effectively excludes the irrelevant features of nearby persons. We conduct experiments on various benchmarks and demonstrate the robustness of 3DCrowdNet to in-the-wild crowded scenes both quantitatively and qualitatively. The code is available at https://github.com/hongsukchoi/3DCrowdNet_RELEASE.
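The abstract describes the core mechanism: sample the image feature map at the target's 2D joint locations and regress human-model parameters from those per-joint features, so activations belonging to other people in the bounding box are ignored. The snippet below is a minimal PyTorch sketch of that idea, not the authors' released code; the function name sample_joint_features, the joint count, and the layer sizes are illustrative assumptions.

```python
# Minimal sketch of joint-based feature sampling (illustrative, not the released code).
# Features are sampled at 2D joint locations with grid_sample, so activations of
# nearby people elsewhere in the bounding box do not enter the regressor.
import torch
import torch.nn as nn
import torch.nn.functional as F

def sample_joint_features(feat_map, joints_xy):
    """feat_map: (B, C, H, W); joints_xy: (B, J, 2) in normalized [-1, 1] coords.
    Returns per-joint features of shape (B, J, C)."""
    grid = joints_xy.unsqueeze(2)                # (B, J, 1, 2)
    sampled = F.grid_sample(feat_map, grid,      # (B, C, J, 1)
                            align_corners=False)
    return sampled.squeeze(-1).permute(0, 2, 1)  # (B, J, C)

class JointBasedRegressor(nn.Module):
    """Toy regressor: concatenates per-joint features and predicts SMPL-style
    pose (72) and shape (10) parameters. Layer sizes are illustrative only."""
    def __init__(self, num_joints=30, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_joints * feat_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 72 + 10),
        )

    def forward(self, feat_map, joints_xy):
        joint_feats = sample_joint_features(feat_map, joints_xy)  # (B, J, C)
        out = self.mlp(joint_feats.flatten(1))
        return out[:, :72], out[:, 72:]          # pose, shape

# Usage: features come from an image backbone, 2D joints from an off-the-shelf
# 2D pose estimator (normalized to [-1, 1] over the feature map).
feat_map = torch.randn(2, 256, 64, 64)
joints_xy = torch.rand(2, 30, 2) * 2 - 1
pose, shape = JointBasedRegressor()(feat_map, joints_xy)
print(pose.shape, shape.shape)  # torch.Size([2, 72]) torch.Size([2, 10])
```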

CVPR 2022
Task                             Dataset    Model                     Metric     Value   Global Rank
3D Human Pose Estimation         3DPW       3DCrowdNet                PA-MPJPE   55.8    #83
3D Human Pose Estimation         3DPW       3DCrowdNet                MPJPE      85.8    #77
3D Human Pose Estimation         3DPW       3DCrowdNet                MPVPE      108.5   #64
3D Multi-Person Pose Estimation  MuPoTS-3D  3DCrowdNet (HigherHRNet)  3DPCK      72.7    #10
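For reference, MPJPE and PA-MPJPE report the mean per-joint 3D position error in millimetres, with PA-MPJPE measured after a Procrustes (similarity) alignment of the prediction to the ground truth; MPVPE is the analogous error over mesh vertices, and 3DPCK is the percentage of joints within a distance threshold (150 mm on MuPoTS-3D). The sketch below shows a standard way the joint errors are computed; it is illustrative and not tied to any specific benchmark's evaluation script.

```python
# Sketch of MPJPE and PA-MPJPE (in millimetres); the exact protocol
# (joint set, root alignment) follows each benchmark's evaluation code.
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error. pred, gt: (J, 3) arrays in mm."""
    return np.linalg.norm(pred - gt, axis=1).mean()

def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment (scale, rotation, translation)."""
    mu_p, mu_g = pred.mean(0), gt.mean(0)
    p, g = pred - mu_p, gt - mu_g
    # Optimal similarity transform via SVD of the cross-covariance matrix.
    U, s, Vt = np.linalg.svd(g.T @ p)
    R = U @ Vt
    if np.linalg.det(R) < 0:   # avoid reflections
        U[:, -1] *= -1
        s[-1] *= -1
        R = U @ Vt
    scale = s.sum() / (p ** 2).sum()
    aligned = scale * p @ R.T + mu_g
    return mpjpe(aligned, gt)
```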
