3D Human Pose Estimation with Spatial and Temporal Transformers

Transformer architectures have become the model of choice in natural language processing and are now being introduced into computer vision tasks such as image classification, object detection, and semantic segmentation. In human pose estimation, however, convolutional architectures remain dominant. In this work, we present PoseFormer, a purely transformer-based approach for 3D human pose estimation in videos that involves no convolutional architectures. Inspired by recent developments in vision transformers, we design a spatial-temporal transformer structure that comprehensively models the human joint relations within each frame as well as the temporal correlations across frames, and then outputs an accurate 3D human pose for the center frame. We evaluate our method quantitatively and qualitatively on two popular and standard benchmark datasets: Human3.6M and MPI-INF-3DHP. Extensive experiments show that PoseFormer achieves state-of-the-art performance on both datasets. Code is available at https://github.com/zczcwh/PoseFormer

ICCV 2021
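
The abstract describes a two-stage design: a spatial transformer that models joint relations within each frame, followed by a temporal transformer that models correlations across frames before regressing the center-frame 3D pose. The snippet below is a minimal PyTorch sketch of that idea, not the authors' implementation; the class name, layer sizes, pooling over frames, and the joint count (17, as in Human3.6M) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PoseFormerSketch(nn.Module):
    def __init__(self, num_joints=17, num_frames=81, embed_dim=32,
                 depth=4, num_heads=8):
        super().__init__()
        # Spatial transformer: embeds each 2D joint and models joint
        # relations within a single frame.
        self.joint_embed = nn.Linear(2, embed_dim)
        self.spatial_pos = nn.Parameter(torch.zeros(1, num_joints, embed_dim))
        spatial_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.spatial_encoder = nn.TransformerEncoder(spatial_layer, depth)

        # Temporal transformer: models correlations across frames using the
        # per-frame features produced by the spatial transformer.
        frame_dim = num_joints * embed_dim
        self.temporal_pos = nn.Parameter(torch.zeros(1, num_frames, frame_dim))
        temporal_layer = nn.TransformerEncoderLayer(
            d_model=frame_dim, nhead=num_heads, batch_first=True)
        self.temporal_encoder = nn.TransformerEncoder(temporal_layer, depth)

        # Regression head: predicts the 3D pose of the center frame.
        self.head = nn.Linear(frame_dim, num_joints * 3)

    def forward(self, x):
        # x: (batch, frames, joints, 2) -- 2D keypoints from an off-the-shelf detector
        b, f, j, _ = x.shape
        x = self.joint_embed(x.reshape(b * f, j, 2)) + self.spatial_pos
        x = self.spatial_encoder(x)                # per-frame joint features
        x = x.reshape(b, f, -1) + self.temporal_pos
        x = self.temporal_encoder(x)               # cross-frame features
        center = x.mean(dim=1)                     # pool over frames (simplified)
        return self.head(center).reshape(b, j, 3)  # 3D pose of the center frame

# Example: an 81-frame clip of 17 detected 2D joints -> one 3D pose (1, 17, 3)
model = PoseFormerSketch()
out = model(torch.randn(1, 81, 17, 2))
```

The published model uses learned aggregation rather than plain mean pooling to select the center-frame representation; the mean here only keeps the sketch short.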
Task | Dataset | Model | Metric Name | Metric Value | Global Rank
--- | --- | --- | --- | --- | ---
Monocular 3D Human Pose Estimation | Human3.6M | PoseFormer (T=81) | Average MPJPE (mm) | 44.3 | #13
Monocular 3D Human Pose Estimation | Human3.6M | PoseFormer (T=81) | Frames Needed | 81 | #29
Monocular 3D Human Pose Estimation | Human3.6M | PoseFormer (T=81) | 2D detector | CPN | #1
3D Human Pose Estimation | Human3.6M | PoseFormer (f=81, GT) | Average MPJPE (mm) | 31.3 | #40
3D Human Pose Estimation | Human3.6M | PoseFormer (f=81, GT) | Using 2D ground-truth joints | Yes | #2
3D Human Pose Estimation | Human3.6M | PoseFormer (f=81, GT) | Multi-View or Monocular | Monocular | #1
3D Human Pose Estimation | Human3.6M | PoseFormer (f=81) | Average MPJPE (mm) | 44.3 | #99
3D Human Pose Estimation | Human3.6M | PoseFormer (f=81) | Using 2D ground-truth joints | No | #2
3D Human Pose Estimation | Human3.6M | PoseFormer (f=81) | Multi-View or Monocular | Monocular | #1
3D Human Pose Estimation | MPI-INF-3DHP | PoseFormer (9 frames) | AUC | 56.4 | #29
3D Human Pose Estimation | MPI-INF-3DHP | PoseFormer (9 frames) | MPJPE | 77.1 | #33
3D Human Pose Estimation | MPI-INF-3DHP | PoseFormer (9 frames) | PCK | 88.6 | #31
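
The primary metric in the table above, MPJPE, is the mean per-joint position error: the average Euclidean distance (in millimetres) between predicted and ground-truth 3D joint positions. A minimal sketch of how it is typically computed is shown below; the function name and input shapes are illustrative assumptions.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error in millimetres.

    pred, gt: arrays of shape (num_samples, num_joints, 3), in mm.
    Returns the Euclidean distance per joint, averaged over all joints and samples.
    """
    return np.linalg.norm(pred - gt, axis=-1).mean()
```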
