TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Pedestrian Detection	CityPersons	SOLIDER	Reasonable MR^-2	9.7	# 9
Pedestrian Detection	CityPersons	SOLIDER	Heavy MR^-2	39.4	# 7
Person Search	CUHK-SYSU	SOLIDER	mAP	95.5	# 4
Person Search	CUHK-SYSU	SOLIDER	Top-1	95.8	# 6
Semantic Segmentation	LIP val	SOLIDER	mIoU	60.50%	# 4
Person Re-Identification	Market-1501	SOLIDER	Rank-1	96.9	# 5
Person Re-Identification	Market-1501	SOLIDER	mAP	93.9	# 19
Person Re-Identification	Market-1501	SOLIDER (RK)	Rank-1	96.7	# 9
Person Re-Identification	Market-1501	SOLIDER (RK)	mAP	95.6	# 2
Pose Estimation	MS COCO	SOLIDER (swin-B)	AP	76.6	# 8
Pose Estimation	MS COCO	SOLIDER (swin-B)	AR	81.5	# 3
Person Re-Identification	MSMT17	SOLIDER (with re-ranking)	Rank-1	91.7	# 1
Person Re-Identification	MSMT17	SOLIDER (with re-ranking)	mAP	86.5	# 2
Person Re-Identification	MSMT17	SOLIDER (without re-ranking)	Rank-1	90.7	# 3
Person Re-Identification	MSMT17	SOLIDER (without re-ranking)	mAP	77.1	# 5
Pedestrian Attribute Recognition	PA-100K	SOLIDER	Accuracy	86.38	# 5
Person Search	PRW	SOLIDER	mAP	59.8	# 1
Person Search	PRW	SOLIDER	Top-1	86.7	# 5

Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks

CVPR 2023 · Weihua Chen, Xianzhe Xu, Jian Jia, Hao Luo, Yaohua Wang, Fan Wang, Rong Jin, Xiuyu Sun

Human-centric visual tasks have attracted increasing research attention due to their widespread applications. In this paper, we aim to learn a general human representation from massive unlabeled human images that benefits downstream human-centric tasks to the maximum extent. We call this method SOLIDER, a Semantic cOntrollable seLf-supervIseD lEaRning framework. Unlike existing self-supervised learning methods, SOLIDER uses prior knowledge from human images to build pseudo semantic labels and import more semantic information into the learned representation. Meanwhile, we note that different downstream tasks always require different ratios of semantic information and appearance information. For example, human parsing requires more semantic information, while person re-identification needs more appearance information for identification purposes. A single learned representation therefore cannot fit all requirements. To solve this problem, SOLIDER introduces a conditional network with a semantic controller. After the model is trained, users can send values to the controller to produce representations with different ratios of semantic information, which can fit the different needs of downstream tasks. Finally, SOLIDER is verified on six downstream human-centric visual tasks. It outperforms state-of-the-art methods and builds new baselines for these tasks. The code is released at https://github.com/tinyvision/SOLIDER.
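
The idea of sending a value to a controller to change the representation can be illustrated with a small sketch. This is not the authors' implementation: the class name `SemanticController`, the scale-and-shift modulation, the random parameters, and the convention that a larger `lam` means more semantic information are all assumptions made for illustration only. It shows a single feature vector being re-weighted by a scalar in [0, 1], so that one backbone output could serve tasks with different needs.

```python
import math
import random


class SemanticController:
    """Hypothetical controller (an assumption, not the paper's code).

    Maps a scalar lam in [0, 1] to a per-channel scale and shift,
    then applies them to a feature vector.
    """

    def __init__(self, channels, seed=0):
        rng = random.Random(seed)
        # Illustrative random parameters; a real model would learn these.
        self.w_scale = [rng.uniform(-1.0, 1.0) for _ in range(channels)]
        self.b_scale = [rng.uniform(-1.0, 1.0) for _ in range(channels)]
        self.w_shift = [rng.uniform(-1.0, 1.0) for _ in range(channels)]
        self.b_shift = [rng.uniform(-1.0, 1.0) for _ in range(channels)]

    def __call__(self, features, lam):
        if not 0.0 <= lam <= 1.0:
            raise ValueError("lam must lie in [0, 1]")
        if len(features) != len(self.w_scale):
            raise ValueError("feature length must match channel count")
        out = []
        for i, x in enumerate(features):
            # Softplus keeps the per-channel scale strictly positive.
            scale = math.log1p(math.exp(self.w_scale[i] * lam + self.b_scale[i]))
            shift = self.w_shift[i] * lam + self.b_shift[i]
            out.append(scale * x + shift)
        return out


# Toy backbone output; the values are made up for the example.
features = [0.5, -1.2, 2.0, 0.3]
controller = SemanticController(channels=len(features))

# Assumed convention: smaller lam leans toward appearance,
# larger lam leans toward semantics. The values are illustrative.
appearance_heavy = controller(features, lam=0.2)
semantic_heavy = controller(features, lam=1.0)
```

Under these assumptions, a downstream task such as person re-identification would request a representation with a different `lam` than a task such as human parsing, while the underlying features stay the same.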

Code

Repository	Stars
tinyvision/SOLIDER (official)	1,863
modelscope/modelscope	6,055
hasanirtiza/Pedestron	677
DengpanFu/LUPerson	217

Tasks

Human Parsing

Pedestrian Attribute Recognition

Pedestrian Detection

Person Re-Identification

Person Search

Pose Estimation

Self-Supervised Learning

Semantic Segmentation

Datasets

MS COCO

Market-1501

MSMT17

CityPersons

CUHK-SYSU

PRW

LIP

PA-100K

Results from the Paper

Ranked #1 on Person Search on PRW

