Parsing R-CNN for Instance-Level Human Analysis

CVPR 2019 · Lu Yang, Qing Song, Zhihui Wang, Ming Jiang

Instance-level human analysis is common in real-life scenarios and takes many forms, such as human part segmentation, dense pose estimation, and human-object interaction. Models need to distinguish the different human instances in an image and learn rich features that represent the details of each instance. In this paper, we present Parsing R-CNN, an end-to-end pipeline for instance-level human analysis. It processes a set of human instances simultaneously by jointly considering the characteristics of region-based approaches and the appearance of a human, which allows it to represent the details of each instance. Parsing R-CNN is flexible and efficient, and is applicable to many tasks in human instance analysis. Our approach outperforms all state-of-the-art methods on the CIHP (Crowd Instance-level Human Parsing), MHP v2.0 (Multi-Human Parsing) and DensePose-COCO datasets. Based on the proposed Parsing R-CNN, we reached 1st place in the COCO 2018 Challenge DensePose Estimation task. Code and models are publicly available.


Results from the Paper

Task                     Dataset         Model                       Metric Name  Metric Value  Global Rank
Human Part Segmentation  CIHP            Parsing R-CNN + ResNext101  Mean IoU     61.1          # 2
Pose Estimation          DensePose-COCO  Parsing R-CNN + ResNext101  AP           61.6          # 1
Human Part Segmentation  MHP v2.0        Parsing R-CNN + ResNext101  Mean IoU     41.8          # 1
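
The Mean IoU values reported for the part segmentation benchmarks are, in general terms, per-class intersection-over-union scores averaged over classes. The following is a minimal sketch of that metric, not the benchmarks' official evaluation code. It assumes predictions and ground truth are flat lists of integer class ids; the function name `mean_iou` and the choice to skip classes absent from both maps are illustrative assumptions.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes that appear in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where both maps agree on class c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: ignore classes absent from both maps
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example: 6 pixels, 3 classes (0 = background, 1 and 2 = body parts).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mean IoU: {score:.3f}")
```

The function returns a fraction between 0 and 1, whereas the table values appear to be on a 0 to 100 scale. The AP metric used for DensePose-COCO follows a different, detection-style protocol and is not covered by this sketch.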