Human Parsing
56 papers with code • 1 benchmark • 2 datasets
Human parsing is the task of segmenting a human image into different fine-grained semantic parts such as head, torso, arms and legs.
(Image credit: Multi-Human-Parsing (MHP))
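As a rough illustration of the task definition above, a parsing result can be treated as a per-pixel label map. The sketch below uses hypothetical label ids (0 = background, 1 = head, 2 = torso, 3 = arms, 4 = legs), not taken from any particular dataset, and computes per-part pixel counts plus mean intersection-over-union (mIoU), a metric commonly used for segmentation. The function names and toy arrays are assumptions made for this example.

```python
from collections import Counter

# Hypothetical label ids for illustration only; real datasets define their own.
PART_NAMES = {0: "background", 1: "head", 2: "torso", 3: "arms", 4: "legs"}


def part_pixel_counts(label_map):
    """Count how many pixels are assigned to each part label."""
    counts = Counter(label for row in label_map for label in row)
    return {PART_NAMES[k]: v for k, v in sorted(counts.items())}


def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in either pred or target."""
    ious = []
    for c in range(num_classes):
        inter = 0
        union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                in_pred, in_target = (p == c), (t == c)
                inter += int(in_pred and in_target)
                union += int(in_pred or in_target)
        if union:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy 3x4 "image": ground truth and a prediction that misses one arm pixel.
target = [
    [0, 1, 1, 0],
    [3, 2, 2, 3],
    [0, 4, 4, 0],
]
pred = [
    [0, 1, 1, 0],
    [3, 2, 2, 0],
    [0, 4, 4, 0],
]

if __name__ == "__main__":
    print(part_pixel_counts(target))
    print(round(mean_iou(pred, target, num_classes=5), 4))
```

In this toy case the single mislabeled arm pixel lowers the IoU of both the arms class and the background class, which is why per-class averaging is sensitive to small parts.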
Latest papers
DROP: Decouple Re-Identification and Human Parsing with Task-specific Features for Occluded Person Re-identification
Unlike mainstream approaches using global features for simultaneous multi-task learning of ReID and human parsing, or relying on semantic information for attention guidance, DROP argues that the inferior performance of the former is due to distinct granularity requirements for ReID and human parsing features.
Explore Human Parsing Modality for Action Recognition
Multimodal-based action recognition methods have achieved high success using pose and RGB modalities.
Part Representation Learning with Teacher-Student Decoder for Occluded Person Re-identification
In addition, existing occluded person ReID benchmarks utilize occluded samples as queries, which will amplify the role of alleviating occlusion interference and underestimate the impact of the feature absence issue.
UniParser: Multi-Human Parsing with Unified Correlation Representation Learning
Multi-human parsing is an image segmentation task necessitating both instance-level and fine-grained category-level information.
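To make the two kinds of information concrete, the following minimal sketch combines a person-instance map with a part-category map into per-person part regions. It is not UniParser's method; the function name `combine_instance_and_parts`, the id conventions (0 = background), and the toy maps are assumptions for illustration only.

```python
def combine_instance_and_parts(instance_map, part_map):
    """Group pixel coordinates by (person id, part id), skipping background (0)."""
    regions = {}
    for r, (inst_row, part_row) in enumerate(zip(instance_map, part_map)):
        for c, (inst, part) in enumerate(zip(inst_row, part_row)):
            if inst == 0 or part == 0:
                continue  # background in either map carries no part instance
            regions.setdefault((inst, part), []).append((r, c))
    return regions


# Toy 2x5 maps: two people side by side, each with part 1 on top and part 2 below.
instance_map = [
    [1, 1, 0, 2, 2],
    [1, 1, 0, 2, 2],
]
part_map = [
    [1, 1, 0, 1, 1],
    [2, 2, 0, 2, 2],
]

regions = combine_instance_and_parts(instance_map, part_map)

if __name__ == "__main__":
    for key in sorted(regions):
        print(key, regions[key])
```

Neither map alone is sufficient here: the part map cannot tell the two people apart, and the instance map cannot tell their parts apart.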
Parsing is All You Need for Accurate Gait Recognition in the Wild
Furthermore, due to the lack of suitable datasets, we build the first parsing-based dataset for gait recognition in the wild, named Gait3D-Parsing, by extending the large-scale and challenging Gait3D dataset.
DM-VTON: Distilled Mobile Real-time Virtual Try-On
Additionally, we propose Virtual Try-on-guided Pose for Data Synthesis to address the limited pose variation observed in training images.
Integrating Human Parsing and Pose Network for Human Action Recognition
We propose an Integrating Human Parsing and Pose Network (IPP-Net) for action recognition, which is the first to leverage both skeletons and human parsing feature maps in a dual-branch approach.
Single-stage Multi-human Parsing via Point Sets and Center-based Offsets
We instead present a high-performance Single-stage Multi-human Parsing (SMP) deep architecture that decouples the multi-human parsing problem into two fine-grained sub-problems, i.e., locating the human body and parts.
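The general idea of grouping parts by center-based offsets can be sketched as follows. This is a generic illustration, not the SMP architecture itself: it assumes each located part point comes with a predicted offset toward its body center and assigns the point to whichever detected center lies nearest to that vote. The function name `assign_parts_to_centers` and all coordinates are made up for the example.

```python
import math


def assign_parts_to_centers(part_points, offsets, centers):
    """Assign each part point to the body center nearest to (point + offset)."""
    assignments = []
    for (x, y), (dx, dy) in zip(part_points, offsets):
        voted = (x + dx, y + dy)  # where this part believes its body center is
        nearest = min(
            range(len(centers)),
            key=lambda i: math.dist(voted, centers[i]),
        )
        assignments.append(nearest)
    return assignments


# Two hypothetical body centers and four part points with predicted offsets.
centers = [(10.0, 20.0), (40.0, 22.0)]
part_points = [(8.0, 5.0), (12.0, 35.0), (41.0, 6.0), (25.0, 21.0)]
offsets = [(2.0, 14.0), (-1.5, -15.0), (-1.0, 17.0), (14.0, 1.0)]

assignments = assign_parts_to_centers(part_points, offsets, centers)

if __name__ == "__main__":
    print(assignments)
```

The last point sits midway between the two centers, so its position alone would be ambiguous; the offset vote is what resolves the assignment.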
Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks
Unlike the existing self-supervised learning methods, prior knowledge from human images is utilized in SOLIDER to build pseudo semantic labels and import more semantic information into the learned representation.
HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining
Specifically, we propose HumanBench, based on existing datasets, to comprehensively evaluate on common ground the generalization abilities of different pretraining methods on 19 datasets from 6 diverse downstream tasks, including person ReID, pose estimation, human parsing, pedestrian attribute recognition, pedestrian detection, and crowd counting.