HOI Analysis: Integrating and Decomposing Human-Object Interaction

Human-Object Interaction (HOI) consists of a human, an object, and an implicit interaction/verb. Unlike previous methods that directly map pixels to HOI semantics, we propose a novel, analytical perspective on HOI learning. By analogy with Harmonic Analysis, which studies how to represent signals as a superposition of basic waves, we propose HOI Analysis. We argue that a coherent HOI can be decomposed into an isolated human and an isolated object, and that the isolated human and object can in turn be integrated into a coherent HOI again. Moreover, transformations between human-object pairs with the same HOI can also be approached more easily through integration and decomposition. As a result, the implicit verb is represented in the transformation function space. In light of this, we propose an Integration-Decomposition Network (IDN) to implement the above transformations and achieve state-of-the-art performance on widely-used HOI detection benchmarks. Code is available at https://github.com/DirtyHarryLYL/HAKE-Action-Torch/tree/IDN-(Integrating-Decomposing-Network).
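
The idea that a verb lives in a transformation function space can be illustrated with a toy sketch. This is not the IDN implementation: it assumes each verb is a single random linear map (with its pseudo-inverse standing in for decomposition), and the feature size, verb names, and function names are all hypothetical. It only shows the principle that the verb can be read off as the transformation that best maps isolated human and object features onto the coherent HOI feature.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8                           # hypothetical feature size
VERBS = ["hold", "ride", "eat"]   # hypothetical verb set

# Assumption: each verb owns one linear integration map (2*DIM -> DIM).
# Its pseudo-inverse serves as the decomposition map (DIM -> 2*DIM).
INTEGRATE = {v: rng.normal(size=(2 * DIM, DIM)) for v in VERBS}
DECOMPOSE = {v: np.linalg.pinv(w) for v, w in INTEGRATE.items()}


def integrate(f_h, f_o, verb):
    """Map isolated human and object features to a coherent HOI feature."""
    return np.concatenate([f_h, f_o]) @ INTEGRATE[verb]


def decompose(f_u, verb):
    """Map a coherent HOI feature back to isolated human and object features."""
    parts = f_u @ DECOMPOSE[verb]
    return parts[:DIM], parts[DIM:]


def predict_verb(f_h, f_o, f_u):
    """Pick the verb whose integration map best explains the coherent feature."""
    dists = {v: np.linalg.norm(integrate(f_h, f_o, v) - f_u) for v in VERBS}
    return min(dists, key=dists.get)


f_h, f_o = rng.normal(size=DIM), rng.normal(size=DIM)
f_u = integrate(f_h, f_o, "ride")   # coherent feature built with the "ride" map
print(predict_verb(f_h, f_o, f_u))
```

In this toy setting, decomposing a coherent feature and integrating the result with the same verb recovers the original feature, which mirrors the abstract's claim that the two transformations are inverse directions of one another. A learned model would replace the random linear maps with trained networks.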

NeurIPS 2020
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Human-Object Interaction Detection | HICO-DET | IDN (finetuned detector) | mAP | 26.29 | # 31 |
| Human-Object Interaction Detection | HICO-DET | IDN (COCO detector) | mAP | 23.36 | # 37 |
| Human-Object Interaction Detection | V-COCO | IDN | AP(S1) | 53.3 | # 20 |
| Human-Object Interaction Detection | V-COCO | IDN | AP(S2) | 60.3 | # 17 |
