Human-Object Interaction Detection

132 papers with code • 6 benchmarks • 22 datasets

Human-Object Interaction (HOI) detection is the task of identifying the set of interactions in an image. It involves (i) localizing the subject (the human) and the target (the object) of each interaction, and (ii) classifying the interaction label.
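The output described above can be pictured as a list of records, each pairing a human box and an object box with an interaction label. The following is a minimal sketch of that structure, not any benchmark's official format: the class name `HOIInstance`, the helper `interactions_above`, the `object_label` and `score` fields, and the box convention are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class HOIInstance:
    """One detected interaction (hypothetical structure for illustration)."""
    human_box: Box       # localization of the subject (the human)
    object_box: Box      # localization of the target (the object)
    object_label: str    # object category; an assumed extra field
    interaction: str     # classified interaction label, e.g. "ride"
    score: float = 1.0   # detector confidence; an assumed extra field


def interactions_above(instances: List[HOIInstance],
                       threshold: float) -> List[Tuple[str, str]]:
    """Return (interaction, object_label) pairs whose score meets the threshold."""
    return [(i.interaction, i.object_label)
            for i in instances if i.score >= threshold]


# Illustrative detections for a single image (made-up values).
detections = [
    HOIInstance((10, 20, 110, 300), (90, 150, 200, 260), "bicycle", "ride", 0.92),
    HOIInstance((10, 20, 110, 300), (300, 40, 360, 90), "cup", "hold", 0.15),
]

kept = interactions_above(detections, threshold=0.5)
print(kept)
```

The same human box appears in both records because one person may take part in several interactions, which is why the result is a set of interactions rather than one label per image.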

Latest papers with no code

Generating Human Interaction Motions in Scenes with Text Control

no code yet • 16 Apr 2024

Our approach begins with pre-training a scene-agnostic text-to-motion diffusion model, emphasizing goal-reaching constraints on large-scale motion-capture datasets.

HOI-M3: Capture Multiple Humans and Objects Interaction within Contextual Environment

no code yet • 30 Mar 2024

Humans naturally interact with both others and the surrounding multiple objects, engaging in various social activities.

InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction

no code yet • 28 Mar 2024

However, extending such success to 3D dynamic human-object interaction (HOI) generation faces notable challenges, primarily due to the lack of large-scale interaction data and comprehensive descriptions that align with these interactions.

InterFusion: Text-Driven Generation of 3D Human-Object Interaction

no code yet • 22 Mar 2024

In this study, we tackle the complex task of generating 3D human-object interactions (HOI) from textual descriptions in a zero-shot text-to-3D manner.

FORCE: Dataset and Method for Intuitive Physics Guided Human-object Interaction

no code yet • 17 Mar 2024

Our key insight is that human motion is dictated by the interrelation between the force exerted by the human and the perceived resistance.

THOR: Text to Human-Object Interaction Diffusion via Relation Intervention

no code yet • 17 Mar 2024

This paper proposes new methods for the challenging task of generating dynamic Human-Object Interactions from textual descriptions (Text2HOI).

Enhancing Human-Centered Dynamic Scene Understanding via Multiple LLMs Collaborated Reasoning

no code yet • 15 Mar 2024

Human-centered dynamic scene understanding plays a pivotal role in enhancing the capability of robotic and autonomous systems. Within it, Video-based Human-Object Interaction (V-HOI) detection is a crucial semantic scene understanding task that aims to comprehensively understand HOI relationships within a video, to benefit the behavioral decisions of mobile robots and autonomous driving systems.

Towards Zero-shot Human-Object Interaction Detection via Vision-Language Integration

no code yet • 12 Mar 2024

Human-object interaction (HOI) detection aims to locate human-object pairs and identify their interaction categories in images.

Test-time Distribution Learning Adapter for Cross-modal Visual Reasoning

no code yet • 10 Mar 2024

Several approaches aim to efficiently adapt VLP models to downstream tasks with limited supervision, leveraging the knowledge the VLP models have already acquired.

FreeA: Human-object Interaction Detection using Free Annotation Labels

no code yet • 4 Mar 2024

Recent human-object interaction (HOI) detection approaches rely on costly manual labor and require comprehensively annotated image datasets.