Videos

V-HICO

Introduced by Li et al. in Weakly Supervised Human-Object Interaction Detection in Video via Contrastive Spatiotemporal Regions

V-HICO is a dataset for human-object interaction in videos. It contains 6,594 videos: 5,297 training videos, 635 validation videos, 608 test videos, and 54 unseen test videos. To test model performance on common human-object interaction classes and generalization to new ones, we provide two test splits: the first has the same human-object interaction classes as the training split, while the second consists of unseen novel classes.
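As a quick sanity check on the split sizes quoted above, the following minimal sketch records them in a dictionary and confirms that they add up to the stated total. The split names (`train`, `val`, `test`, `unseen_test`) and the helper `split_fractions` are illustrative assumptions, not identifiers from the dataset release.

```python
# Split sizes as stated in the dataset description.
# The key names are hypothetical labels chosen for this sketch.
SPLIT_SIZES = {
    "train": 5297,
    "val": 635,
    "test": 608,
    "unseen_test": 54,
}


def split_fractions(sizes):
    """Return each split's share of the total number of videos."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total_videos = sum(SPLIT_SIZES.values())
fractions = split_fractions(SPLIT_SIZES)

for name, share in fractions.items():
    print(f"{name}: {SPLIT_SIZES[name]} videos ({share:.1%})")
print(f"total: {total_videos} videos")
```

The four splits sum to the 6,594 videos reported for the dataset, and the unseen test split is by far the smallest of the four.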

V-HICO consists of 244 object classes and 99 action classes, giving 756 action-object pair classes in total. The unseen test set contains 51 object classes and 32 action classes, with 52 action-object pair classes. All videos are labeled with text annotations of the human action and the associated object. The test and unseen test sets also include bounding box annotations for both the human and the object.

Dataset Loaders

activeloopai/Hub (7,703 stars)

Similar Datasets

VidSTG
