Visual-Spatial-Graph Network (VSGNet) is a network for human-object interaction detection. It extracts visual features from the image representing the human-object pair, refines the features with spatial configurations of the pair, and utilizes the structural connections between the pair via graph convolutions.
Source: VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph ConvolutionsPaper | Code | Results | Date | Stars |
---|
Component | Type |
|
---|---|---|
🤖 No Components Found | You can add them if they exist; e.g. Mask R-CNN uses RoIAlign |