MM-SEAL: A Large-scale Video Dataset of Multi-person Multi-grained Spatio-temporally Action Localization

no code implementations • 6 Apr 2022 • Shimin Chen, Wei Li, Chen Chen, Jianyang Gu, Jiaming Chu, Xunqiang Tao, Yandong Guo

We explore the relationship between atomic actions and complex activities, finding that atomic action features can improve complex activity localization performance.

Action Recognition, Spatio-Temporal Action Localization, +2

CRIS: CLIP-Driven Referring Image Segmentation

1 code implementation • CVPR 2022 • Zhaoqing Wang, Yu Lu, Qiang Li, Xunqiang Tao, Yandong Guo, Mingming Gong, Tongliang Liu

In addition, we present text-to-pixel contrastive learning, which explicitly enforces the text feature to be similar to the related pixel-level features and dissimilar to the irrelevant ones.

Contrastive Learning, Decoder, +4

