6 papers with code • 3 benchmarks • 4 datasets
Zero-shot object detection (ZSD) is the task of object detection where no visual training data is available for some of the target object classes.
We hypothesize that this setting is ill-suited for real-world applications where unseen objects appear only as a part of a complex scene, warranting both the `recognition' and `localization' of an unseen category.
Ranked #2 on Zero-Shot Object Detection on ImageNet Detection
This setting gives rise to the need for correct alignment between visual and semantic concepts, so that the unseen objects can be identified using only their semantic attributes.
Ranked #1 on Zero-Shot Object Detection on MS-COCO
The existing zero-shot detection approaches project visual features to the semantic domain for seen objects, hoping to map unseen objects to their corresponding semantics during inference.
Ranked #1 on Zero-Shot Object Detection on PASCAL VOC'07
Object detection is considered as one of the most challenging problems in computer vision, since it requires correct prediction of both classes and locations of objects in images.
The major contributions for BLC are as follows: (i) we propose a multi-stage cascade structure named Cascade Semantic R-CNN to progressively refine the alignment between visual and semantic of ZSD; (ii) we develop the semantic information flow structure and directly add it between each stage in Cascade Semantic RCNN to further improve the semantic feature learning; (iii) we propose the background learnable region proposal network (BLRPN) to learn an appropriate word vector for background class and use this learned vector in Cascade Semantic R CNN, this design makes \Background Learnable" and reduces the confusion between background and unseen classes.