In this paper, we introduce a novel method, detection prompt (DetPro), to learn continuous prompt representations for open-vocabulary object detection based on a pre-trained vision-language model.
Last, to enhance the embedding space learning, an additional pixel-wise metric learning module is introduced with triplet loss formulated on the pixel-level embedding of the input image.
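As a minimal sketch of the triplet loss mentioned above, the following computes the loss for one (anchor, positive, negative) triplet of pixel embeddings. The Euclidean distance, the margin value, and the function names are assumptions made for illustration; the source does not specify the distance metric, the margin, or how triplets are sampled.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss for a single triplet of pixel embeddings.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is. The margin of 1.0 is an assumed default.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D pixel embeddings, chosen only for illustration.
well_separated = triplet_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0])
violating = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 0.0])
```

In practice such a loss would be averaged over many sampled pixel triplets per image; the sampling strategy is left open here.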
The system uses a hybrid of content-based and collaborative filtering techniques to rank items for editors, relying on both item features and previous item-editor interactions.
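One common way to combine the two signals is a weighted sum of a content-based score and a collaborative score per item. The sketch below illustrates that idea only; the blending weight `alpha`, the score dictionaries, and the function names are hypothetical, and the source does not state how the system actually combines the two techniques.

```python
def hybrid_scores(content_scores, collab_scores, alpha=0.5):
    """Blend per-item scores from the two recommenders.

    content_scores: item -> score derived from item features.
    collab_scores:  item -> score derived from past item-editor interactions.
    Items with no interaction history fall back to a collaborative score of 0.
    """
    return {
        item: alpha * score + (1.0 - alpha) * collab_scores.get(item, 0.0)
        for item, score in content_scores.items()
    }


def rank_items(content_scores, collab_scores, alpha=0.5):
    """Return items ordered from highest to lowest blended score."""
    scores = hybrid_scores(content_scores, collab_scores, alpha)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical scores for one editor, chosen only for illustration.
content = {"a": 0.9, "b": 0.2, "c": 0.5}
collab = {"a": 0.1, "b": 0.9}
ranking = rank_items(content, collab, alpha=0.5)
```

Setting `alpha` closer to 1 favors item features, which helps for new items that have no interaction history yet.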
In this paper, we restore the negative information in few-shot object detection by introducing a new negative- and positive-representative based metric learning framework and a new inference scheme with negative and positive representatives.
This paper presents a DNN bottleneck reinforcement scheme to alleviate the vulnerability of deep neural networks (DNNs) to adversarial attacks.
We formulate the mutual transformations between the outputs of regression- and detection-based models as two scene-agnostic transformers which enable knowledge distillation between the two models.
Weakly-supervised object detection attempts to limit the amount of supervision by dispensing with the need for bounding boxes, but still assumes image-level labels on the entire training set.
Ranked #21 on Weakly Supervised Object Detection on PASCAL VOC 2012 test (using extra training data)
Finally, we propose a curriculum learning strategy that trains the network first on images with relatively accurate and easy pseudo ground truth.
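An easy-first curriculum of this kind can be sketched as sorting images by an estimate of pseudo-ground-truth reliability and training on progressively larger subsets. The `confidence` function, the number of stages, and the cumulative-subset schedule below are assumptions for illustration; the source does not describe how accuracy or easiness is measured or how the schedule is defined.

```python
def curriculum_stages(samples, confidence, num_stages=3):
    """Split samples into cumulative training stages, easiest first.

    samples:    list of training items (any type).
    confidence: hypothetical function mapping an item to a score, where a
                higher score means more reliable pseudo ground truth.
    Stage k contains the top k/num_stages fraction of samples (rounded up),
    so the last stage covers the full training set.
    """
    ordered = sorted(samples, key=confidence, reverse=True)
    stages = []
    for k in range(1, num_stages + 1):
        # Ceiling division using integers only.
        cutoff = -(-len(ordered) * k // num_stages)
        stages.append(ordered[:cutoff])
    return stages


# Hypothetical (image id, confidence) pairs, chosen only for illustration.
data = [("img1", 0.9), ("img2", 0.3), ("img3", 0.6)]
stages = curriculum_stages(data, confidence=lambda s: s[1], num_stages=3)
```

A training loop would then iterate over `stages` in order, running one or more epochs on each subset before moving to the next.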
We propose to help weakly supervised object localization for classes where location annotations are not available, by transferring things and stuff knowledge from a source set with available annotations.
We present a technique for weakly supervised object localization (WSOL), building on the observation that WSOL algorithms usually work better on images with bigger objects.
Then, we show the benefit of using this strategy in an asymmetrical manner, with only the database features being aggregated but not those of the query.