In this paper, we propose a new system to discriminatively embed the image and text to a shared visual-textual space.
In the benchmark evaluation, the proposed framework achieves the state-of-the-art performance on the CARS196, CUB200-2011, In-shop Clothes and Stanford Online Products on image retrieval tasks by a large margin compared to competing approaches.
Where previous reviews on content-based image retrieval emphasize on what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image.
Deep metric learning aims to learn a function mapping image pixels to embedding feature vectors that model the similarity between images.
Query images presented to content-based image retrieval systems often have various different interpretations, making it difficult to identify the search objective pursued by the user.
Multiple instance learning (MIL) is a variation of traditional supervised learning problems where data (referred to as bags) are composed of sub-elements (referred to as instances) and only bag labels are available.
An adversarial query is an image that has been modified to disrupt content-based image retrieval (CBIR) while appearing nearly untouched to the human eye.
The analysis of natural disasters such as floods in a timely manner often suffers from limited data due to a coarse distribution of sensors or sensor failures.
We propose Information-Theoretic Active Learning (ITAL), a novel batch-mode active learning method for binary classification, and apply it for acquiring meaningful user feedback in the context of content-based image retrieval.