A Zero-Shot Sketch-based Inter-Modal Object Retrieval Scheme for Remote Sensing Images
WITH the advancement in sensor technology, huge amounts of data are being collected from various satellites. Hence, the task of target-based data retrieval and acquisition has become exceedingly challenging. Existing satellites essentially scan a vast overlapping region of the Earth using various sensing techniques, like multi-spectral, hyperspectral, Synthetic Aperture Radar (SAR), video, and compressed sensing, to name a few. With increasing complexity and different sensing techniques at our disposal, it has become our primary interest to design efficient algorithms to retrieve data from multiple data modalities, given the complementary information that is captured by different sensors. This type of problem is referred to as inter-modal data retrieval. In remote sensing (RS), there are primarily two important types of problems, i.e., land-cover classification and object detection. In this work, we focus on the target-based object retrieval part, which falls under the realm of object detection in RS. Object retrieval essentially requires high-resolution imagery for objects to be distinctly visible in the image. The main challenge with the conventional retrieval approach using large-scale databases is that, quite often, we do not have any query image sample of the target class at our disposal. The target of interest solely exists as a perception to the user in the form of an imprecise sketch. In such situations where a photo query is absent, it can be immensely useful if we can promptly make a quick hand-made sketch of the target. Sketches are a highly symbolic and hieroglyphic representation of data. One can exploit the notion of this minimalistic representative of sketch queries for sketch-based image retrieval (SBIR) framework. While dealing with satellite images, it is imperative to collect as many samples of images as possible for each object class for object recognition with a high success rate. However, in general, there exists a considerable number of classes for which we seldom have any training data samples. Therefore, for such classes, we can use the zero-shot learning (ZSL) strategy. The ZSL approach aims to solve a task without receiving any example of that task during the training phase. This makes the network capable of handling an unseen class (new class) sample obtained during the inference phase upon deployment of the network. Hence, we propose the aerial sketch-image dataset, namely Earth on Canvas dataset.
Classes in this dataset: Airplane, Baseball Diamond, Buildings, Freeway, Golf Course, Harbor, Intersection, Mobile home park, Overpass, Parking lot, River, Runway, Storage tank, Tennis court.