We propose a novel CNN architecture called ACTNET for robust instance image retrieval from large-scale datasets.
On the image retrieval datasets Holidays, Oxford and MPEG, the REMAP descriptor achieves mAP of 95.5%, 91.5%, and 80.1%, respectively, outperforming any results published to date.
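For readers unfamiliar with the metric, the following is a minimal sketch of how mean average precision (mAP) is commonly computed from ranked retrieval results. The function names and the input layout (a list of relevance flags per query) are assumptions made for illustration; the official evaluation protocols of the individual benchmarks may differ in details.

```python
def average_precision(ranked_relevance, num_relevant):
    """Average precision for one query.

    ranked_relevance: booleans in rank order, True where the retrieved
    item is relevant (hypothetical input layout).
    num_relevant: total number of relevant items for the query.
    """
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision at this recall point
    return total / num_relevant if num_relevant else 0.0


def mean_average_precision(queries):
    """Mean of per-query average precision.

    queries: list of (ranked_relevance, num_relevant) pairs.
    """
    return sum(average_precision(r, n) for r, n in queries) / len(queries)
```

For example, a query whose relevant items appear at ranks 1 and 3 (out of two relevant items) has an average precision of (1/1 + 2/3) / 2.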
To address this issue, we present Dynamic Mode Decomposition (DMD) coupled with thresholding and blob analysis as a framework for automatic delineation of the kidney region.
Traditionally, human experts are required to manually delineate the kidney ROI across multiple images in the dynamic sequence.
Since it helps to enhance the accuracy and the consistency of the resulting interpretation, visual context reasoning is often incorporated with visual perception in current deep end-to-end visual semantic information pursuit methods.
This work addresses the problem of accurate semantic labelling of short videos.
We investigate factors controlling DNN diversity in the context of the Google Cloud and YouTube-8M Video Understanding Challenge.
A background model describes a scene without any foreground objects and has a number of applications, ranging from video surveillance to computational photography.
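As an illustration of what a background model can look like, the sketch below computes a per-pixel temporal median over a stack of grayscale frames. This is a common baseline and is not claimed to be the method proposed here; the function name and the nested-list frame layout are assumptions chosen to keep the example dependency-free.

```python
from statistics import median


def median_background(frames):
    """Estimate a background image as the per-pixel temporal median.

    frames: non-empty list of equally sized 2D lists of grayscale
    values (hypothetical layout: frames[t][y][x]).
    """
    if not frames:
        raise ValueError("at least one frame is required")
    height = len(frames[0])
    width = len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]
```

A pixel covered by a foreground object in fewer than half of the frames keeps its background value, which is why the median is a popular starting point for scenes with moving objects.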
To this end, we propose a novel, unsupervised approach to thresholded search in Hamming space, supporting long codes (e.g. 512 bits) with a wide range of Hamming distance radii.
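To make the task concrete, the sketch below shows a brute-force baseline for thresholded (radius) search in Hamming space: return every database code within a given Hamming distance of the query. It is a linear scan and not the approach proposed here; the function names and the choice to store codes as Python integers are assumptions for illustration.

```python
def hamming_distance(a, b):
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def radius_search(query, codes, radius):
    """Linear-scan thresholded search.

    Returns (index, distance) pairs for every code whose Hamming
    distance to `query` is at most `radius`, sorted by distance.
    """
    hits = []
    for index, code in enumerate(codes):
        distance = hamming_distance(query, code)
        if distance <= radius:
            hits.append((index, distance))
    hits.sort(key=lambda hit: (hit[1], hit[0]))
    return hits
```

Because Python integers have arbitrary precision, the same code handles 512-bit codes unchanged. The cost grows linearly with the database size, which is the limitation that indexing schemes for Hamming space aim to avoid.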