Large-scale pretraining of visual representations has led to state-of-the-art performance on a range of benchmark computer vision tasks, yet the benefits of these techniques at extreme scale in complex production systems have been relatively unexplored.
Perceptual learning approaches like perceptual loss are empirically powerful for such tasks, but they usually rely on a pre-trained classification network to provide features, which are not necessarily optimal in terms of the visual perception of image transformations.
As online content becomes ever more visual, the demand for searching by visual queries grows correspondingly stronger.
However, many studies have shown that global information is crucial for image restoration tasks such as image demosaicing and enhancement.
We present an open-set logo detection (OSLD) system, which can detect (localize and recognize) any number of unseen logo classes without re-training; it only requires a small set of canonical logo images for each logo class.
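The open-set idea above, recognizing new classes from a few canonical images without re-training, can be illustrated with a nearest-exemplar match in embedding space. This is only a minimal sketch under stated assumptions, not the system's actual method: the function names (`cosine`, `recognize`), the toy 3-dimensional embeddings, the class names, and the 0.8 threshold are all hypothetical, and localization is left out entirely.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def recognize(query, gallery, threshold=0.8):
    """Match a query embedding against canonical exemplars.

    gallery maps a class label to a list of exemplar embeddings.
    Returns (label, score); label is None when no exemplar is
    similar enough, i.e. the region is treated as "not a known logo".
    The threshold value is an illustrative assumption.
    """
    best_label, best_score = None, -1.0
    for label, exemplars in gallery.items():
        for exemplar in exemplars:
            score = cosine(query, exemplar)
            if score > best_score:
                best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score

# Toy embeddings standing in for the output of some embedding network.
gallery = {
    "acme": [[1.0, 0.0, 0.0]],
    "globex": [[0.0, 1.0, 0.0]],
}

# Adding an unseen class only means adding its canonical embeddings;
# nothing is re-trained.
gallery["initech"] = [[0.0, 0.0, 1.0]]

label, score = recognize([0.0, 0.1, 0.95], gallery)
```

In this sketch the set of recognizable classes is just the contents of `gallery`, which is why supporting a new logo reduces to a dictionary update.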
The solution we present not only allows us to train for multiple application objectives in a single deep neural network architecture, but also takes advantage of correlated information in the combined training data from all applications to generate a unified embedding that outperforms all specialized embeddings previously deployed for each product.
Deep metric learning aims to learn a function mapping image pixels to embedding feature vectors that model the similarity between images.
Ranked #3 on Image Retrieval on CARS196
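One common way to train the kind of embedding function that deep metric learning describes is a triplet loss, which pulls an anchor toward a same-class sample and pushes it away from a different-class sample. The sketch below is an assumption chosen for illustration, not the objective of any specific paper listed here; the helper names and the 0.2 margin are hypothetical, and plain lists stand in for network outputs.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (returns a copy if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on L2-normalized embeddings.

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin` (an illustrative value).
    """
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    return max(0.0, sq_dist(a, p) - sq_dist(a, n) + margin)

# Well-separated triplet: positive coincides with the anchor direction.
easy = triplet_loss([2.0, 0.0], [1.0, 0.0], [0.0, 1.0])

# Violating triplet: the "positive" is orthogonal, the "negative" matches.
hard = triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

At retrieval time the same distance is used to rank gallery images for a query, which is the setting that benchmarks such as CARS196 image retrieval evaluate.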