Even though convolutional neural networks can classify objects in images very accurately, it is well known that the attention of the network may not always be on the semantically important regions of the scene.
The detection of vehicular targets in infrared imagery is a challenging task, both due to the relatively few pixels on target and the false alarms produced by the surrounding terrain clutter.
We propose an efficient and straightforward method for compressing deep convolutional neural networks (CNNs) that uses basis filters to represent the convolutional layers, and optimizes the performance of the compressed network directly in the basis space.
Exploiting this fact, we aim to reduce the computations of our framework by employing a binary student network (BSN) to learn the frequently occurring classes using the pseudo-labels generated by the teacher network (TN) on an unlabeled image stream.
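A minimal sketch of the pseudo-labelling step described above, assuming a teacher network (TN) that labels an unlabeled stream and a student (BSN) trained only on the most frequent classes. The names `build_student_trainset` and `top_k`, and the toy teacher callable, are illustrative assumptions, not from the paper:

```python
from collections import Counter

def build_student_trainset(teacher, stream, top_k):
    """Label an unlabeled stream with the teacher, then keep only the
    samples belonging to the top_k most frequently occurring classes,
    which are the ones the binary student network is trained on."""
    pseudo = [(x, teacher(x)) for x in stream]        # TN pseudo-labels
    counts = Counter(label for _, label in pseudo)
    frequent = {c for c, _ in counts.most_common(top_k)}
    return [(x, y) for x, y in pseudo if y in frequent]

# Toy example: a stand-in "teacher" that maps each sample to one of 3 classes.
teacher = lambda x: x % 3
subset = build_student_trainset(teacher, range(10), top_k=2)
print(len(subset))  # 7 samples fall in the two most frequent classes
```

In a real pipeline the teacher would be a full-precision CNN and the returned subset would feed the binary student's training loop; the frequency cutoff is what reduces the student's workload.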
We further explore the non-linear feature space and conclude that our network operates not in a Euclidean space but rather in a Riemannian space.
Without the need for anomalous training images, we propose the Convolutional Adversarial Variational autoencoder with Guided Attention (CAVGA), which localizes the anomaly with a convolutional latent variable to preserve the spatial information.
Specifically, any convolutional layer of the CNN can easily be replaced by two successive convolutional layers: the first is a set of fixed filters (which represent the knowledge space of the entire layer and do not change), followed by a layer of one-dimensional (1x1) filters (which represent the learned knowledge in this space).
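The two-layer factorization above can be sketched in PyTorch as follows. This is a hedged illustration, not the authors' implementation: the class name `BasisConv2d`, the parameter `num_basis`, and the choice to freeze the basis via `requires_grad = False` are my assumptions about one plausible realization:

```python
import torch
import torch.nn as nn

class BasisConv2d(nn.Module):
    """Illustrative factorization of a conv layer into a frozen spatial
    'basis' convolution followed by a learnable 1x1 combination layer."""

    def __init__(self, in_ch, out_ch, kernel_size, num_basis, padding=0):
        super().__init__()
        # Fixed filters spanning the layer's knowledge space (not trained).
        self.basis = nn.Conv2d(in_ch, num_basis, kernel_size,
                               padding=padding, bias=False)
        for p in self.basis.parameters():
            p.requires_grad = False
        # Learnable 1x1 filters that recombine the basis responses.
        self.combine = nn.Conv2d(num_basis, out_ch, kernel_size=1)

    def forward(self, x):
        return self.combine(self.basis(x))

x = torch.randn(1, 16, 32, 32)
layer = BasisConv2d(in_ch=16, out_ch=64, kernel_size=3, num_basis=8, padding=1)
y = layer(x)
print(y.shape)  # torch.Size([1, 64, 32, 32])
```

When `num_basis` is much smaller than `out_ch`, the trainable parameter count drops substantially relative to a dense `Conv2d(16, 64, 3)`, which is the compression effect the abstract describes.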