We introduce Retrieval Augmented Classification (RAC), a generic approach to augmenting standard image classification pipelines with an explicit retrieval module.
Ranked #2 on Long-tail Learning on iNaturalist 2018
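As a rough illustration of what augmenting a classifier with a retrieval module can look like, the sketch below adds similarity-weighted label votes from the nearest stored examples to a base classifier's logits. The snippet above does not describe how RAC fuses the two signals, so the additive fusion rule, the function names (`retrieval_scores`, `rac_predict`), and the toy data are all assumptions made for illustration and are not the authors' method.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieval_scores(query, index, num_classes, k=3):
    """Similarity-weighted label votes from the k most similar index entries."""
    ranked = sorted(index, key=lambda item: cosine(query, item[0]), reverse=True)
    scores = [0.0] * num_classes
    for emb, label in ranked[:k]:
        scores[label] += cosine(query, emb)
    return scores

def rac_predict(base_logits, query, index, k=3, weight=1.0):
    """Hypothetical fusion: base logits plus weighted retrieval votes."""
    votes = retrieval_scores(query, index, len(base_logits), k)
    fused = [b + weight * v for b, v in zip(base_logits, votes)]
    pred = max(range(len(fused)), key=fused.__getitem__)
    return pred, fused

# Toy index of (embedding, label) pairs; all values are made up.
index = [([1.0, 0.0], 0), ([0.9, 0.1], 0), ([0.0, 1.0], 1), ([0.1, 0.9], 1)]
base_logits = [0.5, 0.6]   # the base classifier alone slightly prefers class 1
query = [1.0, 0.05]        # the query embedding sits close to class-0 examples
pred, fused = rac_predict(base_logits, query, index, k=2)
```

In this toy case the two retrieved neighbours both carry label 0, so the fused scores favour class 0 even though the base logits alone favour class 1.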
In this study, we demonstrate that it is possible to pinpoint the location-of-recording to a certain geographical resolution using power signal recordings containing strong ENF traces.
More surprisingly, they show that well-trained networks enable scale-consistent predictions over long videos, although their accuracy is still inferior to that of traditional methods because they ignore geometric information.
One solution to this problem is to learn a deep neural network to infer the pose of a query image after learning on a dataset of images with known poses.
The advent of generative adversarial networks (GAN) has enabled new capabilities in synthesis, interpolation, and data augmentation heretofore considered very challenging.
In this work we present a self-supervised learning framework to simultaneously train two Convolutional Neural Networks (CNNs) to predict depth and surface normals from a single image.
Ranked #36 on Monocular Depth Estimation on KITTI Eigen split
Crucially, we obtain the confidence weights that parameterize the CRF model in a data-dependent manner via Convolutional Neural Networks (CNNs) which are trained to model the conditional depth error distributions given each source of input depth map and the associated RGB image.
Although learning-based methods show promising results in single view depth estimation and visual odometry, most existing approaches treat the tasks in a supervised manner.
Visual SLAM (Simultaneous Localization and Mapping) methods typically rely on handcrafted visual features or raw RGB values for establishing correspondences between images.
In the process, meaningful feature spaces are learned for each domain, the distances in which can be used for the task of place recognition.
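Using distances in a learned feature space for place recognition can be sketched as a nearest-neighbour lookup. The example below is only a minimal illustration: the feature vectors, the place names, the choice of Euclidean distance, and the optional `max_dist` rejection threshold are assumptions, since the snippet does not say which metric or matching rule is used.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recognize_place(query_feat, database, max_dist=None):
    """Return (place_id, distance) of the closest stored feature.

    If max_dist is given and the best match is farther than that,
    return (None, distance) to signal an unknown place.
    """
    best_id, best_d = None, float("inf")
    for place_id, feat in database.items():
        d = euclidean(query_feat, feat)
        if d < best_d:
            best_id, best_d = place_id, d
    if max_dist is not None and best_d > max_dist:
        return None, best_d
    return best_id, best_d

# Hypothetical learned features for three known places.
database = {
    "bridge": [0.9, 0.1, 0.0],
    "plaza": [0.1, 0.8, 0.3],
    "station": [0.0, 0.2, 0.9],
}
match, dist = recognize_place([0.2, 0.7, 0.3], database)
unknown, far = recognize_place([5.0, 5.0, 5.0], database, max_dist=1.0)
```

A linear scan is enough for a handful of places; a larger database would normally use an approximate nearest-neighbour index instead.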
Rare diseases are very difficult to identify among a large number of other possible diagnoses.
In this work we propose an unsupervised framework to learn a deep convolutional neural network for single view depth prediction, without requiring a pre-training stage or annotated ground truth depths.
Additionally, we evaluate our method on the challenging problem of Non-Rigid Structure from Motion, where our approach delivers promising results on the CMU mocap dataset despite the presence of significant occlusions and noise.
This paper offers the first variational approach to the problem of dense 3D reconstruction of non-rigid surfaces from a monocular video sequence.