This paper investigates representation learning for large-scale visual place recognition, which consists of determining the location depicted in a query image by referring to a database of reference images.
In this work, we propose to use simulated forest environments to automatically generate 43,000 realistic synthetic images with pixel-level annotations, and use them to train deep learning algorithms for tree detection.
Using our dataset, we then compare three neural network architectures on the task of individual log detection and segmentation: two region-based methods and one attention-based method.
We prove new generalization bounds for stochastic gradient descent in both the convex and non-convex cases.
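As an illustration only (not taken from the paper), the algorithm such bounds analyze is plain stochastic gradient descent: one randomly sampled example per parameter update. A minimal sketch on a toy scalar least-squares problem, with hypothetical names `sgd` and `grad`, might look like:

```python
import random

def sgd(grad, w0, data, lr=0.1, epochs=50, seed=0):
    """Plain SGD: shuffle the data each epoch, update on one sample at a time."""
    rng = random.Random(seed)
    w = w0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            w -= lr * grad(w, x, y)
    return w

# Toy problem: fit y = w * x with squared loss (w*x - y)^2,
# whose gradient in w is 2 * (w*x - y) * x.
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]
grad = lambda w, x, y: 2.0 * (w * x - y) * x
w = sgd(grad, 0.0, data)  # converges toward the true slope 3.0
```

The per-sample (rather than full-batch) update is exactly what makes the generalization analysis of SGD stability-based and nontrivial.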
This paper introduces the Indian Chefs Process (ICP), a Bayesian nonparametric prior on the joint space of infinite directed acyclic graphs (DAGs) and orders that generalizes the Indian Buffet Process.
In this context, we propose a generic 2D object instance detection approach that uses example viewpoints of the target object at test time to retrieve its 2D location in RGB images, without requiring any additional training (i.e., fine-tuning) step.
One approach to exploiting such understanding would be to make the bias explicit in the loss function.
For localization and mapping, we employ efficient direct tracking on the truncated signed distance function (TSDF) and leverage color information encoded in the TSDF to estimate the pose of the sensor.
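To make the representation concrete: a TSDF stores, at each voxel, the signed distance to the nearest surface (positive outside, negative inside), truncated to a fixed band around the surface. The following minimal sketch (a generic illustration, not the authors' system; `make_tsdf` is a hypothetical name) builds such a grid for a sphere:

```python
import math

def make_tsdf(n, center, radius, trunc):
    """Voxel grid of truncated signed distances to a sphere's surface.
    Positive outside the sphere, negative inside, clamped to [-trunc, trunc]."""
    grid = {}
    for i in range(n):
        for j in range(n):
            for k in range(n):
                d = math.dist((i, j, k), center) - radius  # signed distance
                grid[(i, j, k)] = max(-trunc, min(trunc, d))
    return grid

# A sphere of radius 5 centered in a 16^3 grid, truncated at distance 2.
tsdf = make_tsdf(16, (8, 8, 8), 5.0, 2.0)
```

Direct tracking methods align incoming depth measurements against this implicit surface (the zero level set) instead of against explicit point correspondences.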
To this end, we present the Grasp Quality Spatial Transformer Network (GQ-STN), a one-shot grasp detection network.
However, without a large-scale comparison of outlier-filtering solutions, selecting an appropriate algorithm for a given application remains tedious.
The fusion of Iterative Closest Point (ICP) registrations in existing state estimation frameworks relies on an accurate estimation of their uncertainty.
We then show that the performance of the detector can be substantially improved by using a small set of weakly annotated real images, where a human provides only a list of the objects present in each image, without indicating their locations.
Tree species identification using bark images is a challenging problem that could prove useful for many forestry-related tasks.
We show that CNNs connected through our Deep Collaboration scheme obtain better accuracy on facial landmark detection when trained jointly with related tasks.
The ability to grasp ordinary and potentially never-seen objects is an important feature in both domestic and industrial robotics.
Object recognition is an important task for improving the ability of visual systems to perform complex scene understanding.
We applied our technique to American Sign Language fingerspelling classification using a Deep Belief Network, for which our feature extraction technique is tailored.