This motivates the use of commodity sensors like cameras for data collection, which are an order of magnitude cheaper than HD sensor suites, but offer lower fidelity.
Despite the numerous successes of machine learning over the past decade (image recognition, decision-making, NLP, image synthesis), self-driving technology has not yet followed the same trend.
We train our system directly on 1,000 hours of driving logs and measure realism and reactivity as the two key properties of the simulation.
If cheaper sensors could be used for collection instead, data availability would increase, which is crucial in a field where data requirements are vast but availability is limited.
We demonstrate our method for the task of visual localization of a query image within a database of images with known pose.
In this paper we present the first published end-to-end production computer-vision system for powering city-scale shared augmented reality experiences on mobile devices.
We propose a motion-based method to discover the physical parts of an articulated object class (e.g. head/torso/leg of a horse) from multiple videos.
Given unstructured videos of deformable objects, we automatically recover spatiotemporal correspondences to map one object to another (such as animals in the wild).
We propose an unsupervised approach for discovering characteristic motion patterns in videos of highly articulated objects performing natural, unscripted behaviors, such as tigers in the wild.
For example, we model a chair as a set of four legs, a seat and a backrest.