To create our dataset, we leverage a large repository of synthetic scenes created by professional artists, and we generate 77,400 images of 461 indoor scenes with detailed per-pixel labels and corresponding ground truth geometry.
We present a dataset of large-scale indoor spaces that provides a variety of mutually registered modalities from 2D, 2.5D and 3D domains, with instance-level semantic and geometric annotations.
In this paper, we propose a novel neural-network-based architecture, Graph Location Networks (GLN), to perform infrastructure-free, multi-view image-based indoor localization.
The key insight is to decouple instances from a coarsely completed semantic scene, rather than from the raw input image, and use them to guide the reconstruction of both the individual instances and the overall scene.
An important question in task transfer learning is how to determine task transferability, i.e., given a common input domain, to estimate to what extent representations learned from a source task can help in learning a target task.
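As a toy illustration of this notion (not the method of any particular paper), one can compare how well a cheap linear probe learns the target task from a source-derived representation versus from raw inputs of equal dimensionality. Everything below is a synthetic, hypothetical setup: the data, the PCA stand-in for a "source representation", and the train-set probe accuracy are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical common input domain: observations share low-dimensional
# latent structure, and the target labels depend on that structure.
latent = rng.normal(size=(600, 4))                    # shared factors
mixing = rng.normal(size=(4, 32))
X = latent @ mixing + 0.5 * rng.normal(size=(600, 32))
y = latent @ np.array([1.0, -1.0, 0.5, 2.0]) > 0      # target labels

# Source-task stand-in: principal components of X, playing the role of
# features "learned" on an upstream task.
Xc = X - X.mean(0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:4].T                                     # 4-dim source features

def probe_acc(F, labels):
    """Train-set accuracy of a least-squares linear probe -- a crude
    proxy for 'how learnable is the target from these features'."""
    w, *_ = np.linalg.lstsq(F, labels * 2.0 - 1.0, rcond=None)
    return float(((F @ w > 0) == labels).mean())

transfer = probe_acc(Z, y)          # target learned from source features
scratch = probe_acc(Xc[:, :4], y)   # equal-capacity raw-feature baseline
print(f"transfer={transfer:.2f} scratch={scratch:.2f}")
```

The gap between the two probe accuracies serves as a rough transferability score; real transferability estimators replace the linear probe with task-specific readout networks and held-out evaluation.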
In our approach we jointly estimate the number of functional units, their spatial structure, and their corresponding labels by using reversible jump MCMC (rjMCMC), a method well suited for optimization on spaces of varying dimension (here, the number of structural elements).