We present Neural Mixtures of Planar Experts (NeurMiPs), a novel planar-based scene representation for modeling geometry and appearance.
In this paper, we propose Bundle-Adjusting Neural Radiance Fields (BARF) for training NeRF from imperfect (or even unknown) camera poses -- the joint problem of learning neural 3D representations and registering camera frames.
Specifically, at each iteration, the neural network takes the feedback as input and outputs an update on the current estimation.
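The iterative feedback loop described above can be sketched as follows. This is a minimal illustration under assumed names (`compute_feedback`, `update_net` are hypothetical stand-ins, not the paper's actual components):

```python
# Sketch of an iterative feedback-refinement loop (hypothetical names;
# the actual network and feedback signal are not specified in the text).
def refine(estimate, compute_feedback, update_net, num_iters=5):
    """Repeatedly map feedback -> update and apply it to the estimate."""
    for _ in range(num_iters):
        feedback = compute_feedback(estimate)   # e.g. a residual/error signal
        delta = update_net(feedback)            # network predicts an update
        estimate = estimate + delta             # apply additive update
    return estimate

# Toy usage: drive a scalar estimate toward a target of 10.0.
target = 10.0
result = refine(
    0.0,
    compute_feedback=lambda e: target - e,  # feedback = signed error
    update_net=lambda f: 0.5 * f,           # "network": take half the error
)
# After 5 iterations the estimate is 10 * (1 - 0.5**5) = 9.6875.
```

The key structural point is that the network never predicts the answer directly; it predicts a correction conditioned on feedback about the current estimate.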
Reconstructing high-quality 3D objects from sparse, partial observations from a single view is of crucial importance for various applications in computer vision, robotics, and graphics.
Constructing and animating humans is an important component for building virtual worlds in a wide variety of applications such as virtual reality or robotics testing in simulation.
Standard convolutional neural networks assume a grid-structured input is available and exploit discrete convolutions as their fundamental building blocks.
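To make the "discrete convolution on a grid" building block concrete, here is a minimal pure-Python sketch of a 2D valid convolution (in the cross-correlation form used by standard CNN layers); it is illustrative only, not any particular library's implementation:

```python
# Minimal 2D discrete convolution (valid padding, cross-correlation form)
# over a grid-structured input, in pure Python.
def conv2d_valid(image, kernel):
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):          # slide vertically
        row = []
        for j in range(iw - kw + 1):      # slide horizontally
            row.append(sum(
                image[i + u][j + v] * kernel[u][v]
                for u in range(kh) for v in range(kw)
            ))
        out.append(row)
    return out

# A 3x3 box filter over a 4x4 grid yields a 2x2 output.
img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
out = conv2d_valid(img, box)  # [[54, 63], [90, 99]]
```

The regular grid is what makes the fixed-offset indexing `image[i + u][j + v]` well defined; point clouds and other irregular inputs lack exactly this structure.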
Towards this goal, in this paper we propose a bottom-up approach where, given a single click for each object in a video, we obtain the segmentation masks of these objects in the full video.
One of the fundamental challenges in scaling self-driving is creating accurate high-definition maps (HD maps) at low cost.
Creating high-definition maps that contain precise information about the static elements of the scene is of utmost importance for enabling self-driving cars to drive safely.
We then incorporate the reconstructed pedestrian assets bank in a realistic LiDAR simulation system by performing motion retargeting, and show that the simulated LiDAR data can be used to significantly reduce the amount of annotated real-world data required for visual perception tasks.
3D shape completion for real data is important but challenging, since partial point clouds acquired by real-world sensors are usually sparse, noisy and unaligned.
We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.
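The quantity being modeled here, the conditional entropy between frames, bounds the bitrate: a frame that is predictable from the previous one costs few bits. A toy empirical estimate over co-located pixel values sketches the idea (illustrative only; the actual framework uses a learned conditional model, not frequency counts):

```python
# H(curr | prev) in bits/symbol, estimated from co-located value pairs.
# A codec that models this conditional distribution well needs roughly
# this many bits per symbol to encode the current frame.
from collections import Counter
from math import log2

def conditional_entropy(prev, curr):
    """Empirical conditional entropy of curr given prev (bits/symbol)."""
    n = len(prev)
    joint = Counter(zip(prev, curr))   # counts of (prev_value, curr_value)
    marg = Counter(prev)               # counts of prev_value alone
    return -sum((c / n) * log2(c / marg[p]) for (p, _), c in joint.items())

identical = conditional_entropy([0, 0, 1, 1], [0, 0, 1, 1])    # 0.0 bits
independent = conditional_entropy([0, 0, 1, 1], [0, 1, 0, 1])  # 1.0 bit
```

An identical frame costs nothing given its predecessor, while a statistically independent frame costs its full entropy, which is why modeling only the conditional distribution can suffice for compression.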
Obtaining precise instance segmentation masks is of high importance in many modern applications such as robotic manipulation and autonomous driving.
We first utilize ray casting over the 3D scene and then use a deep neural network to produce deviations from the physics-based simulation, producing realistic LiDAR point clouds.
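The two-stage structure above, physics-based ray casting followed by learned per-ray deviations, can be sketched as follows. All names here are illustrative stand-ins (a wall-intersection ray caster and a lambda in place of the trained deviation network), not the paper's actual pipeline:

```python
import math

def cast_ray(origin, direction, wall_x=10.0):
    """Stage 1 (physics): ideal range to a vertical wall at x = wall_x."""
    ox = origin[0]
    dx = direction[0]
    return (wall_x - ox) / dx if dx > 0 else float("inf")

def apply_deviation(ranges, deviation_model):
    """Stage 2 (learned correction): perturb each ideal range."""
    return [r + deviation_model(r) for r in ranges]

# Cast three rays from the origin at small angular offsets.
rays = [(math.cos(a), math.sin(a)) for a in (0.0, 0.1, 0.2)]
ideal = [cast_ray((0.0, 0.0), d) for d in rays]
# Stand-in for the trained network: a small range-dependent bias.
realistic = apply_deviation(ideal, lambda r: 0.01 * r)
```

The design choice is that the simulator stays physically grounded while the network only has to learn the residual between simulation and real sensor data, a much easier target than generating point clouds from scratch.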
In this paper, we propose PolyTransform, a novel instance segmentation algorithm that produces precise, geometry-preserving masks by combining the strengths of prevailing segmentation approaches and modern polygon-based methods.
Our goal is to significantly speed up the runtime of current state-of-the-art stereo algorithms to enable real-time inference.
8 Aug 2019 — Wei-Chiu Ma, Ignacio Tartavull, Ioan Andrei Bârsan, Shenlong Wang, Min Bai, Gellert Mattyus, Namdar Homayounfar, Shrinidhi Kowshika Lakshmikanth, Andrei Pokrovsky, Raquel Urtasun
In this paper we propose a novel semantic localization algorithm that exploits multiple sensors and has precision on the order of a few centimeters.
On the other hand, 3D convolution wastes a large amount of memory on mostly unoccupied 3D space, since the occupied region consists only of the surface visible to the sensor.
At inference time, our model can be easily reduced to a single stream module that performs intrinsic decomposition on a single input image.
In this paper we present a robust, efficient, and affordable approach to self-localization which requires neither GPS nor knowledge about the appearance of the world.
We develop predictive models of pedestrian dynamics by encoding the coupled nature of multi-pedestrian interaction using game theory, and we use deep-learning-based visual analysis to estimate person-specific behavior parameters.
Furthermore, we develop a hierarchical extension to the DPP clustering algorithm and show that it can be used to discover appearance-based grasp taxonomies.