Our analysis identifies the core of these challenges as the interaction among noise levels in the 2D diffusion process, the architecture of the diffusion network, and the 3D model representation.
In contrast, while scene-specific optimization techniques exist for dynamic scenes, to the best of our knowledge there is currently no generalized method for dynamic novel view synthesis from a given monocular video.
Recent works on 3D reconstruction from posed images have demonstrated that direct inference of scene-level 3D geometry without test-time optimization is feasible using deep neural networks, showing remarkable promise and high efficiency.
HyperDiffusion operates directly on MLP weights and generates new neural implicit fields encoded by synthesized MLP parameters.
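To make the weight-space idea concrete, here is a minimal sketch of treating an MLP's parameters as a single flat vector that a diffusion model could train on and sample; `make_mlp`, `flatten_mlp`, and `unflatten_mlp` are hypothetical helpers, and the diffusion model itself is stubbed out, so this is an illustration of the representation rather than HyperDiffusion's actual implementation.

```python
import copy
import torch
import torch.nn as nn

# Hypothetical coordinate MLP whose weights serve as one diffusion data point.
def make_mlp(hidden=32):
    return nn.Sequential(
        nn.Linear(3, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, 1),  # e.g., a signed-distance value per 3D point
    )

def flatten_mlp(mlp):
    """Flatten all MLP parameters into one vector (the diffusion sample)."""
    return torch.cat([p.detach().reshape(-1) for p in mlp.parameters()])

def unflatten_mlp(vec, template):
    """Load a synthesized weight vector back into an MLP of the same shape."""
    mlp, offset = copy.deepcopy(template), 0
    with torch.no_grad():
        for p in mlp.parameters():
            n = p.numel()
            p.copy_(vec[offset:offset + n].view_as(p))
            offset += n
    return mlp

template = make_mlp()
theta = flatten_mlp(template)        # one training sample for the diffusion model
# ... a diffusion model trained on many such vectors would emit `theta_new` ...
theta_new = theta + 0.01 * torch.randn_like(theta)   # stand-in for a sampled vector
field = unflatten_mlp(theta_new, template)
values = field(torch.rand(128, 3))   # query the synthesized implicit field
```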
Texture cues on 3D objects are key to compelling visual representations, offering the potential for high visual fidelity with inherent spatial consistency across different views.
3D reconstruction of large scenes is a challenging problem due to the high complexity of the solution space, particularly for generative neural networks.
In this paper, we address the problem of fast depth estimation on embedded systems.
In this work, we present new theoretical results on convolutional generative neural networks, in particular their invertibility (i.e., recovering the input latent code given the network output).
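The results themselves are theoretical, but the recovery problem has a simple empirical counterpart: minimize the reconstruction error between the generator's output and the observation over the latent code. The sketch below uses a toy transposed-convolution generator as a stand-in for the networks analyzed; it is a generic gradient-based inversion, not the paper's recovery procedure or guarantees.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy convolutional generator (hypothetical stand-in for the analyzed networks).
G = nn.Sequential(
    nn.ConvTranspose2d(8, 16, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
)

z_true = torch.randn(1, 8, 4, 4)   # ground-truth latent code
y = G(z_true).detach()             # observed network output

# Recover a latent code by minimizing ||G(z) - y||^2 from a random start.
z = torch.randn(1, 8, 4, 4, requires_grad=True)
opt = torch.optim.Adam([z], lr=1e-2)
for step in range(2000):
    opt.zero_grad()
    loss = ((G(z) - y) ** 2).mean()
    loss.backward()
    opt.step()

print(f"reconstruction error: {loss.item():.2e}")
```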
Depth completion, the technique of estimating a dense depth image from sparse depth measurements, has a variety of applications in robotics and autonomous driving.
We consider the problem of dense depth prediction from a sparse set of depth measurements and a single RGB image.
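A common way to pose this task is to concatenate the RGB image with the sparse depth map and a validity mask as network input, and to supervise only on pixels where ground truth exists. The sketch below follows that convention with a hypothetical placeholder backbone; it illustrates the input/loss construction, not the architecture of any specific paper.

```python
import torch
import torch.nn as nn

def make_input(rgb, sparse_depth):
    """Stack RGB, sparse depth, and a validity mask into a 5-channel input.

    rgb:          (B, 3, H, W) image
    sparse_depth: (B, 1, H, W) depth with zeros at unmeasured pixels
    """
    valid = (sparse_depth > 0).float()
    return torch.cat([rgb, sparse_depth, valid], dim=1)

def masked_l1(pred, gt):
    """L1 loss over pixels where ground-truth depth exists."""
    mask = (gt > 0).float()
    return (mask * (pred - gt).abs()).sum() / mask.sum().clamp(min=1)

# Hypothetical dense-prediction backbone; any encoder-decoder works here.
net = nn.Sequential(
    nn.Conv2d(5, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)

rgb = torch.rand(2, 3, 64, 64)
gt = torch.rand(2, 1, 64, 64) + 0.1               # dense ground-truth depth
keep = (torch.rand(2, 1, 64, 64) < 0.05).float()  # ~5% sampled measurements
pred = net(make_input(rgb, gt * keep))
loss = masked_l1(pred, gt)
loss.backward()
```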
We address the following question: is it possible to reconstruct the geometry of an unknown environment using sparse and incomplete depth measurements?
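One classical, non-learned baseline for this question is to keep the sparse measurements where they exist and ask the rest of the surface to be smooth, solved as a regularized least-squares problem. The sketch below, with a hypothetical `reconstruct` helper, minimizes a data term plus a Laplacian smoothness term on a synthetic depth map; it is a generic baseline for intuition, not the paper's method.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def reconstruct(sparse_depth, mask, lam=1.0):
    """Fill in dense depth d from sparse samples z by solving
    min_d ||M (d - z)||^2 + lam * ||L d||^2, with L the grid Laplacian."""
    H, W = sparse_depth.shape
    # 2D grid Laplacian from 1D second-difference operators (Kronecker sum).
    def lap1d(k):
        return sp.diags([1, -2, 1], [-1, 0, 1], shape=(k, k))
    L = sp.kron(sp.eye(H), lap1d(W)) + sp.kron(lap1d(H), sp.eye(W))
    M = sp.diags(mask.ravel().astype(float))  # selects measured pixels
    A = M + lam * (L.T @ L)                   # normal equations
    b = M @ sparse_depth.ravel()
    return spsolve(A.tocsc(), b).reshape(H, W)

# ~2% random measurements of a smooth synthetic surface.
H, W = 64, 64
yy, xx = np.mgrid[0:H, 0:W]
depth = 1.0 + 0.5 * np.sin(xx / 10.0) * np.cos(yy / 12.0)
mask = np.random.rand(H, W) < 0.02
dense = reconstruct(depth * mask, mask, lam=0.1)
print("RMSE:", np.sqrt(((dense - depth) ** 2).mean()))
```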