Due to the difficulty of collecting and annotating training data in this setting, we propose a physically based method to simulate the effect of snowfall on real clear-weather LiDAR point clouds.
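A minimal sketch of this kind of physically based augmentation is shown below, assuming a simple Beer-Lambert attenuation model and randomly injected snowflake returns; the function name, coefficients, and thresholds are illustrative placeholders, not the model proposed in the paper.

```python
import numpy as np

def simulate_snowfall(points, intensities, extinction=0.05, scatter_rate=0.002, rng=None):
    """Toy snowfall augmentation for a clear-weather LiDAR scan.

    points:       (N, 3) array of x, y, z returns in meters
    intensities:  (N,) array of return intensities
    extinction:   attenuation coefficient [1/m] (placeholder value)
    scatter_rate: expected fraction of beams producing snowflake returns
    """
    rng = np.random.default_rng() if rng is None else rng
    ranges = np.linalg.norm(points, axis=1)

    # Beer-Lambert-style two-way attenuation of the original returns.
    attenuated = intensities * np.exp(-2.0 * extinction * ranges)

    # Drop returns whose attenuated intensity falls below a detection threshold.
    keep = attenuated > 0.01
    points, attenuated, ranges = points[keep], attenuated[keep], ranges[keep]

    # Insert spurious near-range returns caused by back-scatter from snowflakes.
    n_scatter = rng.poisson(scatter_rate * len(points))
    directions = points / ranges[:, None]
    idx = rng.integers(0, len(points), n_scatter)
    scatter_range = rng.uniform(1.0, 15.0, n_scatter)   # snowflakes close to the sensor
    scatter_pts = directions[idx] * scatter_range[:, None]
    scatter_int = rng.uniform(0.0, 0.2, n_scatter)      # weak, noisy returns

    return (np.vstack([points, scatter_pts]),
            np.concatenate([attenuated, scatter_int]))
```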
Conventional ToF imaging methods estimate the depth by sending pulses of light into a scene and measuring the ToF of the first-arriving photons directly reflected from a scene surface without any temporal delay.
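For reference, if the first-returning pulse arrives after a round-trip time $\tau$, the recovered depth is

$$
d = \frac{c\,\tau}{2},
$$

where $c$ is the speed of light and the factor of two accounts for the path to the surface and back.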
Various defense methods have proposed image-to-image mapping methods, either including these perturbations in the training process or removing them in a preprocessing denoising step.
Gated cameras hold promise as an alternative to scanning LiDAR sensors, providing high-resolution 3D depth that is robust to back-scatter in fog, snow, and rain.
Modern smartphones can continuously stream multi-megapixel RGB images at 60 Hz, synchronized with high-quality 3D pose information and low-resolution LiDAR-driven depth estimates.
Holographic displays can generate light fields by dynamically modulating the wavefront of a coherent beam of light using a spatial light modulator, promising rich virtual and augmented reality applications.
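As a rough sketch of this principle (not any particular display's algorithm), the following propagates a phase-only SLM pattern to a target plane with the standard angular-spectrum method; the wavelength, pixel pitch, and propagation distance are arbitrary example values.

```python
import numpy as np

def propagate_slm_pattern(phase, wavelength=532e-9, pitch=8e-6, distance=0.1):
    """Propagate a phase-only SLM pattern to a target plane (angular-spectrum method)."""
    field = np.exp(1j * phase)                    # unit-amplitude coherent beam after the SLM
    h, w = field.shape
    fx = np.fft.fftfreq(w, d=pitch)               # spatial frequencies set by the SLM pixel pitch
    fy = np.fft.fftfreq(h, d=pitch)
    fxx, fyy = np.meshgrid(fx, fy)
    arg = 1.0 / wavelength**2 - fxx**2 - fyy**2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))   # evanescent components are dropped
    transfer = np.exp(1j * kz * distance)
    out = np.fft.ifft2(np.fft.fft2(field) * transfer)
    return np.abs(out) ** 2                       # intensity of the generated light field
```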
Traditionally, hardware ISP settings used by downstream vision modules have been chosen by domain experts.
This allows us, for the first time, to decompose scene light transport into temporal, spatial, and complete polarimetric dimensions that unveil scene properties hidden to conventional methods.
In this work, we close this gap and propose a computational imaging method for all-optical free-space correlation before photo-conversion that achieves micron-scale depth resolution robust to surface reflectance and ambient light, using conventional silicon intensity sensors.
Recent neural rendering methods have demonstrated accurate view interpolation by predicting volumetric density and color with a neural network.
We introduce Mask-ToF, a method to reduce flying pixels (FP) in time-of-flight (ToF) depth captures.
AutoBots can produce either the trajectory of one ego-agent or a distribution over the future trajectories for all agents in the scene.
Most of today's supervised imaging and vision approaches, however, rely on training data collected in the real world that is biased towards good weather conditions, with dense fog, snow, and heavy rain as outliers in these datasets.
Today's state-of-the-art methods for 3D object detection are based on lidar, stereo cameras, or monocular cameras.
Active stereo cameras that recover depth from structured light captures have become a cornerstone sensor modality for 3D scene reconstruction and understanding tasks across application domains.
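For a rectified setup, the triangulation behind these captures reduces to

$$
Z = \frac{f\,b}{d},
$$

where $f$ is the focal length in pixels, $b$ the baseline between the camera and the projector (or second camera), and $d$ the disparity of the matched structured-light feature.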
Recent implicit neural rendering methods have demonstrated that it is possible to learn accurate view synthesis for complex scenes by predicting their volumetric density and color supervised solely by a set of RGB images.
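The color of a camera ray in these methods is typically obtained by the standard discrete volume-rendering approximation over samples $i$ along the ray, with predicted densities $\sigma_i$ and colors $\mathbf{c}_i$:

$$
\hat{C}(\mathbf{r}) = \sum_{i} T_i \left(1 - e^{-\sigma_i \delta_i}\right)\mathbf{c}_i,
\qquad
T_i = \exp\Big(-\sum_{j<i} \sigma_j \delta_j\Big),
$$

where $\delta_i$ is the spacing between adjacent samples; minimizing the difference between $\hat{C}$ and the observed pixel colors is what allows supervision from RGB images alone.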
In this work, we depart from visible-wavelength approaches and demonstrate detection, classification, and tracking of hidden objects in large-scale dynamic environments using Doppler radars that can be manufactured at low-cost in series production.
This work introduces an evaluation benchmark for depth estimation and completion using high-resolution depth measurements with angular resolution of up to 25" (arcsecond), akin to a 50 megapixel camera with per-pixel depth available.
We present a fully automatic system to optimize the parameters of black-box hardware and software image processing pipelines according to any arbitrary (i.e., application-specific) metric.
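A minimal sketch of such black-box optimization is given below, using a generic derivative-free local random search rather than the solver used in this system; `run_pipeline`, `metric`, and all hyperparameters are placeholders.

```python
import numpy as np

def optimize_pipeline(run_pipeline, metric, bounds, iters=200, pop=16, rng=None):
    """Derivative-free random search over black-box pipeline parameters.

    run_pipeline(params) -> processed output for a fixed validation set
    metric(output)       -> scalar score (higher is better), e.g. a detection metric
    bounds               -> (P, 2) array of per-parameter [low, high] ranges
    """
    rng = np.random.default_rng() if rng is None else rng
    bounds = np.asarray(bounds, dtype=float)
    best_params = rng.uniform(bounds[:, 0], bounds[:, 1])
    best_score = metric(run_pipeline(best_params))
    step = 0.1 * (bounds[:, 1] - bounds[:, 0])

    for _ in range(iters):
        # Propose perturbed candidates around the current best (local random search).
        candidates = best_params + step * rng.standard_normal((pop, len(best_params)))
        candidates = np.clip(candidates, bounds[:, 0], bounds[:, 1])
        scores = [metric(run_pipeline(p)) for p in candidates]
        i = int(np.argmax(scores))
        if scores[i] > best_score:
            best_params, best_score = candidates[i], scores[i]
        else:
            step *= 0.9  # shrink the search radius when no candidate improves
    return best_params, best_score
```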
The fusion of multimodal sensor streams, such as camera, lidar, and radar measurements, plays a critical role in object detection for autonomous vehicles, which base their decision making on these inputs.
The proposed replacement for scanning lidar systems is real-time, handles back-scatter and provides dense depth at long ranges.
In this work, we address the lack of 3D understanding of generative neural networks by introducing a persistent 3D feature embedding for view synthesis.
Relying on consumer color image sensors, with high fill factor, high quantum efficiency and low read-out noise, we demonstrate high-fidelity color NLOS imaging for scene configurations previously tackled only with picosecond time resolution.
Current HDR acquisition techniques are based on either (i) fusing multibracketed, low dynamic range (LDR) images, (ii) modifying existing hardware and capturing different exposures simultaneously with multiple sensors, or (iii) reconstructing a single image with spatially-varying pixel exposures.
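For category (i), a simplified textbook-style radiance merge over a bracketed LDR stack might look as follows, assuming a linear sensor response and known exposure times (an illustration, not the acquisition method discussed here):

```python
import numpy as np

def merge_exposures(ldr_stack, exposure_times, eps=1e-6):
    """Merge a bracketed LDR stack into an HDR radiance map.

    ldr_stack:      (K, H, W) linear images normalized to [0, 1]
    exposure_times: (K,) exposure times in seconds
    """
    ldr = np.asarray(ldr_stack, dtype=np.float64)
    t = np.asarray(exposure_times, dtype=np.float64).reshape(-1, 1, 1)

    # Trust mid-tone pixels most; down-weight under- and over-exposed ones.
    weights = 1.0 - (2.0 * ldr - 1.0) ** 4

    # Each exposure gives a radiance estimate L ~ pixel / exposure_time.
    radiance = ldr / t
    hdr = np.sum(weights * radiance, axis=0) / (np.sum(weights, axis=0) + eps)
    return hdr
```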
Imaging objects obscured by occluders is a significant challenge for many applications.
Convolutional sparse coding (CSC) is a promising direction for unsupervised learning in computer vision.
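CSC replaces patch-wise sparse coding with a set of filters $\mathbf{d}_k$ convolved with sparse feature maps $\mathbf{z}_k$ that reconstruct the whole image $\mathbf{x}$, typically by solving

$$
\min_{\{\mathbf{d}_k\},\{\mathbf{z}_k\}}\ \frac{1}{2}\Big\|\mathbf{x} - \sum_{k=1}^{K} \mathbf{d}_k * \mathbf{z}_k\Big\|_2^2 + \beta \sum_{k=1}^{K} \|\mathbf{z}_k\|_1
\quad \text{s.t.}\quad \|\mathbf{d}_k\|_2^2 \le 1,
$$

where $\beta$ controls the sparsity of the feature maps.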
Computer vision algorithms build on 2D images or 3D videos that capture dynamic events at the millisecond time scale.
A broad class of problems at the core of computational imaging, sensing, and low-level computer vision reduces to the inverse problem of extracting latent images that follow a prior distribution, from measurements taken under a known physical image formation model.
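With measurements $\mathbf{y}$, a forward operator $\mathbf{A}$, and a prior expressed through a regularizer $R$, the canonical form of this inverse problem is the maximum-a-posteriori estimate

$$
\hat{\mathbf{x}} = \arg\min_{\mathbf{x}}\ \frac{1}{2}\|\mathbf{A}\mathbf{x} - \mathbf{y}\|_2^2 + \lambda R(\mathbf{x}),
$$

where the quadratic data term corresponds to Gaussian measurement noise and $\lambda$ trades off data fidelity against the prior.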
Computational photography encompasses a diversity of imaging techniques, but one of the core operations performed by many of them is to compute image differences.
Recently, several discriminative learning approaches have been proposed for effective image restoration, achieving a convincing trade-off between image quality and computational efficiency.
As such, conventional imaging involves processing the RAW sensor measurements in a sequential pipeline of steps, such as demosaicking, denoising, deblurring, tone-mapping and compression.
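Schematically, such a pipeline is a fixed composition of stages applied in order to the RAW frame; the stage implementations below are trivial stand-ins meant only to illustrate the sequential structure.

```python
import numpy as np

# Placeholder stages; real ISPs implement each of these with far more care.
def demosaick(raw):  return np.repeat(raw[..., None], 3, axis=-1)      # naive: replicate mosaic to RGB
def denoise(img):    return img                                        # stub
def deblur(img):     return img                                        # stub
def tone_map(img):   return img / (1.0 + img)                          # simple global operator
def compress(img):   return np.clip(img, 0.0, 1.0).astype(np.float32)  # stand-in for encoding

def run_isp(raw, stages=(demosaick, denoise, deblur, tone_map, compress)):
    """Conventional imaging: push the RAW frame through a fixed sequence of stages."""
    image = raw
    for stage in stages:
        image = stage(image)   # each stage only sees the previous stage's output
    return image
```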
We propose a material classification method using raw time-of-flight (ToF) measurements.
Continuous-wave time-of-flight (ToF) cameras show great promise as low-cost depth image sensors in mobile applications.
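These cameras recover depth from the phase shift $\varphi$ that an amplitude-modulated signal of frequency $f$ accumulates over the round trip,

$$
d = \frac{c\,\varphi}{4\pi f},
$$

which is unambiguous only up to the range $c/(2f)$; $\varphi$ is typically estimated from a few samples of the correlation between the emitted and received signals.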
The functional difference between a diffuse wall and a mirror is well understood: one scatters incident light back into all directions, while the other preserves the directionality of the reflected light.
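In terms of geometry, a mirror maps an incident direction $\boldsymbol{\omega}_i$ at a surface with normal $\mathbf{n}$ to the single outgoing direction

$$
\boldsymbol{\omega}_o = \boldsymbol{\omega}_i - 2\,(\boldsymbol{\omega}_i \cdot \mathbf{n})\,\mathbf{n},
$$

whereas an ideal diffuse (Lambertian) wall reflects the same light equally into all directions of the hemisphere, discarding the directional information.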