We address weakly supervised point cloud segmentation by proposing a new model, MIL-derived transformer, to mine additional supervisory signals.
Existing video stabilization methods often generate visible distortion or require aggressive cropping of frame boundaries, resulting in smaller field of views.
We formulate this task as an object point sampling problem, and develop two techniques, the mutual attention module and co-contrastive learning, to enable it.
We present a learning-based approach for removing unwanted obstructions, such as window reflections, fence occlusions, or adherent raindrops, from a short sequence of images captured by a moving camera.
For addressing this issue, this paper leverages domain-specific mappings for remapping latent features in the shared content space to domain-specific content spaces.
Modern cameras have limited dynamic ranges and often produce images with saturated or dark regions using a single exposure.
We model the HDRto-LDR image formation pipeline as the (1) dynamic range clipping, (2) non-linear mapping from a camera response function, and (3) quantization.
We present a learning-based approach for removing unwanted obstructions, such as window reflections, fence occlusions or raindrops, from a short sequence of images captured by a moving camera.
This paper introduces a novel deep network for estimating depth maps from a light field image.
Ranked #1 on Depth Estimation on 4D Light Field Dataset
This paper presents a weakly supervised instance segmentation method that consumes training data with tight bounding box annotations.
To demonstrate the illumination issue and to evaluate our model, we construct two large-scale simulated datasets with a wide range of illumination variations.
In this paper, we address co-saliency detection in a set of images jointly covering objects of a specific class by an unsupervised convolutional neural network (CNN).
The proposed models and the thorough experiments together demonstrate that CNN is an effective and versatile tool for solving the demosaicing problem.
This paper presents the DeepCD framework which learns a pair of complementary descriptors jointly for a patch by employing deep learning techniques.
We tackle the three issues by introducing a new network layer, called co-occurrence layer.
Experiments on popular benchmarks demonstrate the effectiveness of our descriptors and their superiority to the state-of-the-art descriptors.
First, the performance of descriptor-based approaches to image alignment relies on the chosen descriptor, but the optimal descriptor typically varies from image to image, or even pixel to pixel.