In contrast to closed-set scenarios that restore images from a predefined set of degradations, open-set image restoration aims to handle unknown degradations that were unforeseen during the pretraining phase, a setting that, as far as we know, has received little attention.
Existing point cloud completion methods tend to generate global shape skeletons and hence lack fine local details.
Our method, which we call ABLE-NeRF, significantly reduces 'blurry' glossy surfaces in rendering and produces realistic translucent surfaces, which are lacking in prior art.
Vision Transformers have shown promising performance in image restoration, where they usually conduct window- or channel-based attention to avoid intensive computation.
We propose IntegratedPIFu, a new pixel-aligned implicit model that builds on the foundation set by PIFuHD.
In a point cloud sequence, 3D object tracking aims to predict the location and orientation of an object in consecutive frames given an object template.
In a point cloud sequence, 3D object tracking aims to predict the location and orientation of an object in the current search point cloud given a template point cloud.
Specifically, we contribute GTA-Human, a large-scale 3D human dataset generated with the GTA-V game engine, featuring a highly diverse set of subjects, actions, and scenarios.
Generating an interpretable and compact representation of 3D shapes from point clouds is an important and challenging problem.
In addition, existing 3D domain adaptive detection methods often assume prior access to the target domain annotations, which is rarely feasible in the real world.
By comprehensively investigating these GE-ViTs and comparing them with their corresponding CNN models, we observe: 1) For the enhanced model, larger ViTs still benefit OOD generalization more.
In contrast to previous fully supervised approaches, in this paper we present ShapeInversion, which introduces Generative Adversarial Network (GAN) inversion to shape completion for the first time.
In particular, we propose a dual-path architecture to enable principled probabilistic modeling across partial and complete clouds.
Ranked #2 on Point Cloud Completion on Completion3D
Person Re-Identification (Re-ID) is of great importance to many video surveillance systems.
We observe that during training, the relationship proposal distribution is highly imbalanced: most negative relationship proposals are easy to identify, e.g., those arising from inaccurate object detection, which leads to under-fitting of low-frequency difficult proposals.
Panoptic segmentation aims at generating class and instance predictions for each pixel in the input image, which is a challenging task and far more complicated than naively fusing the semantic and instance segmentation results.
Ranked #11 on Panoptic Segmentation on COCO test-dev
To alleviate the resource constraint for real-time point cloud applications that run on edge devices, in this paper we present BiPointNet, the first model binarization approach for efficient deep learning on point clouds.
We present McAssoc, a deep learning approach to the association of detection bounding boxes in different views of a multi-camera system.
We present an interesting and challenging dataset that features a large number of scenes with messy tables captured from multiple camera views.
In our experiments, we demonstrate that Balanced Meta-Softmax outperforms state-of-the-art long-tailed classification solutions on both visual recognition and instance segmentation tasks.
Ranked #6 on Long-tail Learning on CIFAR-10-LT (ρ=10)
Since LiDARs became prevalent in autonomous driving, tremendous improvements have been made to learning on point clouds.
In this paper, we present STAR, a Spatio-Temporal grAph tRansformer framework, which tackles trajectory prediction using only attention mechanisms.
Dot-product attention has wide applications in computer vision and natural language processing.
Ranked #2 on Extractive Text Summarization on GovReport
Our proposed FD-GAN achieves state-of-the-art performance on three person reID datasets, which demonstrates its effectiveness and robust feature distilling capability.
Ranked #3 on Person Re-Identification on CUHK03
Pedestrian analysis plays a vital role in intelligent video surveillance and is a key component for security-centric computer vision systems.
Ranked #2 on Pedestrian Attribute Recognition on RAP
The proposed framework contains a hierarchical deep architecture, including a frame-level sequence modeling part and a video-level classification part.
Person re-identification (ReID) is an important task in video surveillance and has various applications.