The vast majority of modern consumer-grade cameras employ a rolling shutter mechanism, leading to image distortions if the camera moves during image acquisition.
Learning to find an optimal mixed precision model that can preserve accuracy and satisfy the specific constraints on model size and computation is extremely challenge due to the difficult in training a mixed precision model and the huge space of all possible bit quantizations.
In this paper, we present a novel linear algorithm to estimate the 6 DoF relative pose from consecutive frames of stereo rolling shutter (RS) cameras.
Visual attention has proven to be effective in improving the performance of person re-identification.
In this paper, we begin by first analyzing the design defects of feature pyramid in FPN, and then introduce a new feature pyramid architecture named AugFPN to address these problems.
To address these problems, we propose an Adaptive Semantic Guidance Network (ASGN), which instantiates the whole video semantics to different POS-aware semantics with the supervision of part of speech (POS) tag.
Point cloud processing is very challenging, as the diverse shapes formed by irregular points are often indistinguishable.
Ranked #7 on 3D Part Segmentation on ShapeNet-Part
Specifically, the convolutional weight for local point set is forced to learn a high-level relation expression from predefined geometric priors, between a sampled point from this point set and the others.
Ranked #14 on 3D Part Segmentation on ShapeNet-Part (Instance Average IoU metric)
Despite the fact that Second Order Similarity (SOS) has been used with significant success in tasks such as graph matching and clustering, it has not been exploited for learning local descriptors.
Instead of relying on optical flow, this paper proposes a novel module called Progressive Sparse Local Attention (PSLA), which establishes the spatial correspondence between features across frames in a local region with progressively sparser stride and uses the correspondence to propagate features.
Specifically, for confusing manmade objects, ScasNet improves the labeling coherence with sequential global-to-local contexts aggregation.
To obtain a comprehensive evaluation, we choose to include both float type features and binary ones.
Binary features have been incrementally popular in the past few years due to their low memory footprints and the efficient computation of Hamming distance between binary descriptors.
Subset selection from massive data with noised information is increasingly popular for various applications.
Ranked #6 on Named Entity Recognition on SciERC (using extra training data)
Based on this observation, we exploit a learning-based sparsity method to simultaneously learn the HU results and a sparse guidance map.
With this constraint, our method can learn a compact space, where highly similar pixels are grouped to share correlated sparse representations.
Hyperspectral unmixing, the process of estimating a common set of spectral bases and their corresponding composite percentages at each pixel, is an important task for hyperspectral analysis, visualization and understanding.