The critical challenge of Semi-Supervised Learning (SSL) is how to effectively leverage the limited labeled data and massive unlabeled data to improve the model's generalization performance.
In this work, we propose a Physics-Informed Deep Diffusion magnetic resonance imaging (DWI) reconstruction method (PIDD).
Moreover, the partial-observation management policies are directly deployable in the real world as they use readily available information.
Specifically, a channel separation-aggregation (CSA) structure is designed to reduce the complexity of stacked separable convolutions, and a dynamic receptive field (DRF) mechanism is developed to maintain high accuracy as the network complexity is reduced, by dynamically customizing the convolution kernel and its perception range.
However, the inconsistent features for the localization and classification tasks in AOOD models may lead to ambiguity and low-quality object predictions, which constrains the detection performance.
Text information, which carries extensive prior knowledge about land cover classes, has been ignored in hyperspectral image (HSI) classification tasks.
Currently, cross-scene hyperspectral image (HSI) classification has drawn increasing attention.
Beyond classification, Conv-Adapter can generalize to detection and segmentation tasks with more than 50% reduction of parameters but comparable performance to the traditional full fine-tuning.
1 code implementation • 12 Aug 2022 • Yidong Wang, Hao Chen, Yue Fan, Wang Sun, Ran Tao, Wenxin Hou, RenJie Wang, Linyi Yang, Zhi Zhou, Lan-Zhe Guo, Heli Qi, Zhen Wu, Yu-Feng Li, Satoshi Nakamura, Wei Ye, Marios Savvides, Bhiksha Raj, Takahiro Shinozaki, Bernt Schiele, Jindong Wang, Xing Xie, Yue Zhang
We further provide the pre-trained versions of the state-of-the-art neural models for CV tasks to make the cost affordable for further tuning.
To this end, a multi-patch attention network (MPANet) based on the axial-attention encoder and the multi-scale patch branch (MSPB) structure is proposed.
Nitrogen (N) management is critical to sustain soil fertility and crop production while minimizing the negative environmental impact, but is challenging to optimize.
Class-agnostic bias is defined as the distribution shift introduced by domain difference; we propose a Distribution Calibration Module (DCM) to reduce it.
Considering the characteristics and differences of multi-source remote sensing images, a feature-based registration algorithm named Multi-scale Histogram of Local Main Orientation (MS-HLMO) is proposed.
Conclusion: The explicit phase model PAIR with complementary priors performs well on challenging reconstructions under inter-shot motion and a low signal-to-noise ratio.
Spatial-query-by-sketch is an intuitive tool to explore human spatial knowledge about geographic environments and to support communication with scene database queries.
Owing to their effective and flexible data acquisition, unmanned aerial vehicles (UAVs) have recently become a hotspot across the fields of computer vision (CV) and remote sensing (RS).
Specifically, an anchor-free object-adaptation label assignment (OLA) strategy is presented to define the positive candidates based on two-dimensional (2-D) oriented Gaussian heatmaps, which reflect the shape and direction features of arbitrary-oriented objects.
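A minimal sketch of how a 2-D oriented Gaussian heatmap can encode the shape and direction of a rotated box and yield positive candidates. This is an illustration only, not the OLA strategy itself: the function names, the `sigma_ratio` scaling, and the fixed `threshold` are hypothetical choices that the abstract does not specify.

```python
import math


def oriented_gaussian_heatmap(height, width, cx, cy, box_w, box_h, angle,
                              sigma_ratio=0.25):
    """Return a height x width grid of values in (0, 1], peaking at (cx, cy).

    `angle` is the box rotation in radians. `sigma_ratio` (an assumed
    hyperparameter) ties the Gaussian spread to the box size.
    """
    sigma_u = sigma_ratio * box_w  # spread along the box's width axis
    sigma_v = sigma_ratio * box_h  # spread along the box's height axis
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    heatmap = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Rotate the pixel offset into the box's own coordinate frame.
            u = cos_a * dx + sin_a * dy
            v = -sin_a * dx + cos_a * dy
            row.append(math.exp(-0.5 * ((u / sigma_u) ** 2 + (v / sigma_v) ** 2)))
        heatmap.append(row)
    return heatmap


def positive_candidates(heatmap, threshold=0.5):
    """Collect (x, y) locations whose heatmap value reaches the threshold."""
    return [(x, y)
            for y, row in enumerate(heatmap)
            for x, value in enumerate(row)
            if value >= threshold]


# Usage: a 16 x 4 box centered at (10, 10), first axis-aligned, then rotated 90 degrees.
flat = oriented_gaussian_heatmap(21, 21, 10, 10, 16, 4, 0.0)
upright = oriented_gaussian_heatmap(21, 21, 10, 10, 16, 4, math.pi / 2)
candidates = positive_candidates(flat)
```

Because the Gaussian is elongated along the box's long axis, thresholding it selects more candidates along that axis than across it, which is one way a heatmap can reflect both shape and direction.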
Ranked #24 on Object Detection In Aerial Images on DOTA
In this paper, a dynamic proximal unrolling network (dubbed DPUNet) is proposed, which can handle a variety of measurement matrices with a single model and without retraining.
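For context, proximal unrolling networks are built on iterative proximal-gradient schemes in which each iteration becomes one network stage. The sketch below shows the classical, non-learned version (ISTA with soft-thresholding) rather than DPUNet itself: the learned proximal operator is replaced by a fixed soft-threshold, and `num_stages`, `step`, and `tau` are assumed values chosen for illustration. The measurement matrix `A` is an argument, so the same routine runs on different matrices.

```python
def soft_threshold(values, tau):
    """Proximal operator of tau * L1 norm, applied element-wise."""
    return [max(abs(v) - tau, 0.0) * (1.0 if v > 0 else -1.0) for v in values]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


def unrolled_ista(A, y, num_stages=200, step=0.1, tau=0.01):
    """Run a fixed number of proximal-gradient stages for y ~ A x.

    Each stage is a gradient step on 0.5 * ||A x - y||^2 followed by
    soft-thresholding. `step` is assumed small enough for convergence
    (below 2 divided by the largest eigenvalue of A^T A).
    """
    At = transpose(A)
    x = [0.0] * len(A[0])
    for _ in range(num_stages):
        residual = [p - q for p, q in zip(matvec(A, x), y)]
        grad = matvec(At, residual)
        x = soft_threshold([xi - step * g for xi, g in zip(x, grad)], step * tau)
    return x


# Usage: the same routine applied to two different measurement matrices.
x_a = unrolled_ista([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [1.0, 0.0])
x_b = unrolled_ista([[0.5, 0.5]], [1.0])
```

In a learned unrolling network, the step size and the proximal operator of each stage would typically be trainable; here they are fixed so that the structure of one stage is easy to see.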
We propose a method to disentangle linear-encoded facial semantics from StyleGAN without external supervision.
To address the difference between the feature distributions of base and novel classes, we propose an adaptive feature distribution method that fine-tunes a single scale vector using the support set of the novel classes.
In addition, the generalization ability of Ms-AFt in dense remote sensing scenes is further verified on stereo aerial imagery of a large camping site.
To this end, we propose a novel and efficient framework for geospatial object detection in this letter, called Fourier-based rotation-invariant feature boosting (FRIFB).
Particularly, long short-term memory (LSTM), as a special deep learning structure, has shown great ability in modeling long-term dependencies in the time dimension of video or the spectral dimension of HSIs.
To this end, we propose a novel object detection framework, called optical remote sensing imagery detector (ORSIm detector), which integrates diverse channel feature extraction, feature learning, fast image pyramid matching, and a boosting strategy.
The generated orbit in the latent space records all the differences in pose in the original observational space, and as a result, the method is capable of finding subtle differences in pose.
We introduce the OxUvA dataset and benchmark for evaluating single-object tracking algorithms.
This paper introduces a novel anchor design to support anchor-based face detection for superior scale-invariant performance, especially on tiny faces.
This paper strives to track a target object in a video.
Ranked #14 on Referring Expression Segmentation on J-HMDB
Searching among instances from the same category as the query, the category-specific attributes outperform existing approaches by a large margin on shoes and cars and perform on par with the state-of-the-art on buildings.
In this paper we present a tracker, which is radically different from state-of-the-art trackers: we apply no model updating, no occlusion detection, no combination of trackers, no geometric matching, and still deliver state-of-the-art tracking performance, as demonstrated on the popular online tracking benchmark (OTB) and six very challenging YouTube videos.
This paper aims for generic instance search from one example where the instance can be an arbitrary 3D object like shoes, not just near-planar and one-sided instances like buildings and logos.