We then leverage degradation-aware visual prompts to establish a controllable and universal model for image restoration, called ProRes, which is applicable to an extensive range of image restoration tasks.
In this work, we propose a novel end-to-end wind power forecasting (WPF) model named Hierarchical Spatial-Temporal Transformer Network (HSTTN) to address long-term WPF problems.
Fortunately, we have identified two observations that help us achieve the best of both worlds: 1) query-based methods demonstrate superiority over dense proposal-based methods in open-world instance segmentation, and 2) learning localization cues is sufficient for open-world instance segmentation.
The LCE module uses a graph to model the global co-occurrence relationships among multiple labels and employs graph convolutional networks for learning and inference.
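As a rough illustration of the idea in the preceding excerpt, the sketch below builds a label co-occurrence graph from a binary label matrix and applies one graph convolution to per-label features. It is not the LCE module itself: the function names, the binary thresholding of co-occurrence counts, and the symmetric degree normalisation are all assumptions made for the example.

```python
import numpy as np


def cooccurrence_adjacency(labels: np.ndarray) -> np.ndarray:
    """Binary co-occurrence graph from a (samples x labels) 0/1 matrix.

    Assumption: two labels are connected if they appear together in at
    least one sample; every label also gets a self-loop.
    """
    counts = labels.T @ labels  # counts[i, j] = samples carrying both i and j
    adj = (counts > 0).astype(float)
    np.fill_diagonal(adj, 1.0)  # self-loops keep each label's own feature
    return adj


def gcn_layer(adj: np.ndarray, features: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """One graph convolution: ReLU(D^-1/2 A D^-1/2 X W)."""
    deg = adj.sum(axis=1)  # >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ features @ weight, 0.0)


if __name__ == "__main__":
    # Hypothetical toy data: 3 samples, 3 labels.
    toy_labels = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 1]])
    adj = cooccurrence_adjacency(toy_labels)
    label_features = np.eye(3)  # one-hot label embeddings
    weight = np.eye(3)          # identity weights, for readability only
    print(gcn_layer(adj, label_features, weight))
```

With identity features and weights the output is simply the normalised adjacency, which makes it easy to see how co-occurring labels share information.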
Conclusions: The experimental results demonstrate the effectiveness of the proposed MM-SFENet for the localization and classification of bladder cancer.
To obtain high-quality raw images for downstream image signal processing (ISP), we present an Efficient Locally Multiplicative Transformer, called ELMformer, for raw image restoration.
The invariance of illumination or inherent difference between two images is fully explored so as to make up for the lack of labels for nighttime images.
Off-the-shelf single-stage multi-person pose regression methods generally use the instance score (i.e., the confidence of the instance localization) to indicate pose quality when selecting pose candidates.
Multi-person pose estimation methods generally follow the top-down or bottom-up paradigm, both of which can be considered two-stage approaches and thus incur high computational cost and low efficiency.
Many of them address it by generating fake frontal faces from extreme ones, but they struggle to preserve identity information and suffer from high computational cost and uncontrolled disturbances.
The outputs from the teacher network are used as soft labels for supervising the training of a new network.
Ranked #20 on Knowledge Distillation on ImageNet
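The soft-label supervision described in the excerpt above can be sketched as follows. This is a minimal illustration rather than the paper's method: the KL-divergence form of the loss, the `temperature` parameter, and all function names are assumptions borrowed from common distillation practice.

```python
import numpy as np


def log_softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Numerically stable log-softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def distillation_loss(student_logits: np.ndarray,
                      teacher_logits: np.ndarray,
                      temperature: float = 1.0) -> float:
    """Mean KL(teacher || student), with teacher outputs as soft labels.

    Assumption: both networks emit raw logits of the same shape
    (batch x classes); temperature > 1 softens both distributions.
    """
    teacher_log_p = log_softmax(teacher_logits, temperature)
    student_log_p = log_softmax(student_logits, temperature)
    soft_labels = np.exp(teacher_log_p)  # the teacher's soft labels
    kl = (soft_labels * (teacher_log_p - student_log_p)).sum(axis=-1)
    return float(kl.mean())


if __name__ == "__main__":
    # Hypothetical logits for a batch of 2 samples and 3 classes.
    teacher = np.array([[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]])
    student = np.array([[1.0, 1.0, 0.0], [0.0, 0.5, 0.5]])
    print(distillation_loss(student, teacher, temperature=4.0))
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, which is the behaviour expected of a soft-label training signal.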
In this paper, we investigate the bias-variance tradeoff brought by distillation with soft labels.
The vast majority of research in computer-assisted medical coding focuses on coding at the document level, but a substantial proportion of medical coding in the real world involves coding at the level of clinical encounters, each of which is typically represented by a potentially large set of documents.
To improve the discriminative and generalization ability of lightweight networks for face recognition, we propose an efficient variable group convolutional network called VarGFaceNet.
Ranked #3 on Face Verification on CFP-FP
In this paper, we propose a novel network design mechanism for efficient embedded computing.
Ranked #5 on Face Verification on CFP-FP
Furthermore, due to the lack of high-resolution face manipulation databases to verify the effectiveness of our method, we collect a new high-quality Multi-View Face (MVF-HQ) database.
It generates hashing bits by the output neurons of a deep hashing network.
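To make the last excerpt concrete, the sketch below turns real-valued network outputs into hash bits and compares two codes by Hamming distance. The excerpt does not say how the outputs are binarised, so thresholding at zero is an assumption, as are the function names and the example values.

```python
import numpy as np


def to_hash_bits(outputs: np.ndarray) -> np.ndarray:
    """Binarise output-neuron activations: positive -> 1, otherwise 0.

    Assumption: one output neuron per hash bit, thresholded at zero.
    """
    return (outputs > 0).astype(np.uint8)


def hamming_distance(code_a: np.ndarray, code_b: np.ndarray) -> int:
    """Number of bit positions at which two hash codes differ."""
    return int(np.count_nonzero(code_a != code_b))


if __name__ == "__main__":
    # Hypothetical activations of a 4-bit hashing head for two images.
    outputs = np.array([[0.7, -1.2, 0.1, -0.3],
                        [0.5, 0.9, -0.4, -0.2]])
    codes = to_hash_bits(outputs)
    print(codes, hamming_distance(codes[0], codes[1]))
```

Retrieval with such codes typically ranks database items by Hamming distance to the query code, which is cheap to compute on binary vectors.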