However, jointly removing rain and haze from scene images is ill-posed and challenging, because the presence of haze and rain and changes in atmospheric light can both degrade the scene information.
In this paper, we present and study a new image segmentation task, called Generalized Open-set Semantic Segmentation (GOSS).
However, these methods often demand large-scale, high-quality counseling data, which are difficult to collect.
We present You Only Cut Once (YOCO) for performing data augmentations.
A principled way of achieving few-shot learning is to realize a model that can rapidly adapt to the context of a given task.
To this end, we formulate the metric as a weighted sum on the tangent bundle of the hyperbolic space and develop a mechanism to obtain the weights adaptively, based on the constellation of the points.
Inspired by those observations, we propose a novel visual saliency method, termed Target-Selective Gradient Backprop (TSGB), which leverages rectification operations to effectively emphasize target classes and further efficiently propagate the saliency to the image space, thereby generating target-selective and fine-grained saliency maps.
The core concept of GNNs is to find a representation by recursively aggregating the representations of a central node and those of its neighbors.
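The aggregation idea can be illustrated with a minimal sketch in plain Python. It assumes mean aggregation over an undirected toy graph; the function names (`aggregate_once`, `propagate`), the graph, and the feature values are hypothetical and chosen only for illustration, and real GNN layers typically add learned weights and a nonlinearity.

```python
def aggregate_once(features, neighbors):
    """One aggregation round: each node averages its own vector with its neighbors' vectors."""
    updated = {}
    for node, own in features.items():
        group = [own] + [features[n] for n in neighbors.get(node, [])]
        dim = len(own)
        updated[node] = [sum(vec[i] for vec in group) / len(group) for i in range(dim)]
    return updated


def propagate(features, neighbors, rounds=2):
    """Repeat the aggregation so information travels `rounds` hops through the graph."""
    for _ in range(rounds):
        features = aggregate_once(features, neighbors)
    return features


# Toy path graph a - b - c (assumed data, for illustration only).
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

one_hop = aggregate_once(features, neighbors)
two_hop = propagate(features, neighbors, rounds=2)
```

After one round, node `a` only reflects `b`; after two rounds it also carries information from `c`, which is the sense in which repeated aggregation widens each node's receptive field.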
We present and study a novel task named Blind Image Decomposition (BID), which requires separating a superimposed image into constituent underlying images in a blind setting, that is, both the source components involved in mixing and the mixing mechanism are unknown.
The current state-of-the-art methods are mainly based on the encoding paradigm called Cross-Encoder, which separately encodes each context-response pair and ranks the responses according to their fitness scores.
Few-shot learning aims to correctly recognize query samples from unseen classes given a limited number of support samples, often by relying on global embeddings of images.
Few-shot class incremental learning (FSCIL) portrays the problem of learning new concepts gradually, where only a few examples per concept are available to the learner.
In this paper, we propose addressing this problem using a mixture of subspaces.
However, working in hyperbolic spaces is not without difficulties as a result of their curved geometry (e.g., computing the Fréchet mean of a set of points requires an iterative algorithm).
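To make the iterative nature of the Fréchet mean concrete, here is a minimal sketch that assumes the Poincaré ball model with curvature -1 and a unit-step fixed-point (Karcher) iteration: map the points to the tangent space at the current estimate, average them there, and map the average back. The helper names (`mobius_add`, `log_map`, `exp_map`, `frechet_mean`) and the sample points are hypothetical; this is one common scheme, not necessarily the one used in any particular paper.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def mobius_add(x, y):
    """Mobius addition in the Poincare ball (curvature -1)."""
    xy, xx, yy = dot(x, y), dot(x, x), dot(y, y)
    denom = 1 + 2 * xy + xx * yy
    return [((1 + 2 * xy + yy) * a + (1 - xx) * b) / denom for a, b in zip(x, y)]


def log_map(x, y):
    """Tangent vector at x pointing toward y."""
    w = mobius_add([-a for a in x], y)
    n = norm(w)
    if n == 0.0:
        return [0.0] * len(x)
    scale = (1 - dot(x, x)) * math.atanh(n) / n  # (2 / lambda_x) * artanh(n) / n
    return [scale * a for a in w]


def exp_map(x, v):
    """Point reached by following the geodesic from x along tangent vector v."""
    n = norm(v)
    if n == 0.0:
        return list(x)
    lam = 2 / (1 - dot(x, x))  # conformal factor lambda_x
    step = [math.tanh(lam * n / 2) * a / n for a in v]
    return mobius_add(x, step)


def frechet_mean(points, iterations=50, tol=1e-12):
    """Iterate: average the log maps at the current estimate, then map back."""
    mean = list(points[0])
    for _ in range(iterations):
        logs = [log_map(mean, p) for p in points]
        avg = [sum(col) / len(points) for col in zip(*logs)]
        if norm(avg) < tol:  # tangent-space average vanishes at the mean
            break
        mean = exp_map(mean, avg)
    return mean


# Two points placed symmetrically about the origin (assumed data).
center = frechet_mean([[0.5, 0.0], [-0.5, 0.0]])
single = frechet_mean([[0.3, 0.2]])
```

For the symmetric pair the estimate lands at the origin up to floating-point error, while for general point sets no closed form exists and the loop has to run until the averaged tangent vector is negligible.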
Modern video person re-identification (re-ID) machines are often trained using a metric learning approach, supervised by a triplet loss.
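As a reminder of what the triplet loss computes, the following minimal sketch uses Euclidean distance on plain Python lists. The function names, the margin value of 0.3, and the embeddings are assumptions for illustration only; practical re-ID pipelines compute this over mini-batches of learned embeddings and often add hard-example mining.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Assumed toy embeddings: the positive shares the anchor's identity, the negative does not.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]

easy_loss = triplet_loss(anchor, positive, [3.0, 4.0])  # negative is already far away
hard_loss = triplet_loss(anchor, positive, [1.0, 0.0])  # negative is as close as the positive
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only triplets that violate this ordering contribute to training.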
Full attention, which generates an attention value per element of the input feature maps, has been successfully demonstrated to be beneficial in visual tasks.
Deep neural networks need to make robust inference in the presence of occlusion, background clutter, pose and viewpoint variations -- to name a few -- when the task of person re-identification is considered.
This paper investigates a novel Bilinear attention (Bi-attention) block, which discovers and uses second order statistical information in an input feature map, for the purpose of person retrieval.