Moreover, our model is trained in a non-adversarial manner, which makes training more stable than in adversarial-based novelty detection methods.
In this paper, we propose a novel contrastive regularization (CR), built upon contrastive learning, that exploits the information in both hazy and clear images by treating them as negative and positive samples, respectively.
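To make the role of the positive and negative samples concrete, the following is a minimal sketch of one way such a regularizer could be written. It assumes a ratio of mean L1 distances computed directly in pixel space; the distance metric, the feature space, and the names `contrastive_regularization`, `restored`, and `eps` are illustrative assumptions, not details stated in the text.

```python
import numpy as np

def contrastive_regularization(restored, clear, hazy, eps=1e-8):
    """Hypothetical CR term: distance to the positive over distance to the negative.

    Assumption: mean L1 distance in pixel space. Minimizing the ratio pulls the
    restored image toward the clear image and pushes it away from the hazy one.
    """
    d_pos = np.mean(np.abs(restored - clear))  # distance to positive sample
    d_neg = np.mean(np.abs(restored - hazy))   # distance to negative sample
    return d_pos / (d_neg + eps)

# Toy data: the "haze" here is just a blend with white, purely for illustration.
rng = np.random.default_rng(0)
clear = rng.random((8, 8, 3))
hazy = 0.5 * clear + 0.5

near_clear = 0.9 * clear + 0.1 * hazy  # restoration close to the clear image
near_hazy = 0.1 * clear + 0.9 * hazy   # restoration close to the hazy input

loss_good = contrastive_regularization(near_clear, clear, hazy)
loss_bad = contrastive_regularization(near_hazy, clear, hazy)
```

In this toy setup the restoration that lies closer to the clear image receives the smaller value, which is the behavior a regularizer of this kind is meant to reward.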
To overcome this visual-semantic discrepancy, this work proposes an objective function that re-aligns distributed word embeddings with visual information by learning a neural network that maps them into a new representation called visually aligned word embedding (VAWE).
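The idea of mapping word embeddings into a visually aligned space can be illustrated with a small sketch. The text does not specify the objective, so everything below is an assumption made for illustration: a one-hidden-layer network (`map_embeddings`), and a loss (`alignment_loss`) that penalizes the gap between cosine similarities of the mapped embeddings and a given matrix of visual similarities (`visual_sim`).

```python
import numpy as np

def cosine_similarity_matrix(x, eps=1e-8):
    """Pairwise cosine similarities between the rows of x."""
    x = x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)
    return x @ x.T

def map_embeddings(word_vecs, w1, b1, w2, b2):
    """Hypothetical mapping network: one ReLU hidden layer, linear output."""
    hidden = np.maximum(0.0, word_vecs @ w1 + b1)
    return hidden @ w2 + b2

def alignment_loss(mapped, visual_sim):
    """Assumed objective: mean squared gap between semantic and visual similarity."""
    return np.mean((cosine_similarity_matrix(mapped) - visual_sim) ** 2)

rng = np.random.default_rng(0)
n_classes, word_dim, hidden_dim, out_dim = 5, 10, 16, 8

word_vecs = rng.normal(size=(n_classes, word_dim))        # stand-in word embeddings
visual_feats = rng.normal(size=(n_classes, 12))           # stand-in visual features
visual_sim = cosine_similarity_matrix(visual_feats)       # target similarity structure

# Randomly initialized parameters; training them is outside this sketch.
w1 = rng.normal(scale=0.1, size=(word_dim, hidden_dim))
b1 = np.zeros(hidden_dim)
w2 = rng.normal(scale=0.1, size=(hidden_dim, out_dim))
b2 = np.zeros(out_dim)

mapped = map_embeddings(word_vecs, w1, b1, w2, b2)
loss = alignment_loss(mapped, visual_sim)
```

Minimizing a loss of this kind over the network parameters would move the similarity structure of the mapped embeddings toward that of the visual features; the actual objective used in the work may differ.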
In this fashion, we easily achieve nonlinear learning of potential functions on both unary and pairwise terms in CRFs.
Classifying a visual concept merely from its associated online textual source, such as a Wikipedia article, is an attractive research topic in zero-shot learning because it alleviates the burden of manually collecting semantic attributes.
The introduction of low-cost RGB-D sensors has promoted research on skeleton-based human action recognition.