We note that the best existing multi-label ZSL method attends to region features in a shared manner, using a common set of attention maps for all classes.
Ranked #1 on Multi-label zero-shot learning on NUS-WIDE
The scarcity of task-specific annotated data has prompted concerted efforts in recent years on specific settings such as zero-shot learning (ZSL) and domain generalization (DG), which separately address the issues of semantic shift and domain shift, respectively.
Nevertheless, computing reliable attention maps for unseen classes during inference in a multi-label setting is still a challenge.
Ranked #3 on Multi-label zero-shot learning on NUS-WIDE
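A minimal sketch of class-specific attention pooling over region features, where a class semantic embedding scores each region; the dot-product scoring, function names, and the assumption that features and embeddings share a space are illustrative, not the paper's exact formulation:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def class_specific_attention(region_feats, class_emb):
    """Attention-pool region features for one (possibly unseen) class.

    region_feats: list of R region feature vectors (lists of floats)
    class_emb:    semantic embedding of the class, assumed to lie in
                  the same space as the region features
    Returns the attention-weighted feature for this class.
    """
    # Score each region by its similarity to the class semantics.
    scores = [sum(f_i * c_i for f_i, c_i in zip(f, class_emb))
              for f in region_feats]
    weights = softmax(scores)
    dim = len(region_feats[0])
    # Weighted sum of region features.
    return [sum(w * f[d] for w, f in zip(weights, region_feats))
            for d in range(dim)]
```

Because the attention weights are derived from the class embedding rather than learned per class, the same mechanism can, in principle, be applied to unseen classes at inference time.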
The proposed formulation comprises a discriminative and a denoising loss term for enhancing temporal action localization.
Ranked #2 on Weakly Supervised Action Localization on THUMOS’14
We propose to enforce semantic consistency at all stages of (generalized) zero-shot learning: training, feature synthesis and classification.
Ranked #1 on Zero-Shot Learning on CUB-200-2011
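One simple way to enforce such semantic consistency is to penalize features (real or synthesized) that are poorly aligned with their class embedding; the cosine-based loss below is a hedged sketch of this idea, not the paper's actual objective:

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def semantic_consistency_loss(feature, class_emb):
    # Penalize features not aligned with their class semantics;
    # 1 - cosine similarity is one common choice (an assumption here).
    return 1.0 - cosine(feature, class_emb)
```

The same term could be applied during training, during feature synthesis, and as a regularizer on the final classifier, which is what "all stages" suggests.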
Our joint formulation has three terms: a classification term to ensure the separability of learned action features, an adapted multi-label center loss term to enhance action feature discriminability, and a counting loss term to delineate adjacent action sequences, leading to improved localization.
Ranked #1 on Action Classification on THUMOS’14
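The three-term objective can be sketched as follows; the squared-error forms of the center and counting terms and the weights `alpha`, `beta` are illustrative assumptions, not the paper's exact definitions:

```python
import math

def cross_entropy(probs, label):
    # Classification term: separability of learned action features.
    return -math.log(probs[label])

def center_loss(feature, center):
    # Pulls a feature toward its class center (discriminability);
    # squared Euclidean distance is an assumed form.
    return sum((f - c) ** 2 for f, c in zip(feature, center))

def counting_loss(pred_count, true_count):
    # Penalizes miscounting action instances, helping to delineate
    # adjacent action sequences (squared error is an assumed form).
    return (pred_count - true_count) ** 2

def joint_loss(probs, label, feature, center, pred_count, true_count,
               alpha=0.5, beta=0.1):
    # alpha and beta are illustrative trade-off weights.
    return (cross_entropy(probs, label)
            + alpha * center_loss(feature, center)
            + beta * counting_loss(pred_count, true_count))
```

Each term targets a different failure mode: misclassification, diffuse features, and merged adjacent actions, respectively.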
We introduce an out-of-distribution detector that determines whether the video features belong to a seen or unseen action category.
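A common baseline for such a seen/unseen split is to threshold the detector's confidence over the seen classes; the max-softmax rule below is a stand-in sketch for the paper's detector, with the threshold value an assumption:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def is_seen_category(seen_class_scores, threshold=0.5):
    # If the maximum probability over seen action classes is low,
    # route the video to the unseen (zero-shot) branch instead.
    probs = softmax(seen_class_scores)
    return max(probs) >= threshold
```

Videos flagged as unseen would then be classified against semantic embeddings of the unseen action categories rather than the seen-class head.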
Similarly on the Strecha dataset, we see an improvement of 3-5% for the matching task in non-planar scenes.
Scenes from the Oxford ACRD, MVS and Synthetic datasets are used for evaluating the patch matching performance of the learnt descriptors while the Strecha dataset is used to evaluate the 3D reconstruction task.
Different object parts interact with the other parts to varying degrees during an action cycle.