1 code implementation • 3 Jul 2024 • Xiruo Jiang, Yazhou Yao, Xili Dai, Fumin Shen, Xian-Sheng Hua, Heng-Tao Shen
Deep metric learning (DML) aims to learn a discriminative high-dimensional embedding space for downstream tasks like classification, clustering, and retrieval.
no code implementations • CVPR 2024 • Zixian Gao, Xun Jiang, Xing Xu, Fumin Shen, Yujie Li, Heng Tao Shen
However in this paper we show that the potential noise in unimodal data could be well quantified and further employed to enhance more stable unimodal embeddings via contrastive learning.
1 code implementation • 15 Dec 2023 • Gensheng Pei, Fumin Shen, Yazhou Yao, Tao Chen, Xian-Sheng Hua, Heng-Tao Shen
However, existing optical flow-based methods have a significant dependency on optical flow, which results in poor performance when the optical flow estimation fails for a particular scene.
1 code implementation • 26 Nov 2023 • Yixuan Zhou, Yi Qu, Xing Xu, Fumin Shen, Jingkuan Song, HengTao Shen
In the proposed BN-WVAD, we leverage the Divergence of Feature from Mean vector (DFM) of BatchNorm as a reliable abnormality criterion to discern potential abnormal snippets in abnormal videos.
Anomaly Detection In Surveillance Videos Video Anomaly Detection
1 code implementation • 29 Aug 2023 • Yixuan Zhou, Xing Xu, Jingkuan Song, Fumin Shen, Heng Tao Shen
Unsupervised anomaly detection (UAD) attracts a lot of research interest and drives widespread applications, where only anomaly-free samples are available for training.
Ranked #7 on Anomaly Detection on MVTec AD
1 code implementation • 8 Apr 2023 • Gensheng Pei, Yazhou Yao, Fumin Shen, Dan Huang, Xingguo Huang, Heng-Tao Shen
Zero-shot video object segmentation (ZS-VOS) aims to segment foreground objects in a video sequence without prior knowledge of these objects.
1 code implementation • 4 Apr 2023 • Junzhu Mao, Yazhou Yao, Zeren Sun, Xingguo Huang, Fumin Shen, Heng-Tao Shen
Then we combine the similarity and first-order gradients of key tokens along the query dimension for token importance estimation and remove redundant key and value tokens to further reduce the inference complexity.
1 code implementation • 18 Jul 2022 • Gensheng Pei, Fumin Shen, Yazhou Yao, Guo-Sen Xie, Zhenmin Tang, Jinhui Tang
Optical flow is an easily conceived and precious cue for advancing unsupervised video object segmentation (UVOS).
1 code implementation • 20 Jun 2022 • Tao Chen, Yazhou Yao, Lei Zhang, Qiong Wang, Guo-Sen Xie, Fumin Shen
Specifically, we propose a saliency guided class-agnostic distance module to pull the intra-category features closer by aligning features to their class prototypes.
no code implementations • CVPR 2022 • Xun Jiang, Xing Xu, Jingran Zhang, Fumin Shen, Zuo Cao, Heng Tao Shen
Video events grounding aims at retrieving the most relevant moments from an untrimmed video in terms of a given natural language query.
no code implementations • CVPR 2022 • Zeren Sun, Fumin Shen, Dan Huang, Qiong Wang, Xiangbo Shu, Yazhou Yao, Jinhui Tang
Label noise has been a practical challenge in deep learning due to the strong capability of deep neural networks in fitting all training data.
2 code implementations • IEEE Transactions on Image Processing 2021 • Cairong Zhao, Yuanpeng Tu, Zhihui Lai, Fumin Shen, Heng Tao Shen, Duoqian Miao
Moreover, a novel iterative asymmetric mutual training strategy (IAMT) is proposed to alleviate drawbacks of common mutual learning, which can continuously refine the discriminative regions for SSB and extract regularized dark knowledge for two models as well.
1 code implementation • ICCV 2021 • Zeren Sun, Yazhou Yao, Xiu-Shen Wei, Yongshun Zhang, Fumin Shen, Jianxin Wu, Jian Zhang, Heng-Tao Shen
Learning from the web can ease the extreme dependence of deep learning on large-scale manually labeled datasets.
1 code implementation • CVPR 2021 • Xunguang Wang, Zheng Zhang, Baoyuan Wu, Fumin Shen, Guangming Lu
However, deep hashing networks are vulnerable to adversarial examples, which is a practical secure problem but seldom studied in hashing-based retrieval field.
1 code implementation • CVPR 2021 • Yazhou Yao, Tao Chen, GuoSen Xie, Chuanyi Zhang, Fumin Shen, Qi Wu, Zhenmin Tang, Jian Zhang
To further mine the non-salient region objects, we propose to exert the segmentation network's self-correction ability.
no code implementations • CVPR 2021 • Yazhou Yao, Zeren Sun, Chuanyi Zhang, Fumin Shen, Qi Wu, Jian Zhang, Zhenmin Tang
Due to the memorization effect in Deep Neural Networks (DNNs), training with noisy labels usually results in inferior model performance.
1 code implementation • 22 Feb 2021 • Tao Chen, GuoSen Xie, Yazhou Yao, Qiong Wang, Fumin Shen, Zhenmin Tang, Jian Zhang
Then we utilize the fused prototype to guide the final segmentation of the query image.
1 code implementation • 23 Jan 2021 • Huafeng Liu, Chuanyi Zhang, Yazhou Yao, Xiushen Wei, Fumin Shen, Jian Zhang, Zhenmin Tang
Labeling objects at a subordinate level typically requires expert knowledge, which is not always available when using random annotators.
no code implementations • 9 Nov 2020 • Jingyi Zhang, Yong Zhang, Baoyuan Wu, Yanbo Fan, Fumin Shen, Heng Tao Shen
We propose to incorporate the prior about the co-occurrence of relation pairs into the graph to further help alleviate the class imbalance issue.
2 code implementations • CVPR 2020 • Yuming Shen, Jie Qin, Jiaxin Chen, Mengyang Yu, Li Liu, Fan Zhu, Fumin Shen, Ling Shao
One bottleneck (i. e., binary codes) conveys the high-level intrinsic data structure captured by the code-driven graph to the other (i. e., continuous variables for low-level detail information), which in turn propagates the updated network feedback for the encoder to learn more discriminative binary codes.
no code implementations • 16 Sep 2019 • Huan Xiong, Mengyang Yu, Li Liu, Fan Zhu, Fumin Shen, Ling Shao
Binary optimization, a representative subclass of discrete optimization, plays an important role in mathematical optimization and has various applications in computer vision and machine learning.
no code implementations • 27 Aug 2019 • Jingran Zhang, Fumin Shen, Xing Xu, Heng Tao Shen
In this paper, we propose an efficient temporal reasoning graph (TRG) to simultaneously capture the appearance features and temporal relation between video sequences at multiple time scales.
Ranked #54 on Action Recognition on Something-Something V1
no code implementations • 27 Aug 2019 • Jingran Zhang, Fumin Shen, Xing Xu, Heng Tao Shen
It extracts this complementary information of different modality from a connection block, which aims at exploring correlations of different stream features.
Ranked #15 on Action Recognition on HMDB-51 (using extra training data)
no code implementations • 27 Aug 2019 • Zhijun Mai, Guosheng Hu, Dexiong Chen, Fumin Shen, Heng Tao Shen
Since deep networks are capable of memorizing the entire dataset, the corrupted samples generated by vanilla MixUp with a badly chosen interpolation policy will degrade the performance of networks.
no code implementations • ICCV 2019 • Shengju Qian, Kwan-Yee Lin, Wayne Wu, Yangxiaokang Liu, Quan Wang, Fumin Shen, Chen Qian, Ran He
Recent studies have shown remarkable success in face manipulation task with the advance of GANs and VAEs paradigms, but the outputs are sometimes limited to low-resolution and lack of diversity.
no code implementations • 7 Jun 2019 • Yazhou Yao, Jian Zhang, Xian-Sheng Hua, Fumin Shen, Zhenmin Tang
Recent successes in visual recognition can be primarily attributed to feature representation, learning algorithms, and the ever-increasing size of labeled training data.
no code implementations • 27 May 2019 • Yazhou Yao, Zeren Sun, Fumin Shen, Li Liu, Li-Min Wang, Fan Zhu, Lizhong Ding, Gangshan Wu, Ling Shao
To address this issue, we present an adaptive multi-model framework that resolves polysemy by visual disambiguation.
1 code implementation • CVPR 2019 • Yan Xu, Baoyuan Wu, Fumin Shen, Yanbo Fan, Yong Zhang, Heng Tao Shen, Wei Liu
Due to the sequential dependencies among words in a caption, we formulate the generation of adversarial noises for targeted partial captions as a structured output learning problem with latent variables.
no code implementations • 24 Apr 2019 • Yanli Ji, Feixiang Xu, Yang Yang, Fumin Shen, Heng Tao Shen, Wei-Shi Zheng
Besides, we propose a View-guided Skeleton CNN (VS-CNN) to tackle the problem of arbitrary-view action recognition.
1 code implementation • 25 Sep 2018 • Yadan Luo, Zi Huang, Yang Li, Fumin Shen, Yang Yang, Peng Cui
Hashing techniques are in great demand for a wide range of real-world applications such as image retrieval and network compression.
no code implementations • ECCV 2018 • Zheng Zhang, Li Liu, Jie Qin, Fan Zhu, Fumin Shen, Yong Xu, Ling Shao, Heng Tao Shen
How to economically cluster large-scale multi-view images is a long-standing problem in computer vision.
1 code implementation • ECCV 2018 • Jingyi Zhang, Fumin Shen, Li Liu, Fan Zhu, Mengyang Yu, Ling Shao, Heng Tao Shen, Luc van Gool
The generative model learns a mapping that the distributions of sketches can be indistinguishable from the distribution of natural images using an adversarial loss, and simultaneously learns an inverse mapping based on the cycle consistency loss in order to enhance the indistinguishability.
1 code implementation • ECCV 2018 • Diwen Wan, Fumin Shen, Li Liu, Fan Zhu, Jie Qin, Ling Shao, Heng Tao Shen
Despite the remarkable success of Convolutional Neural Networks (CNNs) on generalized visual tasks, high computational and memory costs restrict their comprehensive applications on consumer electronics (e. g., portable or smart wearable devices).
no code implementations • ECCV 2018 • Guosheng Hu, Li Liu, Yang Yuan, Zehao Yu, Yang Hua, Zhihong Zhang, Fumin Shen, Ling Shao, Timothy Hospedales, Neil Robertson, Yongxin Yang
To advance subtle expression recognition, we contribute a Large-scale Subtle Emotions and Mental States in the Wild database (LSEMSW).
1 code implementation • CVPR 2018 • Yuming Shen, Li Liu, Fumin Shen, Ling Shao
As an important part of ZSIH, we formulate a generative hashing scheme in reconstructing semantic knowledge representations for zero-shot retrieval.
no code implementations • ECCV 2018 • Xinyu Gong, HaoZhi Huang, Lin Ma, Fumin Shen, Wei Liu, Tong Zhang
While each view of the stereoscopic pair is processed in an individual path, a novel feature aggregation strategy is proposed to effectively share information between the two paths.
no code implementations • 15 Sep 2017 • Yanyun Qu, Li Lin, Fumin Shen, Chang Lu, Yang Wu, Yuan Xie, DaCheng Tao
We propose a novel image classification method based on learning hierarchical inter-class structures.
no code implementations • 22 Aug 2017 • Yazhou Yao, Jian Zhang, Fumin Shen, Li Liu, Fan Zhu, Dongxiang Zhang, Heng-Tao Shen
To eliminate manual annotation, in this work, we propose a novel image dataset construction framework by employing multiple textual queries.
no code implementations • CVPR 2017 • Xing Xu, Fumin Shen, Yang Yang, Dongxiang Zhang, Heng Tao Shen, Jingkuan Song
By additionally introducing manifold regularizations on visual data and semantic embeddings, the learned projection can effectively captures the geometrical manifold structure residing in both visual and semantic spaces.
no code implementations • CVPR 2017 • Li Liu, Ling Shao, Fumin Shen, Mengyang Yu
Learning to hash has been recognized to accomplish highly efficient storage and retrieval for large-scale visual data.
no code implementations • CVPR 2017 • Jie Qin, Li Liu, Ling Shao, Bingbing Ni, Chen Chen, Fumin Shen, Yunhong Wang
Extensive experiments on four realistic action datasets in terms of three tasks (i. e., partial action retrieval, recognition and prediction) clearly show the superiority of PRBC over the state-of-the-art methods, along with significantly reduced memory load and computational costs during the online test.
no code implementations • CVPR 2017 • Jie Qin, Li Liu, Ling Shao, Fumin Shen, Bingbing Ni, Jiaxin Chen, Yunhong Wang
Our ZSECOC equips the conventional ECOC with the additional capability of ZSAR, by addressing the domain shift problem.
Ranked #4 on Zero-Shot Action Recognition on Olympics
no code implementations • CVPR 2017 • Yang Long, Li Liu, Ling Shao, Fumin Shen, Guiguang Ding, Jungong Han
Using the proposed Unseen Visual Data Synthesis (UVDS) algorithm, semantic attributes are effectively utilised as an intermediate clue to synthesise unseen visual features at the training stage.
1 code implementation • CVPR 2017 • Li Liu, Fumin Shen, Yuming Shen, Xianglong Liu, Ling Shao
Free-hand sketch-based image retrieval (SBIR) is a specific cross-view retrieval task, in which queries are abstract and ambiguous sketches while the retrieval database is formed with natural images.
no code implementations • 16 Mar 2017 • Yazhou Yao, Jian Zhang, Fumin Shen, Xian-Sheng Hua, Wankou Yang, Zhenmin Tang
To tackle these problems, in this work, we exploit general corpus information to automatically select and subsequently classify web images into semantic rich (sub-)categories.
no code implementations • 3 Jan 2017 • Xinyu Wang, Hanxi Li, Yi Li, Fumin Shen, Fatih Porikli
Visual tracking is a fundamental problem in computer vision.
no code implementations • 15 Dec 2016 • Hao Liu, Yang Yang, Fumin Shen, Lixin Duan, Heng Tao Shen
Along with the prosperity of recurrent neural network in modelling sequential data and the power of attention mechanism in automatically identify salient information, image captioning, a. k. a., image description, has been remarkably advanced in recent years.
no code implementations • 6 Dec 2016 • Ruicong Xu, Yang Yang, Yadan Luo, Fumin Shen, Zi Huang, Heng Tao Shen
The first approach, termed Inner-product Binary Coding (IBC), preserves the inner relationships of images and videos in a common Hamming space.
no code implementations • 22 Nov 2016 • Yazhou Yao, Jian Zhang, Fumin Shen, Xian-Sheng Hua, Jingsong Xu, Zhenmin Tang
To reduce the cost of manual labelling, there has been increased research interest in automatically constructing image datasets by exploiting web images.
no code implementations • 16 Jun 2016 • Yang Yang, Wei-Lun Chen, Yadan Luo, Fumin Shen, Jie Shao, Heng Tao Shen
Supervised knowledge e. g. semantic labels or pair-wise relationship) associated to data is capable of significantly improving the quality of hash codes and hash functions.
no code implementations • 15 Jun 2016 • Yi Bin, Yang Yang, Zi Huang, Fumin Shen, Xing Xu, Heng Tao Shen
Video captioning has been attracting broad research attention in multimedia community.
no code implementations • 14 Mar 2016 • Fumin Shen, Yadong Mu, Wei Liu, Yang Yang, Heng Tao Shen
The optimization alternatively proceeds over the binary classifiers and image hash codes.
no code implementations • ICCV 2015 • Fumin Shen, Wei Liu, Shaoting Zhang, Yang Yang, Heng Tao Shen
Inspired by the latest advance in asymmetric hashing schemes, we propose an asymmetric binary code learning framework based on inner product fitting.
1 code implementation • CVPR 2015 • Fumin Shen, Chunhua Shen, Wei Liu, Heng Tao Shen
This paper has been withdrawn by the authour.
no code implementations • 2 Dec 2014 • Fumin Shen, Chunhua Shen, Qinfeng Shi, Anton Van Den Hengel, Zhenmin Tang, Heng Tao Shen
In addition, a supervised inductive manifold hashing framework is developed by incorporating the label information, which is shown to greatly advance the semantic retrieval performance.
no code implementations • 26 Jun 2014 • Fumin Shen, Chunhua Shen, Heng Tao Shen
We propose a very simple, efficient yet surprisingly effective feature extraction method for face recognition (about 20 lines of Matlab code), which is mainly inspired by spatial pyramid pooling in generic image classification.
no code implementations • 26 Jun 2014 • Fumin Shen, Chunhua Shen, Heng Tao Shen
Spatial pyramid pooling of features encoded by an over-complete dictionary has been the key component of many state-of-the-art image classification systems.
no code implementations • 22 Sep 2013 • Fumin Shen, Chunhua Shen
The main finding of this work is that the standard image classification pipeline, which consists of dictionary learning, feature encoding, spatial pyramid pooling and linear classification, outperforms all state-of-the-art face recognition methods on the tested benchmark datasets (we have tested on AR, Extended Yale B, the challenging FERET, and LFW-a datasets).
no code implementations • 4 Apr 2013 • Fumin Shen, Chunhua Shen, Rhys Hill, Anton Van Den Hengel, Zhenmin Tang
Minimization of the $L_\infty$ norm, which can be viewed as approximately solving the non-convex least median estimation problem, is a powerful method for outlier removal and hence robust regression.
no code implementations • CVPR 2013 • Fumin Shen, Chunhua Shen, Qinfeng Shi, Anton Van Den Hengel, Zhenmin Tang
We particularly show that hashing on the basis of t-SNE .