1 code implementation • 12 Mar 2024 • De Cheng, Yanling Ji, Dong Gong, Yan Li, Nannan Wang, Junwei Han, Dingwen Zhang
It considers the characteristics of the image restoration task with multiple degenerations in continual learning, and the knowledge for different degenerations can be shared and accumulated in the unified network structure.
no code implementations • 1 Feb 2024 • Lingfeng He, De Cheng, Nannan Wang, Xinbo Gao
In response, we introduce a Modality-Unified Label Transfer (MULT) module that simultaneously accounts for both homogeneous and heterogeneous fine-grained instance-level structures, yielding high-quality cross-modality label associations.
1 code implementation • 11 Dec 2023 • Yubin Wang, Xinyang Jiang, De Cheng, Dongsheng Li, Cairong Zhao
To address this limitation and prioritize harnessing structured knowledge, this paper advocates for leveraging LLMs to build a graph for each description to model the entities and attributes describing the category, as well as their correlations.
Ranked #1 on Prompt Engineering on ImageNet V2
no code implementations • 5 Dec 2023 • Guozhang Li, Xinpeng Ding, De Cheng, Jie Li, Nannan Wang, Xinbo Gao
To further clarify the noise of expanded boundaries, we combine mutual learning with a tailored proposal-level contrastive objective, learning to balance incomplete yet clean (initial) boundaries against comprehensive yet noisy (expanded) ones to obtain more precise boundaries.
1 code implementation • 24 Aug 2023 • Shizhou Zhang, Qingchun Yang, De Cheng, Yinghui Xing, Guoqiang Liang, Peng Wang, Yanning Zhang
In this work, we construct a large-scale dataset for Ground-to-Aerial Person Search, named G2APS, which contains 31,770 images with 260,559 annotated bounding boxes for 2,644 identities appearing in both UAV and ground surveillance cameras.
no code implementations • 22 May 2023 • De Cheng, Xiaojian Huang, Nannan Wang, Lingfeng He, Zhihui Li, Xinbo Gao
Unsupervised learning visible-infrared person re-identification (USL-VI-ReID) aims at learning modality-invariant features from an unlabeled cross-modality dataset, which is crucial for practical applications in video surveillance systems.
no code implementations • 22 May 2023 • De Cheng, Lingfeng He, Nannan Wang, Shizhou Zhang, Zhen Wang, Xinbo Gao
To this end, we propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters.
1 code implementation • CVPR 2023 • Guozhang Li, De Cheng, Xinpeng Ding, Nannan Wang, Xiaoyu Wang, Xinbo Gao
For the discriminative objective, we propose a Text-Segment Mining (TSM) mechanism, which constructs a text description based on the action class label, and regards the text as the query to mine all class-related segments.
1 code implementation • 25 Apr 2023 • Guozhang Li, De Cheng, Xinpeng Ding, Nannan Wang, Jie Li, Xinbo Gao
The proposed Bi-SCC first adopts a temporal context augmentation to generate an augmented video that breaks the correlation between positive actions and their co-scene actions across videos; then, a semantic consistency constraint (SCC) is used to enforce the predictions of the original video and augmented video to be consistent, hence suppressing the co-scene actions.
Weakly-supervised Temporal Action Localization
no code implementations • 30 Nov 2022 • De Cheng, Haichun Tai, Nannan Wang, Zhen Wang, Xinbo Gao
In this paper, we propose a Neighbour Consistency guided Pseudo Label Refinement (NCPLR) framework, which can be regarded as a transductive form of label propagation under the assumption that the prediction of each example should be similar to its nearest neighbours'.
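The neighbour-consistency assumption stated above can be illustrated with a small sketch. This is not the NCPLR implementation: the cosine similarity, the neighbour count `k`, the mixing weight `alpha`, and the function names are all assumptions made for illustration. Each example's class prediction is blended with the mean prediction of its nearest neighbours in feature space.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def refine_pseudo_labels(features, probs, k=2, alpha=0.5):
    """Blend each prediction with the mean prediction of its k nearest neighbours.

    features: list of feature vectors, one per example.
    probs:    list of class-probability vectors, one per example.
    k, alpha: hypothetical hyperparameters (neighbour count, weight on own prediction).
    """
    n = len(features)
    refined = []
    for i in range(n):
        # Rank all other examples by similarity to example i and keep the top k.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine(features[i], features[j]),
            reverse=True,
        )[:k]
        if not neighbours:
            # A single example has no neighbours; keep its prediction unchanged.
            refined.append(list(probs[i]))
            continue
        num_classes = len(probs[i])
        mean = [
            sum(probs[j][c] for j in neighbours) / len(neighbours)
            for c in range(num_classes)
        ]
        refined.append(
            [alpha * probs[i][c] + (1 - alpha) * mean[c] for c in range(num_classes)]
        )
    return refined


if __name__ == "__main__":
    features = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0]]
    probs = [[0.4, 0.6], [0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
    for row in refine_pseudo_labels(features, probs):
        print([round(p, 3) for p in row])
```

In the toy data, the first example initially leans towards class 1, but its two nearest neighbours strongly predict class 0, so the refined prediction moves towards class 0.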
1 code implementation • NeurIPS 2022 • De Cheng, Yixiong Ning, Nannan Wang, Xinbo Gao, Heng Yang, Yuxuan Du, Bo Han, Tongliang Liu
We show that the cycle-consistency regularization helps to minimize the volume of the transition matrix T indirectly without exploiting the estimated noisy class posterior, which could further encourage the estimated transition matrix T to converge to its optimal solution.
1 code implementation • 17 Aug 2022 • Yinghui Xing, Qirui Wu, De Cheng, Shizhou Zhang, Guoqiang Liang, Peng Wang, Yanning Zhang
To make the final image feature concentrate more on the target visual concept, a Class-Aware Visual Prompt Tuning (CAVPT) scheme is further proposed in our DPT, where the class-aware visual prompt is generated dynamically by performing cross attention between text prompt features and image patch token embeddings to encode both the downstream task-related information and visual instance information.
no code implementations • CVPR 2022 • De Cheng, Tongliang Liu, Yixiong Ning, Nannan Wang, Bo Han, Gang Niu, Xinbo Gao, Masashi Sugiyama
In label-noise learning, estimating the transition matrix has attracted more and more attention as the matrix plays an important role in building statistically consistent classifiers.
no code implementations • 29 Mar 2022 • De Cheng, Yan Li, Dingwen Zhang, Nannan Wang, Xinbo Gao, Jiande Sun
To properly address this problem, we propose a novel density-variational learning framework to improve the robustness of the image dehazing model assisted by a variety of negative hazy images, to better deal with various complex hazy scenarios.
no code implementations • 29 Mar 2022 • De Cheng, Gerong Wang, Bo Wang, Qiang Zhang, Jungong Han, Dingwen Zhang
This design makes the presented transformer model a hybrid of 1) top-down and bottom-up attention pathways and 2) dynamic and static routing pathways.
1 code implementation • CVPR 2022 • Peiliang Huang, Junwei Han, De Cheng, Dingwen Zhang
Zero-shot object detection aims at incorporating class semantic vectors to realize the detection of (both seen and) unseen classes given an unconstrained test image.
Ranked #2 on Zero-Shot Object Detection on PASCAL VOC'07
no code implementations • 29 Sep 2021 • De Cheng, Jingyu Zhou, Nannan Wang, Xinbo Gao
However, since person Re-Id is an open-set problem, clustering-based methods often leave out many outlier instances or group instances into the wrong clusters, so they cannot make full use of the training samples as a whole.
1 code implementation • 27 Sep 2021 • Shizhou Zhang, De Cheng, Wenlong Luo, Yinghui Xing, Duo Long, Hao Li, Kai Niu, Guoqiang Liang, Yanning Zhang
Finding target persons in full scene images with a query of text description has important practical applications in intelligent video surveillance. However, different from the real-world scenarios where the bounding boxes are not available, existing text-based person retrieval methods mainly focus on the cross modal matching between the query text descriptions and the gallery of cropped pedestrian images.
1 code implementation • 22 Sep 2021 • Yan Li, De Cheng, Jiande Sun, Dingwen Zhang, Nannan Wang, Xinbo Gao
In this paper, we propose a single image dehazing method with an independent Detail Recovery Network (DRN), which considers capturing the details from the input image over a separate network and then integrates them into a coarse dehazed image.
no code implementations • ICCV 2021 • Xinpeng Ding, Nannan Wang, Shiwei Zhang, De Cheng, Xiaomeng Li, Ziyuan Huang, Mingqian Tang, Xinbo Gao
The contrastive objective aims to learn effective representations by contrastive learning, while the caption objective can train a powerful video encoder supervised by texts.
no code implementations • ICCV 2017 • Hehe Fan, Xiaojun Chang, De Cheng, Yi Yang, Dong Xu, Alexander G. Hauptmann
relevant) to the given event class, we formulate this task as a multi-instance learning (MIL) problem by taking each video as a bag and the video shots in each video as instances.
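The bag-and-instance framing can be sketched as follows. This is a generic multi-instance learning illustration, not the paper's formulation: max pooling over shot scores, the hinge loss, the `margin` value, and the function names are assumptions. A video (bag) is scored by its highest-scoring shot (instance), so a positive bag needs at least one relevant shot, while a negative bag must have none.

```python
def bag_score(shot_scores):
    """Score a video (bag) by its highest-scoring shot (instance).

    Raises ValueError if the video has no shots.
    """
    return max(shot_scores)


def bag_hinge_loss(shot_scores, bag_label, margin=1.0):
    """Hinge loss on the bag score; bag_label is True for a positive video.

    The margin is a hypothetical hyperparameter chosen for illustration.
    """
    y = 1.0 if bag_label else -1.0
    return max(0.0, margin - y * bag_score(shot_scores))


if __name__ == "__main__":
    positive_video = [-0.5, 2.0, 0.1]   # one clearly relevant shot
    negative_video = [0.5, -1.0]        # one shot wrongly scored above zero
    print(bag_hinge_loss(positive_video, True))
    print(bag_hinge_loss(negative_video, False))
```

Only the top-scoring shot contributes to the loss here, which is the usual consequence of max pooling in MIL; softer pooling choices would spread the signal over more shots.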
no code implementations • 25 Jul 2017 • De Cheng, Yihong Gong, Zhihui Li, Weiwei Shi, Alexander G. Hauptmann, Nanning Zheng
The proposed method can take full advantage of the structured distance relationships among these training samples, with the constructed complete graph.
no code implementations • CVPR 2016 • De Cheng, Yihong Gong, Sanping Zhou, Jinjun Wang, Nanning Zheng
Person re-identification across cameras remains a very challenging problem, especially when there are no overlapping fields of view between cameras.