no code implementations • 15 Mar 2024 • Yukun Li, Guansong Pang, Wei Suo, Chenchen Jing, Yuling Xi, Lingqiao Liu, Hao Chen, Guoqiang Liang, Peng Wang
Large pre-trained vision-language models (VLMs) like CLIP have demonstrated superior zero-shot recognition ability, and a number of recent studies leverage this ability to mitigate catastrophic forgetting in continual learning (CL), but they focus on closed-set CL in a single-domain dataset.
no code implementations • 4 Mar 2024 • Lingyan Ran, YaLi Li, Guoqiang Liang, Yanning Zhang
Semantic segmentation is an important and popular research area in computer vision that focuses on classifying pixels in an image based on their semantics.
no code implementations • 7 Feb 2024 • Guoqiang Liang, Jiahao Hu, Qingyue Wang, Shizhou Zhang
Human de-occlusion, which aims to infer the appearance of invisible human parts from an occluded image, has great value in many human-related tasks, such as person re-identification and intention inference.
1 code implementation • 24 Aug 2023 • Shizhou Zhang, Qingchun Yang, De Cheng, Yinghui Xing, Guoqiang Liang, Peng Wang, Yanning Zhang
In this work, we construct a large-scale dataset for Ground-to-Aerial Person Search, named G2APS, which contains 31,770 images with 260,559 annotated bounding boxes for 2,644 identities appearing in both UAV and ground surveillance cameras.
no code implementations • 27 Jun 2023 • Yunfan Lu, Guoqiang Liang, Lin Wang
Although events possess high temporal resolution, which is beneficial for video frame interpolation (VFI), a hurdle in tackling this task is the lack of paired global shutter (GS) frames.
1 code implementation • CVPR 2023 • Qingsheng Wang, Lingqiao Liu, Chenchen Jing, Hao Chen, Guoqiang Liang, Peng Wang, Chunhua Shen
Compositional Zero-Shot Learning (CZSL) aims to train models to recognize novel compositional concepts based on learned concepts such as attribute-object combinations.
Ranked #1 on Compositional Zero-Shot Learning on MIT-States
1 code implementation • 24 May 2023 • Yunfan Lu, Guoqiang Liang, Lin Wang
Images captured by rolling shutter (RS) cameras under fast camera motion often contain obvious image distortions and blur, which can be modeled as a row-wise combination of a sequence of global shutter (GS) frames within the exposure time. Naturally, recovering high-frame-rate sharp GS frames from an RS blur image needs to simultaneously consider RS correction, deblurring, and frame interpolation. Tackling this task is nontrivial, and to our knowledge, no feasible solutions exist so far.
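The row-wise formation model mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `synthesize_rs_blur`, the evenly spaced row readout times, and the uniform averaging over the exposure window are all assumptions made for illustration.

```python
def synthesize_rs_blur(gs_frames, exposure_len):
    """Build one RS blurred image from a list of GS frames (2D lists).

    Assumptions (hypothetical, for illustration only):
    - row readout start times are spread evenly over the frame sequence;
    - each row is the plain average of that row over `exposure_len` frames.
    """
    num_frames = len(gs_frames)
    height = len(gs_frames[0])
    width = len(gs_frames[0][0])
    max_start = num_frames - exposure_len
    if max_start < 0:
        raise ValueError("exposure_len exceeds the number of GS frames")

    rs_image = []
    for r in range(height):
        # Later rows start their exposure later (rolling shutter delay).
        start = (r * max_start) // (height - 1) if height > 1 else 0
        window = gs_frames[start:start + exposure_len]
        # Averaging over the exposure window produces the motion blur.
        row = [sum(frame[r][c] for frame in window) / exposure_len
               for c in range(width)]
        rs_image.append(row)
    return rs_image


# Frame t is a constant image with value t, so each output row reveals
# which time window it was integrated over.
frames = [[[float(t)] * 2 for _ in range(3)] for t in range(5)]
rs = synthesize_rs_blur(frames, exposure_len=3)
```

Inverting this process means undoing the per-row time shift (RS correction), the averaging (deblurring), and the temporal subsampling (frame interpolation) at once.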
no code implementations • 16 Apr 2023 • Ke Song, Quan Xia, Guoqiang Liang, Zhaojie Chen, Yanning Zhang
Instead, by mixing new and old features, old knowledge can be retained without increasing the computational complexity.
1 code implementation • 16 Feb 2023 • Guoqiang Liang, Zhaojie Chen, Zhaoqiang Chen, Shiyu Ji, Yanning Zhang
Among all settings, online class incremental learning (OCIL), where incoming samples from the data stream can be used only once, is more challenging and is encountered more frequently in the real world.
1 code implementation • 1 Feb 2023 • Yinghui Xing, Song Wang, Shizhou Zhang, Guoqiang Liang, Xiuwei Zhang, Yanning Zhang
Most of the available multispectral pedestrian detectors are based on non-end-to-end detectors, while in this paper, we propose MultiSpectral pedestrian DEtection TRansformer (MS-DETR), an end-to-end multispectral pedestrian detector, which extends DETR into the field of multi-modal detection.
1 code implementation • 17 Aug 2022 • Yinghui Xing, Qirui Wu, De Cheng, Shizhou Zhang, Guoqiang Liang, Peng Wang, Yanning Zhang
To make the final image feature concentrate more on the target visual concept, a Class-Aware Visual Prompt Tuning (CAVPT) scheme is further proposed in our DPT, where the class-aware visual prompt is generated dynamically by performing cross attention between text prompt features and image patch token embeddings to encode both the downstream task-related information and visual instance information.
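The cross-attention step named in this abstract can be sketched generically as single-head scaled dot-product attention. This is not the CAVPT implementation: using text prompt features as queries and image patch embeddings as keys and values is an assumption, learned projection matrices are omitted, and the names `softmax` and `cross_attention` are hypothetical helpers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention without learned projections.

    Assumption: `queries` would be text prompt features, while `keys` and
    `values` would be image patch token embeddings.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every patch, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of patch values gives one output token per query.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs


# Two toy "text" queries attending over three toy "patch" embeddings.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
patches = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
prompts = cross_attention(text_feats, patches, patches)
```

Each output token is a mixture of patch embeddings weighted by their similarity to one query, which is how such a prompt can carry both class-related and instance-specific information.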
1 code implementation • 27 Sep 2021 • Shizhou Zhang, De Cheng, Wenlong Luo, Yinghui Xing, Duo Long, Hao Li, Kai Niu, Guoqiang Liang, Yanning Zhang
Finding target persons in full scene images with a text description as the query has important practical applications in intelligent video surveillance. However, unlike real-world scenarios, where bounding boxes are not available, existing text-based person retrieval methods mainly focus on cross-modal matching between the query text descriptions and a gallery of cropped pedestrian images.
no code implementations • 24 May 2021 • Guoqiang Liang, Yanbing Lv, Shucheng Li, Shizhou Zhang, Yanning Zhang
Specifically, the generator employs a fully convolutional sequence network to extract a global representation of a video, and an attention-based network to output normalized importance scores.
Generative Adversarial Network • Unsupervised Video Summarization
no code implementations • 29 Mar 2021 • Lei Tian, Guoqiang Liang, Peng Wang, Chunhua Shen
Because human keypoints can be invisible in images due to illumination, occlusion, and overlap, most current human pose estimation methods are likely to produce unreasonable pose predictions.
no code implementations • 15 Sep 2020 • Guoqiang Liang, Yi Jiang, Haiyan Hou
In the last two decades, scholars have designed various types of bibliographic indicators to identify breakthrough-class academic achievements.
no code implementations • 25 Oct 2019 • Shizhou Zhang, Yifei Yang, Peng Wang, Guoqiang Liang, Xiuwei Zhang, Yanning Zhang
The problem of cross-modality person re-identification has been receiving increasing attention recently, due to its practical significance.
Cross-Modality Person Re-identification • Person Re-Identification