no code implementations • 8 Mar 2025 • Yubin Wang, Xinyang Jiang, De Cheng, Xiangqian Zhao, Zilong Wang, Dongsheng Li, Cairong Zhao
Visual prompt tuning offers significant advantages for adapting pre-trained visual foundation models to specific tasks.
no code implementations • 17 Feb 2025 • Chendong Wang, Anlan Zhang, Yifan Yang, Lili Qiu, Yuqing Yang, Xinyang Jiang, Feng Qian, Suman Banerjee
A natural approach to mitigate the bandwidth issue is to reduce the volumetric video's data rate by downsampling the content prior to transmission.
no code implementations • 12 Jan 2025 • Shan Jiang, Zhenhua Han, Haisheng Tan, Xinyang Jiang, Yifan Yang, Xiaoxi Zhang, Hongqiu Ni, Yuqing Yang, Xiang-Yang Li
To address this, we introduce River, a cloud gaming delivery framework designed based on the observation that video segment features in cloud gaming are typically repetitive and redundant.
2 code implementations • 27 Aug 2024 • Yubin Wang, Xinyang Jiang, De Cheng, Wenli Sun, Dongsheng Li, Cairong Zhao
Specifically, we introduce a relationship-guided attention module to capture pair-wise associations among entities and attributes for low-level prompt learning.
Ranked #1 on Prompt Engineering on ImageNet V2
no code implementations • 13 Aug 2024 • Yubin Wang, Xinyang Jiang, De Cheng, Dongsheng Li, Cairong Zhao
Video temporal grounding is an emerging topic aiming to identify specific clips within videos.
no code implementations • 30 May 2024 • Wenli Sun, Xinyang Jiang, Dongsheng Li, Cairong Zhao
Consequently, DiffPhysBA can generate realistic attributes as semantic-level triggers in the digital domain and provides higher physical ASR compared to the direct paste method by 25.6% on the real-world test set.
no code implementations • 10 May 2024 • Hanchi Sun, Xiaohong Liu, Xinyang Jiang, Yifei Shen, Dongsheng Li, Xiongkuo Min, Guangtao Zhai
This paper focuses on the task of quality enhancement for compressed videos.
no code implementations • 1 Apr 2024 • Zilong Wang, Xufang Luo, Xinyang Jiang, Dongsheng Li, Lili Qiu
This study proposes a novel evaluation framework using large language models (LLMs) to compare radiology reports for assessment.
1 code implementation • 19 Mar 2024 • Yifei Shen, Xinyang Jiang, Yezhen Wang, Yifan Yang, Dongqi Han, Dongsheng Li
Adding additional control to pretrained diffusion models has become an increasingly popular research area, with extensive applications in computer vision, reinforcement learning, and AI for science.
no code implementations • CVPR 2024 • De Cheng, Zhipeng Xu, Xinyang Jiang, Nannan Wang, Dongsheng Li, Xinbo Gao
Although there is a growing focus on VFM-based domain prompt tuning for DG, effectively learning prompts that disentangle invariant features across all domains remains a major challenge.
no code implementations • 21 Dec 2023 • Brian Nlong Zhao, Yuhang Xiao, Jiashu Xu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet, Yunhao Ge
We introduce a solution that allows a pretrained T2I diffusion model to learn a set of soft prompts, enabling the generation of novel images by sampling prompts from the learned distribution.
2 code implementations • 11 Dec 2023 • Yubin Wang, Xinyang Jiang, De Cheng, Dongsheng Li, Cairong Zhao
To address this limitation and prioritize harnessing structured knowledge, this paper advocates for leveraging LLMs to build a graph for each description to model the entities and attributes describing the category, as well as their correlations.
Ranked #2 on Prompt Engineering on ImageNet V2
no code implementations • 24 Nov 2023 • Xiaoxuan He, Yifan Yang, Xinyang Jiang, Xufang Luo, Haoji Hu, Siyun Zhao, Dongsheng Li, Yuqing Yang, Lili Qiu
To overcome the aforementioned challenges, we propose a Unified Medical Image Pre-training framework, namely UniMedI, which utilizes diagnostic reports as a common semantic space to create unified representations for diverse modalities of medical images (especially for 2D and 3D images).
no code implementations • 22 Nov 2023 • Zefan Qu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Cairong Zhao
To the best of our knowledge, we are the first to exploit the LUT structure to extract temporal information in video tasks.
1 code implementation • ICCV 2023 • Guangyang Wu, Xiaohong Liu, Kunming Luo, Xi Liu, Qingqing Zheng, Shuaicheng Liu, Xinyang Jiang, Guangtao Zhai, Wenyi Wang
To train and evaluate the proposed AccFlow, we have constructed a large-scale high-quality dataset named CVO, which provides ground-truth optical flow labels between adjacent and distant frames.
1 code implementation • IEEE Transactions on Image Processing 2023 • Cairong Zhao, Zefan Qu, Xinyang Jiang, Yuanpeng Tu, Xiang Bai
To address these challenges, we propose a novel Content-Adaptive Auto-Occlusion Network (CAAO) that dynamically selects the proper occlusion region of an image based on its content and the current training status.
no code implementations • 1 Jun 2023 • Ruibin Li, Qihua Zhou, Song Guo, Jie Zhang, Jingcai Guo, Xinyang Jiang, Yifei Shen, Zhenhua Han
Diffusion-based Generative Models (DGMs) have achieved unparalleled performance in synthesizing high-quality visual content, opening up the opportunity to improve image super-resolution (SR) tasks.
1 code implementation • The Eleventh International Conference on Learning Representations 2023 • Shuguang Dou, Xinyang Jiang, Cai Rong Zhao, Dongsheng Li
The energy consumption for training deep learning models is increasing at an alarming rate due to the growth of training data and model scale, resulting in a negative impact on carbon neutrality.
1 code implementation • International Conference on Learning Representations 2023 • Ziyue Li, Kan Ren, Xinyang Jiang, Yifei Shen, Haipeng Zhang, Dongsheng Li
Moreover, our method is highly efficient and achieves more than 1000 times training speedup compared to conventional DG methods that fine-tune a pretrained model.
Ranked #1 on Domain Generalization on PACS
no code implementations • 1 Mar 2023 • Guanghao Yin, Zefan Qu, Xinyang Jiang, Shan Jiang, Zhenhua Han, Ningxin Zheng, Xiaohong Liu, Huan Yang, Yuqing Yang, Dongsheng Li, Lili Qiu
To facilitate the research on this problem, a new benchmark dataset named LDV-WebRTC is constructed based on a real-world online streaming system.
no code implementations • 27 Feb 2023 • Jiaqi Gao, Xinyang Jiang, Yuqing Yang, Dongsheng Li, Lili Qiu
Correspondingly, we propose a Dual Stream deep model for Stereotypical Behaviours Detection, DS-SBD, based on the temporal trajectory of human poses and the repetition patterns of human actions.
no code implementations • 29 Jan 2023 • Ziyue Li, Kan Ren, Yifan Yang, Xinyang Jiang, Yuqing Yang, Dongsheng Li
Ensemble methods can deliver surprising performance gains but also bring significantly higher computational costs, e.g., up to 2048X in large-scale ensemble tasks.
1 code implementation • IEEE Transactions on Image Processing 2022 • Shuguang Dou, Cairong Zhao, Xinyang Jiang, Shanshan Zhang, Wei-Shi Zheng, WangMeng Zuo
Most supervised methods train an extra human parsing model alongside the ReID model using cross-domain human part annotations, and so suffer from expensive annotation cost and a domain gap; unsupervised methods integrate a feature clustering-based human parsing process into the ReID model, but the lack of supervision signals leads to less satisfactory segmentation results.
Ranked #9 on Person Re-Identification on Occluded-DukeMTMC
1 code implementation • ICCV 2023 • Yifan Yang, Weiquan Huang, Yixuan Wei, Houwen Peng, Xinyang Jiang, Huiqiang Jiang, Fangyun Wei, Yin Wang, Han Hu, Lili Qiu, Yuqing Yang
To address this issue, we propose an attentive token removal approach for CLIP training, which retains tokens with a high semantic correlation to the text description.
1 code implementation • 8 Dec 2022 • Cairong Zhao, Yubin Wang, Xinyang Jiang, Yifei Shen, Kaitao Song, Dongsheng Li, Duoqian Miao
Prompt learning is one of the most effective and trending ways to adapt powerful vision-language foundation models like CLIP to downstream datasets by tuning learnable prompt vectors with very few samples.
Ranked #5 on Prompt Engineering on Food-101
1 code implementation • 20 Nov 2022 • Wenli Sun, Xinyang Jiang, Shuguang Dou, Dongsheng Li, Duoqian Miao, Cheng Deng, Cairong Zhao
Instead of learning fixed triggers for the target classes from the training set, DT-IBA can dynamically generate new triggers for any unknown identities.
no code implementations • 4 Aug 2022 • Jun Xiao, Xinyang Jiang, Ningxin Zheng, Huan Yang, Yifan Yang, Yuqing Yang, Dongsheng Li, Kin-Man Lam
Then, our proposed CKBG method enhances this lightweight base model by bypassing the original network with "kernel grafts", which are extra convolutional kernels containing the prior knowledge of external pretrained image SR models.
no code implementations • 15 Jul 2022 • Shuguang Dou, Xinyang Jiang, Qingsong Zhao, Dongsheng Li, Cairong Zhao
In this paper, we aim to develop a technique that can achieve a good trade-off between privacy protection and data usability for person ReID.
no code implementations • CVPR 2022 • Ruoxi Shi, Xinyang Jiang, Caihua Shan, Yansen Wang, Dongsheng Li
Instead of looking at one format, it is a good solution to utilize the formats of VG and RG together to avoid these shortcomings.
no code implementations • CVPR 2022 • Chenqian Yan, Yuge Zhang, Quanlu Zhang, Yaming Yang, Xinyang Jiang, Yuqing Yang, Baoyuan Wang
Thanks to HyperFD, each local task (client) is able to effectively leverage the learning "experience" of previous tasks without uploading raw images to the platform; meanwhile, the meta-feature extractor is continuously learned to better trade off the bias and variance.
no code implementations • 9 Mar 2022 • Ziyue Li, Kan Ren, Xinyang Jiang, Bo Li, Haipeng Zhang, Dongsheng Li
Fine-tuning pretrained models is a common practice in domain generalization (DG) tasks.
Ranked #10 on Domain Generalization on TerraIncognita
2 code implementations • NeurIPS 2021 • Xinyang Jiang, Lu Liu, Caihua Shan, Yifei Shen, Xuanyi Dong, Dongsheng Li
In this paper, we consider a different data format for images: vector graphics.
no code implementations • 29 Sep 2021 • Ziyue Li, Kan Ren, Xinyang Jiang, Mingzhe Han, Haipeng Zhang, Dongsheng Li
Real-world data is often generated by some complex distribution, which can be approximated by a composition of multiple simpler distributions.
no code implementations • 30 Aug 2021 • Bo Li, Xinyang Jiang, Donglin Bai, Yuge Zhang, Ningxin Zheng, Xuanyi Dong, Lu Liu, Yuqing Yang, Dongsheng Li
The energy consumption of deep learning models is increasing at a breathtaking rate, which raises concerns due to potential negative effects on carbon neutrality in the context of global warming and climate change.
1 code implementation • CVPR 2021 • Jiaxing Chen, Xinyang Jiang, Fudong Wang, Jun Zhang, Feng Zheng, Xing Sun, Wei-Shi Zheng
In this paper, rather than relying on texture-based information, we propose to improve the robustness of person ReID against clothing texture by exploiting the information of a person's 3D shape.
Ranked #4 on Person Re-Identification on VC-Clothes
1 code implementation • ICCV 2021 • Guanyu Cai, Jun Zhang, Xinyang Jiang, Yifei Gong, Lianghua He, Fufu Yu, Pai Peng, Xiaowei Guo, Feiyue Huang, Xing Sun
However, the performance of existing methods suffers in real life since the user is likely to provide an incomplete description of an image, which often leads to results filled with false positives that fit the incomplete description.
2 code implementations • 8 Jan 2021 • Chenyang Gao, Guanyu Cai, Xinyang Jiang, Feng Zheng, Jun Zhang, Yifei Gong, Pai Peng, Xiaowei Guo, Xing Sun
Secondly, a BERT with locality-constrained attention is proposed to obtain representations of descriptions at different scales.
Ranked #17 on Text based Person Retrieval on CUHK-PEDES
no code implementations • ICCV 2021 • Jinrui Yang, Jiawei Zhang, Fufu Yu, Xinyang Jiang, Mengdan Zhang, Xing Sun, Ying-Cong Chen, Wei-Shi Zheng
Several mainstream methods utilize extra cues (e.g., human pose information) to distinguish human parts from obstacles to alleviate the occlusion problem.
Ranked #24 on Person Re-Identification on Occluded-DukeMTMC
1 code implementation • 10 Dec 2020 • Enwei Zhang, Xinyang Jiang, Hao Cheng, AnCong Wu, Fufu Yu, Ke Li, Xiaowei Guo, Feng Zheng, Wei-Shi Zheng, Xing Sun
Current training objectives of existing person Re-IDentification (ReID) models only ensure that the loss of the model decreases on the selected training batch, with no regard for the performance on samples outside the batch.
3 code implementations • 12 Sep 2020 • Jinpeng Wang, Yuting Gao, Ke Li, Jianguo Hu, Xinyang Jiang, Xiaowei Guo, Rongrong Ji, Xing Sun
Specifically, we construct a positive clip and a negative clip for each video.
1 code implementation • 11 Sep 2020 • Fufu Yu, Xinyang Jiang, Yifei Gong, Shizhen Zhao, Xiaowei Guo, Wei-Shi Zheng, Feng Zheng, Xing Sun
Secondly, the Conditional Feature Embedding requires the overall feature of a query image to be dynamically adjusted based on the gallery image it matches, while most of the existing methods ignore the reference images.
Ranked #1 on Person Re-Identification on CUHK03-C
1 code implementation • ECCV 2020 • Shizhen Zhao, Changxin Gao, Jun Zhang, Hao Cheng, Chuchu Han, Xinyang Jiang, Xiaowei Guo, Wei-Shi Zheng, Nong Sang, Xing Sun
In the conventional person Re-ID setting, it is widely assumed that each cropped person image contains a single individual.
1 code implementation • 3 Dec 2019 • Zhihui Zhu, Xinyang Jiang, Feng Zheng, Xiaowei Guo, Feiyue Huang, Wei-Shi Zheng, Xing Sun
Instead of one subspace for each viewpoint, our method projects the feature from different viewpoints into a unified hypersphere and effectively models the feature distribution on both the identity-level and the viewpoint-level.
Ranked #7 on Person Re-Identification on DukeMTMC-reID (using extra training data)
2 code implementations • 28 Nov 2019 • Xinyang Jiang, Yifei Gong, Xiaowei Guo, Qize Yang, Feiyue Huang, Wei-Shi Zheng, Feng Zheng, Xing Sun
Recently, research interest in person re-identification (ReID) has gradually turned to video-based methods, which acquire a person representation by aggregating frame features of an entire video.
1 code implementation • CVPR 2019 • Feng Zheng, Cheng Deng, Xing Sun, Xinyang Jiang, Xiaowei Guo, Zongqiao Yu, Feiyue Huang, Rongrong Ji
Most existing Re-IDentification (Re-ID) methods are highly dependent on precise bounding boxes that enable images to be aligned with each other.
Ranked #2 on Person Re-Identification on CUHK03-C